Clément Bœsch
|
b12a36170b
|
lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis
|
8 years ago |
James Almer
|
8bb59e6742
|
x86/aacpsdsp: add ff_ps_hybrid_analysis_ileave_sse
About 2x faster than the C version.
|
8 years ago |
James Almer
|
e229df9478
|
x86/aacpsdsp: add ff_ps_hybrid_synthesis_deint_{sse,sse4}
About 2x faster than the C version.
|
8 years ago |
James Almer
|
623d217ed1
|
avcodec/aacps: move checks for valid length outside the stereo_interpolate dsp function
Signed-off-by: James Almer <jamrial@gmail.com>
|
8 years ago |
James Almer
|
497a4b554c
|
x86/aacpsdsp: fix output of ff_ps_stereo_interpolate_ipdopd_sse3
The fate-aac-al_sbr_ps_04_ur test did not detect this mistake.
|
8 years ago |
James Almer
|
933dd62288
|
x86/aacpsdsp: optimize ff_ps_mul_pair_single_sse
~2% faster.
|
8 years ago |
James Almer
|
be3809a521
|
x86/aacpsdsp: optimize ff_ps_stereo_interpolate_sse3
Move the unpacking outside of the loop. 5% to 10% faster.
Suggested-by: ubitux
Signed-off-by: James Almer <jamrial@gmail.com>
|
8 years ago |
James Almer
|
b5a0971ff0
|
x86/aacps: add ff_ps_stereo_interpolate_ipdopd_sse3()
About 2x faster than the C version.
Signed-off-by: James Almer <jamrial@gmail.com>
|
8 years ago |
James Almer
|
ede4ec1f8f
|
x86/aacpsdsp: optimize add_squares loop
Signed-off-by: James Almer <jamrial@gmail.com>
|
9 years ago |
James Almer
|
82dbfccaf0
|
x86/aacdec: use HADDPS macro
Signed-off-by: James Almer <jamrial@gmail.com>
|
9 years ago |
Henrik Gramner
|
f0b7882ceb
|
x86inc: Drop SECTION_TEXT macro
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
|
10 years ago |
James Almer
|
9dcaae70f2
|
x86/aacpsdsp: add SSE and SSE3 optimized functions
Between 1.5 and 2.5 times faster.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
|
10 years ago |