James Almer
edff061fb0
x86/swr: add ff_float_to_int32_a_avx2
...
13797 decicycles in ff_float_to_int32_a_sse2, 32768 runs, 0 skips
8603 decicycles in ff_float_to_int32_a_avx2, 32766 runs, 2 skips
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
James Almer
b385c4c6a3
x86/swr: replace sse4 instructions in pack_6ch with sse ones
...
There's no benefit from using blendps here except on CPUs with AVX, where
it's faster than shufps according to Intel's documentation.
As such, rename the sse4 functions to sse/sse2 and use shufps instead.
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
10 years ago
James Almer
9937362c54
x86/swr: use lavu helper macros to check CPU extensions
...
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
8279a15284
x86/swr: split audioconvert and rematrix DSP into separate files
...
Also rename resample_x86_dsp.c to resample_init.c
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Reimar Döffinger
cbeaf67888
Avoid using empty macro arguments.
...
These are not supported by all compilers (gcc 2.95 but also older SPARC
compilers, see gcc bug #33304 for example), and there is no real need for them.
One use of this feature remains in libavdevice/v4l2.c which can't be
replaced quite as easily.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
11 years ago
Michael Niedermayer
4cfc92081d
swr: add native_simd_one
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
31a797eb28
swr: add av_cold to init/free functions
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Carl Eugen Hoyos
a26789cf9f
Fix compilation with yasm-0.6.2.
12 years ago
Michael Niedermayer
68712ce820
swr/x86: 16bit integer mix functions need SSE2 not SSE
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
c88e60af76
swr/x86: 10l, missed some SSE2 instructions in code marked as SSE.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
728f86edfc
swr: mix_2_1_int16_mmx/sse
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
d504266cef
swr: mix_1_1_int16_sse
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
cbeeaf2593
swr: mix_1_1 int16 MMX
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
52afa43691
swr: mix_2_1_float SSE/AVX
...
Based-on code by Justin Ruggles
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
beb0cd6acf
swr: SIMD rematrixing and SSE/AVX mix_1_1 float
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
a927641e7a
libswresample-simd: Add ff_pack_6ch_float_to_int32_a_avx and ff_pack_6ch_float_to_int32_a_sse4
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
ca986a06ad
libswresample-simd: add ff_pack_6ch_int32_to_float_a_avx and ff_pack_6ch_int32_to_float_a_sse4
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Justin Ruggles
6f67d9833b
libswresample: Implement MMX, SSE4 and AVX 6ch float and int32 packing function.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
cbbc472467
swr-x86-simd: add ff_unpack_2ch_int16_to_int16/int32/float_a_ssse3
...
more than 10% faster (tested on sandybridge)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
72ae583b7d
swr-x86-simd: stereo unpack S16/S32/FLT-> S16/S32/FLT SSE/SSE2 (16 new SIMD functions)
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
11ad5f0d7d
swr-x86-simd: create prototypes with macros, this is simpler.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
47055b8913
swr: implement stereo S16/S32/FLT->S16/S32/FLT planar->packed in SSE/SSE2
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
c1fe2db376
swr: add ff_int32_to_float_a_avx
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
65722e7fc5
swr: int32_to_int16_mmx/sse
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
73edb58c3c
swr: float_to_int16_sse2()
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
5932938c9a
swr: float_to_int32_sse2()
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
b72a0f9c23
swr: add int16_to_float_sse2()
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
832c3b10d2
swr: add int32_to_float_sse2
...
could be done for sse/3dnow too if someone wants
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
fa5daaca0d
swr: seperate functions for aligned & unaligned
...
If someone has an idea on how to do this cleaner, its welcome
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago
Michael Niedermayer
bcc66ff0e4
swr: add int16_to_int32_mmx/sse
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
13 years ago