Diego Biurrun
b78b10c4b7
avutil: Move internal CPU detection function declarations to private header
11 years ago
Diego Biurrun
3ac7fa81b2
Consistently use "cpu_flags" as variable/parameter name for CPU flags
12 years ago
Michael Niedermayer
a478e99a60
avutil/x86: reenable ff_update_lls_avx()
...
The bug has been fixed in c8b920a9b7 by Loren Merritt
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Loren Merritt
c8b920a9b7
lls/x86: use 3-operator vaddpd in ADDPD_MEM
...
Fixes build with yasm-1.1
Signed-off-by: Anton Khirnov <anton@khirnov.net>
12 years ago
Michael Niedermayer
a6e46ed51a
Revert "avutil/x86: disable ff_evaluate_lls_sse2() for 32bit"
...
This reverts commit 247425241c.
12 years ago
Loren Merritt
1221bb6239
x86: lpc: fix a segfault in av_evaluate_lls_sse2()
12 years ago
Michael Niedermayer
247425241c
avutil/x86: disable ff_evaluate_lls_sse2() for 32bit
...
It just segfaults on 32bit, thus it's disabled until someone fixes it.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
a285079bc7
lls.asm: disable ff_update_lls_avx
...
The code doesn't build with yasm from Ubuntu 12.04
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
0b40c50508
lls.asm: put avx code under if HAVE_AVX_EXTERNAL
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Loren Merritt
b545179fdf
x86: lpc: simd av_evaluate_lls
...
1.5x-1.8x faster on sandybridge
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Loren Merritt
502ab21af0
x86: lpc: simd av_update_lls
...
4x-6x faster on sandybridge
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Diego Biurrun
1fda184a85
avutil: Add av_cold attributes to init functions missing them
12 years ago
Christophe Gisquet
566b7a20fd
x86: float dsp: butterflies_float SSE
...
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
12 years ago
Michael Niedermayer
92218aad00
butterflies_float: replace 2 lea by 2 add
...
adds are simpler instructions and should be faster or equally fast
on all cpus
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Christophe Gisquet
1a4007964c
x86: float dsp: butterflies_float SSE
...
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Ronald S. Bultje
b93b27edb0
dsputil: Make dsputil selectable
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Christophe Gisquet
2e81acc687
x86inc: Fix number of operands for cmp* instructions
...
cmp{p,s}{s,d} instructions do take an imm8 operand.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Christophe Gisquet
0b467a6e83
x264asm: fix cmp* number of arguments
...
cmp{p,s}{s,d} instructions do take an imm8 operand.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Diego Biurrun
b6649ab503
cosmetics: Remove unnecessary extern keywords from function declarations
12 years ago
Ronald S. Bultje
6a701306db
dsputil: make selectable.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Ronald S. Bultje
0c0828ecc5
x86: Use simple nop codes for <= sse (rather than <= mmx)
...
The "CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
James Almer
a56fd9edab
lavu: Fix checkheaders for x86/emms.h
...
internal.h doesn't need to include cpu.h anymore since
the relevant code was moved to x86/emms.h
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Diego Biurrun
4db96649ca
avutil: Ensure that emms_c is always defined, even on non-x86
12 years ago
Diego Biurrun
ab441e20ff
avutil: Move emms code to x86-specific header
12 years ago
Ronald S. Bultje
b582af1ed7
Use simple nop codes for <= sse (rather than <= mmx).
...
The "CPU: CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.
Change-Id: I7e7c52a2191006df30a9aadbc40d481a1db89106
12 years ago
Ronald S. Bultje
42d3246948
floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
...
Now, nellymoserenc and aacenc no longer depend on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
12 years ago
Ronald S. Bultje
55aa03b9f8
floatdsp: move vector_fmul_add from dsputil to avfloatdsp.
12 years ago
Ronald S. Bultje
d56668bd80
floatdsp: move scalarproduct_float from dsputil to avfloatdsp.
...
This makes the aac decoder and all voice codecs independent of dsputil.
12 years ago
Martin Storsjö
f4facd2ce7
x86: Add a Yasm-based emms() replacement
...
This provides a fallback when building with Yasm enabled, but neither
inline assembly, nor the _mm_empty intrinsic are available or enabled.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Diego Biurrun
d633d12b2c
x86inc: Add cvisible macro for C functions with public prefix
...
This allows defining externally visible library symbols.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Diego Biurrun
ef5d41a553
x86inc: Rename "program_name" to "private_prefix"
...
The new name is more descriptive and will allow defining a separate
public prefix for externally visible library symbols.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Martin Storsjö
973b4d44f1
float_dsp: Add #ifdef HAVE_INLINE_ASM around vector_fmul_window
...
This fixes builds on 64bit MSVC.
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Justin Ruggles
e034cc6c60
lavc: Move vector_fmul_window to AVFloatDSPContext
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Diego Biurrun
dae1d507af
x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags
12 years ago
Diego Biurrun
320e1d0df3
x86: ABSB2: port to cpuflags
12 years ago
Diego Biurrun
094a7405e5
x86: ABSB: port to cpuflags
12 years ago
Diego Biurrun
51969a652c
x86: ABS2: port to cpuflags
12 years ago
Diego Biurrun
5b4dfbffc2
x86: ABS1: port to cpuflags
12 years ago
Ronald S. Bultje
a34d9ad969
lavc: merge latest x86inc.asm fixes with x264
...
Unbreak NASM support.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Janne Grunau
0995ad8db4
x86inc: fully concatenate tokens to fix macro expansion for nasm
...
Fixes build errors with nasm introduced in 6f40e9f070 for stack
memory alignment. Noticed by BugMaster.
12 years ago
Ronald S. Bultje
140367aff9
x86inc: fix stack alignment on win64
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Ronald S. Bultje
ce58642ed0
x86inc: support stack mem allocation and re-alignment in PROLOGUE.
...
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Ronald S. Bultje
6f40e9f070
x86inc: support stack mem allocation and re-alignment in PROLOGUE
...
Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Justin Ruggles
1c012e6bfb
x86: float_dsp: fix loading of the len parameter on x86-32
12 years ago
Justin Ruggles
ecc8b02194
x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32
...
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
12 years ago
Justin Ruggles
b30a363331
x86: af_volume: add SSE2/SSSE3/AVX-optimized s32 volume scaling
12 years ago
Justin Ruggles
ac7eb4cb20
float_dsp: add vector_dmul_scalar() to multiply a vector of doubles
...
Include x86-optimized versions for SSE2 and AVX.
12 years ago
Diego Biurrun
490df522c7
x86: cpu: Drop unused HAVE_RWEFLAGS condition
...
The test for rweflags was dropped in a previous commit.
12 years ago
Justin Ruggles
947f933687
x86: float_dsp: add SSE version of vector_fmul_scalar()
12 years ago
Diego Biurrun
87af05c575
x86: SPLATD: port to cpuflags
12 years ago