Loren Merritt
3fb78e99a0
x86inc: create xm# and ym#, analagous to m#
...
For when we want to mix simd sizes within one function.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
12 years ago
Loren Merritt
49ebe3f9fe
x86inc: fix some corner cases of SWAP
...
SWAP with >=3 named (rather than numbered) args
PERMUTE followed by SWAP with 2 named args
used to produce the wrong permutation
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
12 years ago
Henrik Gramner
63f0d62310
x86inc: Use SSE instead of SSE2 for copying data
...
Reduces code size because movaps/movups is one byte
shorter than movdqa/movdqu.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
12 years ago
Henrik Gramner
ad76e6e7e1
x86inc: Set ELF hidden visibility for global constants
...
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
12 years ago
Loren Merritt
25cb0c1a1e
x86inc: activate REP_RET automatically
...
Now RET checks whether it immediately follows a branch, so the
programmer dosen't have to keep track of that condition. REP_RET
is still needed manually when it's a branch target, but that's
much rarer.
The implementation involves lots of spurious labels, but that's OK
because we strip them.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
12 years ago
Ronald S. Bultje
c07ac8d467
VP9 MC (ssse3) optimizations.
...
Decoding time of ped1080p.webm goes from 20.7sec to 11.3sec.
12 years ago
Alex Smith
08fa828b3f
avutil: Fix compilation with inline asm disabled on mingw
...
Because of -Werror=implicit-function-declaration the build will fail.
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Thilo Borgmann
d814a839ac
Reinstate proper FFmpeg license for all files.
12 years ago
Diego Biurrun
79aec43ce8
x86: Add and use more convenience macros to check CPU extension availability
12 years ago
Diego Biurrun
8410d6e93c
avutil: Refactor CPU extension availability macros
12 years ago
Diego Biurrun
b78b10c4b7
avutil: Move internal CPU detection function declarations to private header
12 years ago
Diego Biurrun
3ac7fa81b2
Consistently use "cpu_flags" as variable/parameter name for CPU flags
12 years ago
Michael Niedermayer
a478e99a60
avutil/x86: reenable ff_update_lls_avx()
...
The bug has been fixed in c8b920a9b7
by Loren Merritt
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Loren Merritt
c8b920a9b7
lls/x86: use 3-operator vaddpd in ADDPD_MEM
...
Fixes build with yasm-1.1
Signed-off-by: Anton Khirnov <anton@khirnov.net>
12 years ago
Michael Niedermayer
a6e46ed51a
Revert "avutil/x86: disable ff_evaluate_lls_sse2() for 32bit"
...
This reverts commit 247425241c
.
12 years ago
Loren Merritt
1221bb6239
x86: lpc: fix a segfault in av_evaluate_lls_sse2()
12 years ago
Michael Niedermayer
247425241c
avutil/x86: disable ff_evaluate_lls_sse2() for 32bit
...
It just segfaults on 32bit, thus its disabled until someone fixes it.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
a285079bc7
lls.asm: disable ff_update_lls_avx
...
The code doesnt build with yasm from ubuntu 12.04
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Michael Niedermayer
0b40c50508
lls.asm: put avx code under if HAVE_AVX_EXTERNAL
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Loren Merritt
b545179fdf
x86: lpc: simd av_evaluate_lls
...
1.5x-1.8x faster on sandybridge
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Loren Merritt
502ab21af0
x86: lpc: simd av_update_lls
...
4x-6x faster on sandybridge
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Diego Biurrun
1fda184a85
avutil: Add av_cold attributes to init functions missing them
12 years ago
Christophe Gisquet
566b7a20fd
x86: float dsp: butterflies_float SSE
...
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
12 years ago
Michael Niedermayer
92218aad00
butterflies_float: replace 2 lea by 2 add
...
adds are simpler instructions and should be faster or equally fast
on all cpus
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Christophe Gisquet
1a4007964c
x86: float dsp: butterflies_float SSE
...
97c -> 49c
Some codecs could benefit from more unrolling, but AAC doesn't.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Ronald S. Bultje
b93b27edb0
dsputil: Make dsputil selectable
...
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Christophe Gisquet
2e81acc687
x86inc: Fix number of operands for cmp* instructions
...
cmp{p,s}{s,d} instructions do take an imm8 operand.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Christophe Gisquet
0b467a6e83
x264asm: fix cmp* number of arguments
...
cmp{p,s}{s,d} instructions do take an imm8 operand.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Diego Biurrun
b6649ab503
cosmetics: Remove unnecessary extern keywords from function declarations
12 years ago
Ronald S. Bultje
6a701306db
dsputil: make selectable.
...
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Ronald S. Bultje
0c0828ecc5
x86: Use simple nop codes for <= sse (rather than <= mmx)
...
The "CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
James Almer
a56fd9edab
lavu: Fix checkheaders for x86/emms.h
...
internal.h doesn't need to include cpu.h anymore since
the relevant code was moved to x86/emms.h
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
12 years ago
Diego Biurrun
4db96649ca
avutil: Ensure that emms_c is always defined, even on non-x86
12 years ago
Diego Biurrun
ab441e20ff
avutil: Move emms code to x86-specific header
12 years ago
Ronald S. Bultje
b582af1ed7
Use simple nop codes for <= sse (rather than <= mmx).
...
The "CPU: CentaurHauls family 6 model 9 stepping 8" family of CPUs
(flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse
up rng rng_en ace ace_en) SIGILLs on long nop codes.
Change-Id: I7e7c52a2191006df30a9aadbc40d481a1db89106
12 years ago
Ronald S. Bultje
42d3246948
floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp.
...
Now, nellymoserenc and aacenc no longer depends on dsputil. Independent
of this patch, wmaprodec also does not depend on dsputil, so I removed
it from there also.
12 years ago
Ronald S. Bultje
55aa03b9f8
floatdsp: move vector_fmul_add from dsputil to avfloatdsp.
12 years ago
Ronald S. Bultje
d56668bd80
floatdsp: move scalarproduct_float from dsputil to avfloatdsp.
...
This makes the aac decoder and all voice codecs independent of dsputil.
12 years ago
Martin Storsjö
f4facd2ce7
x86: Add a Yasm-based emms() replacement
...
This provides a fallback when building with Yasm enabled, but neither
inline assembly, nor the _mm_empty intrinsic are available or enabled.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Diego Biurrun
d633d12b2c
x86inc: Add cvisible macro for C functions with public prefix
...
This allows defining externally visible library symbols.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Diego Biurrun
ef5d41a553
x86inc: Rename "program_name" to "private_prefix"
...
The new name is more descriptive and will allow defining a separate
public prefix for externally visible library symbols.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Martin Storsjö
973b4d44f1
float_dsp: Add #ifdef HAVE_INLINE_ASM around vector_fmul_window
...
This fixes builds on 64bit MSVC.
Signed-off-by: Martin Storsjö <martin@martin.st>
12 years ago
Justin Ruggles
e034cc6c60
lavc: Move vector_fmul_window to AVFloatDSPContext
...
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Diego Biurrun
dae1d507af
x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags
12 years ago
Diego Biurrun
320e1d0df3
x86: ABSB2: port to cpuflags
12 years ago
Diego Biurrun
094a7405e5
x86: ABSB: port to cpuflags
12 years ago
Diego Biurrun
51969a652c
x86: ABS2: port to cpuflags
12 years ago
Diego Biurrun
5b4dfbffc2
x86: ABS1: port to cpuflags
12 years ago
Ronald S. Bultje
a34d9ad969
lavc: merge latest x86inc.asm fixes with x264
...
Unbreak NASM support.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
12 years ago
Janne Grunau
0995ad8db4
x86inc: fully concatenate tokens to fix macro expansion for nasm
...
Fixes build errors with nasm introduced in 6f40e9f070
for stack
memory alignment. Noticed by BugMaster.
12 years ago