James Almer
76ed71a72b
x86: move horizontal add macros to x86util
...
Also port relevant AVX2/XOP optimizations from x264 with permission
to relicense to LGPL from the corresponding authors
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
James Almer
3f3d748cab
x86: Move XOP emulation to x86util
...
We need the emulation to support the cases where the first
argument is the same as the fourth. To achieve this a fifth
argument working as a temporary may be needed.
Emulation that doesn't obey the original instruction semantics
can't be in x86inc.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
11 years ago
Jason Garrett-Glaser
c6908d6b4b
x86inc: FMA3/4 Support
...
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Derek Buitenhuis
206895708e
x86inc: Remove our FMA4 support
...
This is so we can sync to x264's version of FMA4 support.
This partialy reverts commit 79687079a9
.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
11 years ago
Diego Biurrun
d633d12b2c
x86inc: Add cvisible macro for C functions with public prefix
...
This allows defining externally visible library symbols.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Diego Biurrun
ef5d41a553
x86inc: Rename "program_name" to "private_prefix"
...
The new name is more descriptive and will allow defining a separate
public prefix for externally visible library symbols.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Diego Biurrun
dae1d507af
x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags
12 years ago
Diego Biurrun
320e1d0df3
x86: ABSB2: port to cpuflags
12 years ago
Diego Biurrun
094a7405e5
x86: ABSB: port to cpuflags
12 years ago
Diego Biurrun
51969a652c
x86: ABS2: port to cpuflags
12 years ago
Diego Biurrun
5b4dfbffc2
x86: ABS1: port to cpuflags
12 years ago
Justin Ruggles
ac7eb4cb20
float_dsp: add vector_dmul_scalar() to multiply a vector of doubles
...
Include x86-optimized versions for SSE2 and AVX.
12 years ago
Diego Biurrun
87af05c575
x86: SPLATD: port to cpuflags
12 years ago
Diego Biurrun
26301caaa1
x86: mmx2 ---> mmxext in asm constructs
12 years ago
Diego Biurrun
f0d124f005
x86inc: Set program_name outside of x86inc.asm
...
This reduces the local difference to the x264 upstream version.
12 years ago
Diego Biurrun
4b60fac419
x86: PALIGNR: port to cpuflags
12 years ago
Diego Biurrun
dbb37e7711
x86: PABSW: port to cpuflags
12 years ago
Diego Biurrun
0a7a94f2e5
x86: Refactor PSWAPD fallback implementations and port to cpuflags
12 years ago
Diego Biurrun
26f01bd106
x86: PMINUB: port to cpuflags
12 years ago
Diego Biurrun
61bc2bc7d4
x86util: Add cpuflags_mmxext alias for cpuflags_mmx2
...
"mmxext" is a more sensible name and more common in outside projects.
12 years ago
Dave Yeo
264f12342c
x86: Fix assembly with NASM
...
Unlike YASM, NASM only looks for include files in the current
directory, not in the directory that included files reside in.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Dave Yeo
9c167914a1
x86: Fix assembly with NASM
...
Unlike YASM, NASM only looks for include files in the current
directory, not in the directory that included files reside in.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
12 years ago
Diego Biurrun
588fafe7f3
x86: MMX2 ---> MMXEXT in macro names
12 years ago
Diego Biurrun
6860b4081d
x86: include x86inc.asm in x86util.asm
...
This is necessary to allow refactoring some x86util macros with cpuflags.
12 years ago
Justin Ruggles
6092dafb5a
lavr: x86: optimized 6-channel s16 to fltp conversion
13 years ago
Jason Garrett-Glaser
85a3c19ed1
dsputil: x86: add SHUFFLE_MASK_W macro
...
Simplifies pshufb masks that operate on words.
13 years ago
Loren Merritt
4d4752366f
x86inc: add SPLATB_LOAD, SPLATB_REG, PSHUFLW macros
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
13 years ago
Vitor Sessak
4a301706fd
x86: Avoid movs on BUTTERFLYPS when in AVX mode
...
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
13 years ago
Justin Ruggles
5cc6d5244d
lavr: replace the SSE version of ff_conv_fltp_to_flt_6ch() with SSE4 and AVX
...
The current SSE version is slower than the MMX version on Athlon64 and Sandy
Bridge, but the SSE4 and AVX versions are faster on Sandy Bridge.
13 years ago
Justin Ruggles
c8af852b97
Add libavresample
...
This is a new library for audio sample format, channel layout, and sample rate
conversion.
13 years ago
Ronald S. Bultje
3b15a6d742
config.asm: change %ifdef directives to %if directives.
...
This allows combining multiple conditionals in a single statement.
13 years ago
Justin Ruggles
4e8e262476
fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm
13 years ago
Ronald S. Bultje
38e06c2969
Move clipd macros to x86util.asm.
...
This allows sharing them between multiple .asm files.
14 years ago
Ronald S. Bultje
b2c087871d
Move x86util.asm from libavcodec/ to libavutil/.
...
This allows using it in swscale also.
14 years ago
Jason Garrett-Glaser
a3bf7b864a
H.264: tweak some other x86 asm for Atom
14 years ago
Daniel Kang
c0483d0c7a
H.264: Add x86 assembly for 10-bit H.264 predict functions
...
Mainly ported from 8-bit H.264 predict.
Some code ported from x264. LGPL ok by author.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Loren Merritt
422b2362fc
dct32_sse: eliminate some spills
...
125->104 cycles on penryn (x86_64 only)
14 years ago
Daniel Kang
d0005d347d
Modify x86util.asm to ease transitioning to 10-bit H.264 assembly.
...
Arguments for variable size instructions are added to many macros, along
with other various changes. The x86util.asm code was ported from x264.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
14 years ago
Diego Biurrun
888fa31eca
Fix FSF address copy paste error in some license headers.
14 years ago
Jason Garrett-Glaser
9f3d6ca4f1
Port x86 10-bit H.264 deblock asm from x264
14 years ago
Jason Garrett-Glaser
8ad77b65b5
Update x86 H.264 deblock asm
...
Includes AVX versions from x264.
14 years ago
Mans Rullgard
2912e87a6c
Replace FFmpeg with Libav in licence headers
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Justin Ruggles
a30ac54a19
Add x86-optimized versions of exponent_min().
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
(cherry picked from commit dda3f0ef48
)
14 years ago
Justin Ruggles
dda3f0ef48
Add x86-optimized versions of exponent_min().
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Ronald S. Bultje
e2e341048e
Move hadamard_diff{,16}_{mmx,mmx2,sse2,ssse3}() from inline asm to yasm,
...
which will hopefully solve the Win64/FATE failures caused by these functions.
Originally committed as revision 25137 to svn://svn.ffmpeg.org/ffmpeg/trunk
14 years ago
David Conrad
faa26db28b
MMX/SSE VC1 loop filter
...
Originally committed as revision 24208 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Ronald S. Bultje
f2a30bd840
Simple H/V loopfilter for VP8 in MMX, MMX2 and SSE2 (yay for yasm macros).
...
Originally committed as revision 24029 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Ronald S. Bultje
2dd2f71692
MMX idct_add for VP8.
...
Originally committed as revision 23886 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Jason Garrett-Glaser
37355fe823
Make x86util.asm LGPL so we can use it in LGPL asm
...
Strip out most x264-specific stuff (not used anywhere in ffmpeg).
Originally committed as revision 23877 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Jason Garrett-Glaser
2966cc1849
Update x264asm header files to latest versions.
...
Modify the asm accordingly.
GLOBAL is now no longoer necessary for PIC-compliant loads.
Originally committed as revision 23739 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago