Justin Ruggles
9d06037d48
twinvq: add SSE/AVX optimized sum/difference stereo interleaving
13 years ago
Diego Biurrun
ce33320b30
Remove redundant filename self-references inside files.
...
Filenames are brittle across renames and add no useful information.
13 years ago
Diego Biurrun
276b995d85
x86: drop pointless ARCH_X86 #ifdef from files in x86 subdirectory
13 years ago
Justin Ruggles
b8f02f5b4e
dsputil: use cpuflags in x86 versions of vector_clip_int32()
13 years ago
Ronald S. Bultje
717401aff2
h264_weight: remove duplicate functions.
13 years ago
Justin Ruggles
5463e83dbc
fmtconvert: fix int32_to_float_fmul_scalar() for windows x86_64
...
The calling convention allows only 4 non-stack parameters, with each
float or int register being skipped if not used.
fixes Bug 64
13 years ago
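The register-skipping rule described in 5463e83dbc above can be sketched in C. This is only an illustration: the helper `win64_arg_location()` and the enum `arg_kind` are hypothetical names, and the prototype `(float *dst, const int *src, float mul, int len)` for int32_to_float_fmul_scalar() is an assumption not stated in the commit message. The Win64 convention assigns the first four parameters to registers by position, so using the integer register of a slot means the float register of that slot is skipped, and vice versa.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a parameter for register assignment. */
enum arg_kind { ARG_INT, ARG_FLOAT };

/* Return where the Win64 calling convention places the parameter at
 * zero-based position `pos`: one register per slot for the first four
 * parameters, the stack for everything after that. */
static const char *win64_arg_location(enum arg_kind kind, size_t pos)
{
    static const char *const int_regs[4]   = { "rcx",  "rdx",  "r8",   "r9"   };
    static const char *const float_regs[4] = { "xmm0", "xmm1", "xmm2", "xmm3" };

    if (pos >= 4)
        return "stack";
    return kind == ARG_FLOAT ? float_regs[pos] : int_regs[pos];
}

/* Assumed prototype: (float *dst, const int *src, float mul, int len).
 * The float in slot 2 lands in xmm2 (not xmm0), and r8 is skipped, so the
 * following int lands in r9. */
static void check_fmul_scalar_layout(void)
{
    assert(strcmp(win64_arg_location(ARG_INT,   0), "rcx")  == 0); /* dst */
    assert(strcmp(win64_arg_location(ARG_INT,   1), "rdx")  == 0); /* src */
    assert(strcmp(win64_arg_location(ARG_FLOAT, 2), "xmm2") == 0); /* mul */
    assert(strcmp(win64_arg_location(ARG_INT,   3), "r9")   == 0); /* len */
}
```

Assembly written for the System V x86_64 convention would instead expect `mul` in xmm0 and `len` in the third integer register, which is one plausible way such a function breaks on Windows.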
Daniel Kang
ded3e9f054
H.264: Cosmetics to dsputil_mmx.c
...
Add whitespace.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
13 years ago
Ronald S. Bultje
b0b3231074
h264_weight: initialize "height" function argument properly.
...
Right now it's not actually initialized on 32-bit, leading to crashes
on win32.
13 years ago
Justin Ruggles
aad3429d4e
fmtconvert: port float_to_int16_interleave() 2-channel x86 inline asm to yasm
13 years ago
Justin Ruggles
4e8e262476
fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm
13 years ago
Justin Ruggles
185142a5ea
fmtconvert: check compile-time x86 instruction set flags
13 years ago
Justin Ruggles
708ab7dd69
fmtconvert: port float_to_int16() x86 inline asm to yasm
13 years ago
Ronald S. Bultje
c2d337429c
H264: change weight/biweight functions to take a height argument.
...
Neon parts by Mans Rullgard <mans@mansr.com>.
13 years ago
Ronald S. Bultje
229d263cc9
Support for lossless and inter H264 4:2:2.
13 years ago
Baptiste Coudurier
76741b0e56
h264: 4:2:2 intra decoding support
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
13 years ago
Diego Biurrun
265980dabc
x86: Move some variable declarations below the appropriate #ifdef.
...
This avoids some unused variable warnings with YASM disabled.
13 years ago
Diego Biurrun
2cb7c81669
x86: Fix linking of ProRes DSP ASM with YASM disabled.
13 years ago
Ronald S. Bultje
05c8f119cc
proresdsp: fix function prototypes.
...
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
13 years ago
Ronald S. Bultje
e3f530feca
prores: idct sse2/sse4 optimizations.
...
~3.0-3.5x as fast as original C version, 1.6x as fast overall.
13 years ago
Sean McGovern
c2d3f56107
fft: avoid a signed overflow
...
As a signed integer, 1<<31 overflows, so force it to unsigned.
Signed-off-by: Alex Converse <alex.converse@gmail.com>
13 years ago
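The signed-overflow fix in c2d3f56107 above can be illustrated with a minimal sketch. The helper `bit_mask()` is a hypothetical name and is not taken from the fft code; the sketch assumes a 32-bit `int`, as on the platforms this code targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a mask with only bit `n` set, 0 <= n <= 31. */
static uint32_t bit_mask(unsigned n)
{
    assert(n < 32);
    /* With a 32-bit int, `1 << 31` shifts into the sign bit, which is
     * undefined behavior in C. Writing the constant as `1U` makes the
     * shift unsigned and therefore well defined. */
    return 1U << n;
}
```

The only change relative to the overflowing form is the `U` suffix on the constant; the resulting bit pattern for `n == 31` is `0x80000000`.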
Ronald S. Bultje
38e06c2969
Move clipd macros to x86util.asm.
...
This allows sharing them between multiple .asm files.
14 years ago
Dave Yeo
cc73511e8e
Fix NASM include directive
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Alex Converse
48f7163f13
dsputil_mmx: Honor HAVE_AMD3DNOW
14 years ago
Ronald S. Bultje
b2c087871d
Move x86util.asm from libavcodec/ to libavutil/.
...
This allows using it in swscale also.
14 years ago
Ronald S. Bultje
3a39195b1d
Move x86inc.asm to libavutil/.
...
This allows using it in libswscale/ also.
14 years ago
Kostya Shishkov
d241f51e0f
Move RV3/4-specific DSP functions into their own context
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Vitor Sessak
18b131de04
dct32: Add SSE2 ASM optimizations
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Jason Garrett-Glaser
a3bf7b864a
H.264: tweak some other x86 asm for Atom
14 years ago
Mans Rullgard
3ad1684126
x86: cabac: add operand size suffixes missing from 6c32576
...
This fixes build with clang.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
f5f004bc5a
x86: cabac: don't load/store context values in asm
...
Inspection of compiled code shows gcc handles these fine on its own.
Benchmarking also shows no measurable speed difference.
Removing the remaining cases in get_cabac_bypass_sign_x86() does
cause more substantial changes to the compiled code with uncertain
impact.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Jason Garrett-Glaser
6c32576548
H.264: optimize CABAC x86 asm for Atom
14 years ago
Mans Rullgard
da4c7cce21
x86: fix build with gcc 4.7
...
The upcoming gcc 4.7 has more advanced constant propagation
resulting in some inline asm operands becoming constants and thus
emitted as literals, sometimes in contexts where this results
in invalid instructions.
This patch changes the constraints of the relevant operands
to "rm" thus forcing a valid type. While obviously suboptimal,
this is what older gcc versions already did, and there is no
change to the code generated with these.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
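The constraint issue described in da4c7cce21 above can be sketched with a small GCC-style inline asm example. It is hypothetical and not taken from the patched files: `select_if_nonzero()` is an invented name, and CMOV is used only because it is an instruction that cannot take an immediate operand. The sketch assumes a GCC-compatible compiler on an x86 CPU with CMOV support and falls back to plain C elsewhere.

```c
#include <assert.h>

/* Hypothetical example: return `a` if `cond` is nonzero, otherwise `b`. */
static int select_if_nonzero(int cond, int a, int b)
{
    int r = b;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* CMOV accepts a register or memory source but never an immediate.
     * If the constraint on `a` also allowed immediates and constant
     * propagation turned `a` into a literal, the compiler could emit an
     * invalid `cmovne $5, %eax`. The "rm" constraint forces the value
     * into a register or a memory location instead. */
    __asm__ ("test   %1, %1 \n\t"
             "cmovne %2, %0 \n\t"
             : "+r"(r)
             : "r"(cond), "rm"(a)
             : "cc");
#else
    /* Portable fallback for other compilers and architectures. */
    if (cond)
        r = a;
#endif
    return r;
}

/* Small self-check with constant arguments, the case that matters here. */
static void check_select(void)
{
    assert(select_if_nonzero(1, 5, 7) == 5);
    assert(select_if_nonzero(0, 5, 7) == 7);
}
```

As the commit message notes, "rm" can be suboptimal because it may cost an extra register or load, but it always yields an operand form the instruction accepts.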
Daniel Kang
406fbd24dc
H.264: Add optimizations to predict x86 assembly.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Joseph Artsimovich
5ab21439fd
dnxhd: 10-bit support
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
a617c6aaa3
dsputil: update per-arch init funcs for non-h264 high bit depth
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
874f1a901d
dsputil: template get_pixels() for different bit depths
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
0a72533e98
jfdctint: add 10-bit version
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Mans Rullgard
e7a972e113
simple_idct: add 10-bit version
...
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Diego Biurrun
65083b4911
dsputil: remove disabled code
14 years ago
Martin Storsjö
8f62ef0f95
x86: Use LOCAL_ALIGNED in mpegvideo_mmx_template
...
Signed-off-by: Martin Storsjö <martin@martin.st>
14 years ago
Diego Biurrun
e0ae2174db
simple_idct: remove disabled code
14 years ago
Daniel Kang
ac4a85f476
H.264: Add more x86 assembly for 10-bit H.264 predict functions
...
Mainly ported from 8-bit H.264 predict.
Some code ported from x264. LGPL ok by author.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Jason Garrett-Glaser
b5bbc84fe2
H.264: add filter_mb_fast support for >8-bit decoding
...
Much faster high bit depth deblocking.
14 years ago
Mans Rullgard
710b8df949
dsputil: remove ff_emulated_edge_mc macro used in one place
...
This macro can cause problems in conjunction with the bitdepth
template expansion. It was presumably added to keep source
compatibility when high bitdepth support was added. However,
emulated_edge_mc is a dsputil pointer and should not be called
directly, so there is little reason to keep such a macro.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Daniel Kang
c0483d0c7a
H.264: Add x86 assembly for 10-bit H.264 predict functions
...
Mainly ported from 8-bit H.264 predict.
Some code ported from x264. LGPL ok by author.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Daniel Kang
3c7c16fde3
YASM: Shut up unused variable compiler warning with --disable-yasm.
...
Signed-off-by: Diego Biurrun <diego@biurrun.de>
14 years ago
Daniel Kang
567a32b5b2
x86_32: Fix build on x86_32 with --disable-yasm.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Daniel Kang
58f7aad051
Fix build with --disable-yasm.
...
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Daniel Kang
9bfa5363da
H.264: Add x86 assembly for 10-bit H.264 qpel functions.
...
Mainly ported from 8-bit H.264 qpel.
Some code ported from x264. LGPL ok by author.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Justin Ruggles
f99a5ef92e
ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents().
14 years ago