Eli Friedman
c12d6955e2
H.264: SSE2/SSSE3 weighted prediction asm
Patch by Eli Friedman <eli.friedman at gmail dot com>
Originally committed as revision 24702 to svn://svn.ffmpeg.org/ffmpeg/trunk
14 years ago
Måns Rullgård
f079a64aea
Move cavs dsp functions to their own struct
Originally committed as revision 24685 to svn://svn.ffmpeg.org/ffmpeg/trunk
14 years ago
Loren Merritt
c7b1d9768c
relicense h264 deblock sse2 to lgpl
Originally committed as revision 24408 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
David Conrad
c7eec58170
Move ff_pw_* from vc1dsp_mmx.c to dsputil_mmx.c
Should fix compilation with icc and should help prevent any future duplicates
Originally committed as revision 24380 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Ronald S. Bultje
e9e456d850
VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16) and chroma (width=8).
Originally committed as revision 24378 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Ronald S. Bultje
a711eb4829
VP8 H/V inner loopfilter MMX/MMXEXT/SSE2 optimizations.
Originally committed as revision 24250 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
David Conrad
7af8fbd348
Make ff_pw_4 128 bits
Originally committed as revision 24207 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Ronald S. Bultje
f2a30bd840
Simple H/V loopfilter for VP8 in MMX, MMX2 and SSE2 (yay for yasm macros).
Originally committed as revision 24029 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Eli Friedman
b3858964d6
Add const to some pointer parameters.
Patch by Eli Friedman, eli D friedman A gmail
Originally committed as revision 23826 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Jason Garrett-Glaser
4af8cdfc3f
16x16 and 8x8c x86 SIMD intra pred functions for VP8 and H.264
Originally committed as revision 23783 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
David Conrad
413abbe164
Add bitexact versions of put_no_rnd_pixels8 _x2 and _y2 for vp3/theora
Originally committed as revision 23463 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
David Conrad
eb6a6cd788
vp3: DC-only IDCT
2-4% faster overall decode
Originally committed as revision 22896 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Måns Rullgård
4693b031a3
Move H264 dsputil functions into their own struct
This moves the H264-specific functions from DSPContext to the new
H264DSPContext. The code is made conditional on CONFIG_H264DSP
which is set by the codecs requiring it.
The qpel and chroma MC functions are not moved as these are used by
non-h264 code.
Originally committed as revision 22565 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Måns Rullgård
05aec7bb87
Separate DWT from snow and dsputil
This moves the DWT functions from snow.c and dsputil.c to a file of
their own. A new struct, DWTContext, holds the function pointers
previously part of DSPContext.
Originally committed as revision 22522 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Måns Rullgård
f49747e904
x86: move function prototypes to header files
Originally committed as revision 22266 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Måns Rullgård
84dc2d8afa
Remove DECLARE_ALIGNED_{8,16} macros
These macros are redundant. All uses are replaced with the generic
DECLARE_ALIGNED macro instead.
Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
David Conrad
19530266a5
Enable SSE2 (put|avg)_pixels_16_sse2
SVQ1 chroma has been special-cased aligned to 16-bytes since at least r15466
Other architectures also assume 16-byte alignment here too but set STRIDE_ALIGN
to 16.
Originally committed as revision 21736 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Alex Converse
3deb53849e
Implement an sse version of scalarproduct_float().
Originally committed as revision 21386 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Måns Rullgård
c67278098d
Move array specifiers outside DECLARE_ALIGNED() invocations
Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Gwenole Beauchesne
5716aec3f9
Fix XvMC. XvMCCreateBlocks() may not allocate 16-byte aligned blocks, so we can't use SSE-optimized routines.
Originally committed as revision 21011 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Diego Biurrun
4052cbf161
Get rid of pointless CONFIG_ANY_H263 preprocessor definition.
Originally committed as revision 20975 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
91e644ff77
r20739 broke compilation on systems without yasm
Originally committed as revision 20742 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
b1159ad928
refactor and optimize scalarproduct
29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)
9-123% faster ape decoding on G4.
Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
b10fa1bb8b
port ape dsp functions from sse2 to mmx
now requires yasm
Originally committed as revision 20722 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
e17ccf60fe
huffyuv: add some const qualifiers
Originally committed as revision 20290 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
2f77923d72
simd add_hfyu_left_prediction
2.2x faster than C on conroe, 3.6x on penryn.
4-6% faster huffyuv decoding if using left or plane mode and yuv
Originally committed as revision 20287 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Måns Rullgård
35de5d2412
cosmetics: fix indentation after previous commit
Originally committed as revision 20062 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Måns Rullgård
952e872198
Drop unused args from vector_fmul_add_add, simplify code, and rename
The src3 and step arguments to vector_fmul_add_add() are always zero
and one, respectively. This removes these arguments from the function,
simplifies the code accordingly, and renames the function to better
match the new operation.
Originally committed as revision 20061 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Vitor Sessak
9263a05aab
Mark "i" parameter of vector_clipf_sse() as early-clobber
Originally committed as revision 19731 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Vitor Sessak
50e23ae9d3
Mark parameter src of vector_clipf() as const
Originally committed as revision 19729 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Vitor Sessak
0a68cd876e
SSE optimized vector_clipf(). 10% faster TwinVQ decoding.
Originally committed as revision 19728 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Diego Biurrun
9be6f0d2f8
Do not check for both CONFIG_VC1_DECODER and CONFIG_WMV3_DECODER, the former depends upon the latter.
Originally committed as revision 19533 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Diego Biurrun
99e5a9d1ea
Do not redundantly check for both CONFIG_THEORA_DECODER and CONFIG_VP3_DECODER.
The Theora decoder depends on the VP3 decoder.
Originally committed as revision 19492 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Carl Eugen Hoyos
36904c4c9f
Icc 11.1 still does not align the stack pointer, disable some x264 functions.
Originally committed as revision 19454 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Jason Garrett-Glaser
73b02e2460
SSE version of clear_blocks
Originally committed as revision 19206 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
David Conrad
c21c835b8d
avg_ pixel functions need to use (dst+pix+1)>>1 to average with existing pixels, not (dst+pix)>>1.
This makes the mmx functions bitexact with the C functions.
Originally committed as revision 18527 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
David Conrad
9bf0fdf378
VC1: extend MMX qpel MC to include MMX2 avg qpel
Originally committed as revision 18519 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
David Conrad
8013da7364
VC1: add and use avg_no_rnd chroma MC functions
Originally committed as revision 18518 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
David Conrad
c374691b28
Rename put_no_rnd_h264_chroma* to reflect its usage in VC1 only
Originally committed as revision 18517 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Stefano Sabatini
6b4343616c
Rename FF_MM_MMXEXT to FF_MM_MMX2, for both clarity and consistency with libswscale.
Originally committed as revision 18330 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Reimar Döffinger
0be9e73e38
Mark line_skip3 asm argument as output-only instead of using av_uninit.
Originally committed as revision 18327 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Reimar Döffinger
d7460a9cac
Mark put_signed_pixels_clamped_mmx output operands as early-clobber because they are. Hopefully fixes some FATE errors, too.
Originally committed as revision 18326 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Reimar Döffinger
531a3d2721
Use DECLARE_ASM_CONST for non-global ff_vector128 constant used via MANGLE
Originally committed as revision 18325 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Alex Converse
3dd6531208
Rewrite put_signed_pixels_clamped_mmx() to eliminate mmx.h from dsputil_mmx.c.
Originally committed as revision 18319 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Zuxy Meng
ecb24904fe
add SSE2 version of vp6_filter_diag
original patch by Zuxy Meng zuxy.meng _at_ gmail _dot_ com
Originally committed as revision 17195 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Sebastien Lucas
6af3c226c3
add MMX version of vp6_filter_diag
original patch by Sebastien Lucas sebastien.lucas _at_ gmail _dot_ com
Originally committed as revision 17194 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Aurelien Jacobs
5110b25e1e
convert ff_pw_64 into an xmm_reg for future use in vp6 sse code
Originally committed as revision 17192 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Diego Biurrun
d3a4b4e09c
Add check whether the compiler/assembler supports 10 or more operands.
thanks to Loren for some help with the asm statements
Originally committed as revision 17151 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Loren Merritt
3daa434a40
ff_add_hfyu_median_prediction_mmx2
overall ffvhuff decoding speedup: 28% on core2, 25% on k8.
Originally committed as revision 17059 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
David Conrad
137ae32760
Workaround for gcc 3.4 to align sh properly
Originally committed as revision 16797 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago