Code mostly inspired by vp8's MC, however:
- its MMX2 horizontal filter is worse because it can't take advantage of
the coefficient redundancy
- that same coefficient redundancy allows better code for non-SSSE3 versions
Benchmark (rounded to tens of unit):
V8x8 H8x8 2D8x8 V16x16 H16x16 2D16x16
C 445 358 985 1785 1559 3280
MMX* 219 271 478 714 929 1443
SSE2 131 158 294 425 515 892
SSSE3 120 122 248 387 390 763
End result is overall around a 15% speedup for SSSE3 version (on 6 sequences);
all loop filter functions now take around 55% of decoding time, while luma MC
dsp functions are around 6%, chroma ones are 1.3% and biweight around 2.3%.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
format), LGPL'ed with permission from Jason and Loren. This includes mmx2
code, so remove inline asm from h264dsp_mmx.c accordingly.
Originally committed as revision 25031 to svn://svn.ffmpeg.org/ffmpeg/trunk
still #included in dsputil_mmx.c and is part of DSPContext, and h264dsp_mmx.c,
which represents H264DSPContext and is now compiled on its own.
Originally committed as revision 25018 to svn://svn.ffmpeg.org/ffmpeg/trunk
This moves the DWT functions from snow.c and dsputil.c to a file of
their own. A new struct, DWTContext, holds the function pointers
previously part of DSPContext.
Originally committed as revision 22522 to svn://svn.ffmpeg.org/ffmpeg/trunk
It contains optimizations that are not specific to i386 and
libavutil uses this naming scheme already.
Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk
Neither the asm() nor the __asm__() keyword is part of the C99
standard, but while GCC accepts the former in C89 syntax, it is not
accepted in C99 unless GNU extensions are turned on (with -fasm). The
latter form is accepted in any syntax as an extension (without
requiring further command-line options).
Sun Studio C99 compiler also does not accept asm() while accepting
__asm__(), albeit reporting warnings that it's not valid C99 syntax.
Originally committed as revision 15627 to svn://svn.ffmpeg.org/ffmpeg/trunk
Consistently apply this rule: the guard name is obtained from the
filename by stripping the leading "lib", converting '/' and '.' to
'_' and uppercasing the resulting name. Guard names in the root
directory have to be prefixed by "FFMPEG_".
Originally committed as revision 15120 to svn://svn.ffmpeg.org/ffmpeg/trunk
Patch by Victor Pollex victor pollex web de
Original thread: [PATCH] mmx implementation of vc-1 inverse transformations
Date: 06/21/2008 03:37 PM
Originally committed as revision 14108 to svn://svn.ffmpeg.org/ffmpeg/trunk
25% faster tham mmx on core2, 35% if you discount fullpel, 4% overall decoding.
Originally committed as revision 11871 to svn://svn.ffmpeg.org/ffmpeg/trunk