David Conrad
21383da8c4
Let ff_pw_8 be used as an SSE constant
...
Originally committed as revision 15052 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Loren Merritt
ebceaa1cd5
gcc chokes on the 7 registers needed for float_to_int16_interleave6 (even inside HAVE_7REGS), so write it in yasm
...
Originally committed as revision 14749 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Loren Merritt
ee46753739
gcc chokes on xmm constraints, so pessimize int32_to_float_fmul_scalar_sse a little
...
Originally committed as revision 14748 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Loren Merritt
675872382f
special case 6 channel version of float_to_int16_interleave
...
5% faster ac3
Originally committed as revision 14744 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Loren Merritt
911e21a306
simd int->float
...
20% faster ac3 if downmixing, 15% if not
Originally committed as revision 14743 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Loren Merritt
ac2e556456
simd downmix
...
13% faster ac3 if downmixing
Originally committed as revision 14742 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Loren Merritt
862b98d42c
cosmetics in dsp init
...
Originally committed as revision 14704 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Uoti Urpala
f769b746aa
Mark add_png_paeth_prediction_* functions which are only used within this file as static.
...
patch by Uoti Urpala, uoti.urpala pp1.inet fi
Originally committed as revision 14509 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
5eb0f2a425
float_to_int16_interleave: change src to an array of pointers instead of assuming it's contiguous.
...
this has no immediate effect, but will allow it to be used in more codecs.
Originally committed as revision 14252 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
4342a7f30b
10l, float_to_int16_interleave_sse/3dnow wrote the wrong samples
...
Originally committed as revision 14236 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
b9fa32082c
exploit mdct symmetry
...
2% faster vorbis on conroe, k8. 7% on celeron.
Originally committed as revision 14207 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
f27e1d645e
simplify vorbis windowing
...
Originally committed as revision 14205 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Kostya Shishkov
d7e1fc4254
SSE2 optimizations for Monkey's Audio decoder vector functions
...
Originally committed as revision 14161 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Michael Niedermayer
e98750c373
float_to_int16_sse2()
...
20% faster than sse
Originally committed as revision 14138 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Michael Niedermayer
35ee72b1d7
1 c-asm loop less and 1x unroll of float_to_int16_sse()
...
25% faster
Originally committed as revision 14104 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Michael Niedermayer
560fa9bf51
Fix x86-64
...
Originally committed as revision 14103 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Michael Niedermayer
63b737d4f9
don't use C-asm loops and unroll once float_to_int16_3dnow()
...
30% faster
Originally committed as revision 14102 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Reimar Döffinger
00eebe3d6a
Fix add_bytes_mmx and add_bytes_l2_mmx for w < 16
...
Originally committed as revision 13877 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Diego Biurrun
245976da2a
Use full path for #includes from another directory.
...
Originally committed as revision 13098 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Ramiro Polla
40d0e665d0
Do not misuse long as the size of a register in x86.
...
typedef x86_reg as the appropriate size and use it instead.
Originally committed as revision 13081 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Alexander Strange
f73a6393e7
Add a new xvid-style IDCT using SSE2.
...
Originally committed as revision 12843 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Alexander Strange
54a0b6e590
Add a header file to declare Xvid IDCT functions.
...
patch by Alexander Strange, astrange ithinksw com
Originally committed as revision 12794 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
ce53144bac
h264 chroma mc ssse3
...
width8: 180->92, width4: 78->63 cycles (core2)
Originally committed as revision 12661 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Zuxy Meng
9e8e6d318c
Add missed call to ff_cavsdsp_init_3dnow() in dsputil_init_mmx()
...
Originally committed as revision 12540 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Michael Niedermayer
943032b155
Hardcode register to prevent apparent miscompilation.
...
Fixes regression tests with gcc 2.95.
Originally committed as revision 12512 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Michael Niedermayer
dea00a4623
remove unused temp
...
Originally committed as revision 12511 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Aurelien Jacobs
5a6a9e78ab
move draw_edges() into dsputil
...
Originally committed as revision 12309 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Aurelien Jacobs
97d1d009e2
split encoding part of dsputil_mmx into its own file
...
Originally committed as revision 12223 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Reimar Döffinger
78d3d94f14
__asm __volatile -> asm volatile, improves code consistency and works (as far as that is possible) with the Sun C compiler.
...
Originally committed as revision 12188 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
4a9ca0a279
simd and unroll png_filter_row
...
cycles per 1000 pixels on core2:
left: 9211->5170
top: 9283->2138
avg: 12215->7611
paeth: 64024->17360
overall rgb png decoding speed: +45%
overall greyscale png decoding speed: +6%
Originally committed as revision 12164 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
1d67b037f7
sse2 h264 motion compensation. not new code, just separate out the cases that didn't need ssse3.
...
Originally committed as revision 11877 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
20d565be6d
put loop counter in a register if possible. makes some of the qpel functions 3% faster.
...
Originally committed as revision 11876 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
a2b7bc8e71
constant was excessively aligned
...
Originally committed as revision 11874 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
ddf969705f
ssse3 h264 motion compensation.
...
25% faster than mmx on core2, 35% if you discount fullpel, 4% overall decoding.
Originally committed as revision 11871 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
fa9b873e08
clean up an ugliness introduced in r11826. this syntax will require fewer changes when adding future sse2 code.
...
Originally committed as revision 11868 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
b2f775860b
reduce code duplication
...
Originally committed as revision 11863 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
b313e8159c
avg_pixels4_mmx2
...
Originally committed as revision 11829 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
6c01d0069d
use mmx2/3dnow avg functions in avg_qpel*_mc00
...
Originally committed as revision 11828 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Loren Merritt
ed5d7a531c
ff_h264_idct8_add_sse2.
...
compared to mmx, 217->126 cycles on core2, 262->220 on k8.
Originally committed as revision 11826 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Baptiste Coudurier
066e0cc50d
add parentheses, fix warning: i386/dsputil_mmx.c:2618: warning: suggest parentheses around arithmetic in operand of |
...
Originally committed as revision 11673 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Baptiste Coudurier
afa4778989
fix prototypes, remove warning: i386/dsputil_mmx.c:3594: warning: assignment from incompatible pointer type
...
Originally committed as revision 11672 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Reimar Döffinger
27215c6bf4
Use DECLARE_ALIGNED
...
Originally committed as revision 11630 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Christophe Gisquet
28748a9128
Factorize some duplicated code from CAVS and H.264 into a common file.
...
patch by Christophe Gisquet, christophe.gisquet free fr
Originally committed as revision 11504 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Christophe Gisquet
9fa3572903
add MMX version for put_no_rnd_h264_chroma_mc8_c, used in VC-1 decoding.
...
patch by Christophe GISQUET %christophe P gisquet A free P fr%
original thread:
date: Nov 25, 2007 12:35 AM
subject: Re: [FFmpeg-devel] MMX version for put_no_rnd_h264_chroma_mc8_c
Originally committed as revision 11298 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Diego Biurrun
9fbd14acb8
Fix typo in macro name: WARPER8_16_SQ --> WRAPPER8_16_SQ.
...
Originally committed as revision 11296 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Aurelien Jacobs
407c50a024
move FLAC mmx dsp to its own file
...
Originally committed as revision 11244 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Diego Biurrun
571bf37f6d
typo/clarification
...
Originally committed as revision 11201 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Vitor Sessak
52b541ad79
spelling
...
Originally committed as revision 11122 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Aurelien Jacobs
bb6cc730e5
remove some unused ff_p* vars from dsputil
...
Originally committed as revision 11106 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago
Aurelien Jacobs
dbb5fdbdc8
remove useless #ifdef around extern declaration
...
Originally committed as revision 11105 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago