120 Commits (8d39d67a789f1695d4db6cd4845ec4920336437e)

Author SHA1 Message Date
Loren Merritt 75ca1a5f70 gmc_mmx tweaks 19 years ago
Loren Merritt 703c8195a8 mmx implementation of 3-point GMC. (5x faster than C) 19 years ago
Loren Merritt 513fbd8e5a prefetch pixels for future motion compensation. 2-5% faster h264. 19 years ago
Loren Merritt fdd3057981 added mmx implementation of h264_chroma_mc2 19 years ago
Robert Edele e8600e5edc add MMX and SSE versions of ff_snow_inner_add_yblock 19 years ago
Robert Edele 2c9a0285d4 snow mmx+sse2 optimizations, part 4 19 years ago
Robert Edele 4567b4bdab Add the mmx and sse2 implementations of ff_snow_vertical_compose(). 19 years ago
Loren Merritt 548a1c8a35 h264_idct8_add_mmx 19 years ago
Loren Merritt 6da971f160 h264_idct_add only needs mmx1 19 years ago
Loren Merritt ef9d1d1575 h264: special case dc-only idct. ~1% faster overall 19 years ago
Steve L'Homme 68b51e58ce MSVC-compatible __align8/__align16 declaration 19 years ago
Diego Biurrun 5509bffa88 Update licensing information: The FSF changed postal address. 19 years ago
Diego Biurrun bb270c0896 COSMETICS: tabs --> spaces, some prettyprinting 19 years ago
Diego Biurrun 115329f160 COSMETICS: Remove all trailing whitespace. 19 years ago
Loren Merritt ea15df8048 use sse16_sse2() in nsse 19 years ago
Loren Merritt a6624e21cb faster h264_chroma_mc8_mmx, added h264_chroma_mc4_mmx. 20 years ago
Loren Merritt b926572aa9 h264 mmx weighted prediction. up to 3% overall speedup. 20 years ago
Loren Merritt 5693c08356 sse2 16x16 sum squared diff (306=>268 cycles on a K8) 20 years ago
Michael Niedermayer 12e9668119 replace a few mov + psrlq with pshufw, there are more cases which could benefit from this but they would require us to duplicate some functions ... 20 years ago
Reimar Döffinger cd7af76d9e Fix compile without CONFIG_GPL, misplaced #endif caused a missing }. 20 years ago
Michael Niedermayer 84740d5980 xvids mmx&mmx2 idcts 20 years ago
Måns Rullgård 79396ac685 Kill some compiler warnings. Compiled code verified identical after changes. 20 years ago
Loren Merritt d2bb7db135 sort H.264 mmx dsp functions into their own file 20 years ago
Michael Niedermayer c26ae41db2 adding a few const 20 years ago
Loren Merritt 1d62fc8560 MMX for H.264 iDCT (adapted from x264) 20 years ago
Zoltán Hidvégi 3072f0cb2e MMX code for (put|avg)_h264_chroma_mc8 20 years ago
Loren Merritt 5cf08f2393 H.264 deblocking optimizations (mmx for chroma_bS4 case, convert existing cases to 8-bit math) 20 years ago
Michael Niedermayer 5773a74669 porting the mmx&sse2 (sse2 untested) vp3 idcts to the lavc idct API 20 years ago
Michael Niedermayer b178f758fa disabling vp3 mmx&mmx2 idcts, they must be ported over to the lavc idct API, ill port the vp3 c idct 20 years ago
Michael Niedermayer c998bdd9a0 fix PIC 20 years ago
Loren Merritt 42251a2a4f MMX for H.264 deblocking filter 20 years ago
Martin Drab 4d9ae03b09 optimization and gcc 4.0 bug workaround patch by (Martin Drab >drab kepler.fjfi.cvut cz<) 20 years ago
Aurelien Jacobs 053dea12f2 adapting existing mmx/mmx2/sse/3dnow optimizations so they work on x86_64 patch by (Aurelien Jacobs <aurel at gnuage dot org>) 21 years ago
Michael Niedermayer 178fcca848 1/2 resolution decoding 21 years ago
Michael Niedermayer e69538fa60 h264_qpel8_hv_lowpass_mmx2/3dnow 21 years ago
Michael Niedermayer e772bb8a82 h264_qpel4_hv_lowpass_mmx2/3dnow 21 years ago
Michael Niedermayer 56d8bd5659 optimization 21 years ago
Michael Niedermayer a6e39f45a2 optimization 21 years ago
Michael Niedermayer ed8ffdf46c optimization 21 years ago
Michael Niedermayer 437525c473 h264 luma motion compensation in mmx2/3dnow 21 years ago
Michael Niedermayer d6af6b0350 10000l fix and use more mmx2/3dnow code for mpeg4 qpel which has been written and commited long time ago but appearently never used, qpel motion compensation is 5% faster 21 years ago
Michael Niedermayer 1ec4df0fa8 sse8 and nsse in mmx 21 years ago
Michael Niedermayer e96682e6f4 some of the warning fixes by (Michael Roitzsch <mroi at users dot sourceforge dot net>) 21 years ago
Mike Melanson 7daabccb5d move the 0x80 vector outside of the function, thus saving the compiler 21 years ago
Mike Melanson f9ed9d8584 separate out put_signed_pixels_clamped() into its own function and 21 years ago
Mike Melanson 116824d0aa reorganize and simplify the VP3 IDCT stuff 21 years ago
Mike Melanson 38acbc3cb9 hook up support for SSE2-optimized VP3 IDCT 21 years ago
Mike Melanson 01456e8e86 use optimized VP3 functions where appropriate 21 years ago
Dmitry Baryshkov 5c0513bda7 attribute used patch by (mitya at school dot ioffe dot ru (Dmitry Baryshkov)) 21 years ago
Michael Niedermayer 364a179749 quantizer noise shaping optimization 21 years ago