186 Commits (d93fdcbf5c0e70ca03aaad2581fd328f277cd7cb)

Author SHA1 Message Date
Jason Garrett-Glaser 98fe09df7b Add file missing in r24702 14 years ago
Eli Friedman c12d6955e2 H.264: SSE2/SSSE3 weighted prediction asm 14 years ago
Måns Rullgård f079a64aea Move cavs dsp functions to their own struct 14 years ago
Jason Garrett-Glaser 8b9b5e085f VP5/6/8: add one inline missed in r24677 14 years ago
Jason Garrett-Glaser 827d43bb9d VP8: move zeroing of luma DC block into the WHT 14 years ago
Ronald S. Bultje 6341838f3c Use word-writing instead of dword-writing (with two cached but otherwise 14 years ago
Vitor Sessak fa738b3ad1 Remove x86/mmx.h. It is not used anymore and has been deprecated for years. 14 years ago
Vitor Sessak de4bc44abb Convert deinterlacing MMX code to YASM 14 years ago
Vitor Sessak 740dfe7012 Fix compilation in x86_64. I broke it with r24580. 14 years ago
Vitor Sessak 2c3dda6838 Translate libmpeg2 MMX IDCT to plain asm 14 years ago
Ronald S. Bultje ab4d031889 Use pmaddubsw for the mbedge_filter (>=ssse3), 6-10 cycles faster. 15 years ago
Jason Garrett-Glaser e25dee602f VP8: Much faster SSE2 MC 15 years ago
Ronald S. Bultje 48adb7e7a4 Enable no-loop memory/register saving for ssse3/sse4 also. 15 years ago
Ronald S. Bultje 2a180c69ea Save a register (or regsize of stackspace for x86-32) for the no-loop 15 years ago
Ronald S. Bultje bcd4aa6498 Use nested ifs instead of &&, which appears to not work with %ifidn (i.e. this 15 years ago
Ronald S. Bultje 2208053bd3 Split pextrw macro-spaghetti into several opt-specific macros, this will make 15 years ago
Ronald S. Bultje 6de5b7c6b8 Fix obvious bug in assignment. Somehow, the test vectors don't test this... 15 years ago
Ronald S. Bultje e3f7bf774c Fix SPLATB_REG mess. Used to be a if/elseif/elseif/elseif spaghetti, so this 15 years ago
Eli Friedman 3611e7a309 Inline asm for VP56 arith coder 15 years ago
Jason Garrett-Glaser 3ae079a3c8 VP8: optimize DC-only chroma case in the same way as luma. 15 years ago
Jason Garrett-Glaser 51c9156438 VP8 asm: cosmetics (spacing) 15 years ago
Jason Garrett-Glaser 8a467b2d44 VP8: 30% faster idct_mb 15 years ago
Jason Garrett-Glaser c25c776708 VP8: clear DCT blocks in iDCT instead of using clear_blocks. 15 years ago
Ronald S. Bultje dc5eec8085 Use pextrw for SSE4 mbedge filter result writing, speedup 5-10cycles on 15 years ago
Ronald S. Bultje 003243c3c2 Fix and enable horizontal >=SSE2 mbedge loopfilter. 15 years ago
Loren Merritt c7b1d9768c relicense h264 deblock sse2 to lgpl 15 years ago
Loren Merritt 532e769701 sync yasm macros from x264 15 years ago
Jason Garrett-Glaser 8731dbd890 Eliminate one instruction in VP8 dc_add_sse4 15 years ago
Jason Garrett-Glaser 7dd224a42d Various VP8 x86 deblocking speedups 15 years ago
Jason Garrett-Glaser b8b231b5dc Make mmx VP8 WHT faster 15 years ago
David Conrad af521abc28 Add header declarations for mmx/sse constants missing them 15 years ago
David Conrad c7eec58170 Move ff_pw_* from vc1dsp_mmx.c to dsputil_mmx.c 15 years ago
Ronald S. Bultje e9e456d850 VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16) 15 years ago
Ronald S. Bultje 268821e76e Chroma (width=8) inner loopfilter MMX/MMX2/SSE2 for VP8 decoder. 15 years ago
Ronald S. Bultje c60ed66dbe Revert r24339 (it causes fate failures on x86-64) - I'll figure out what's 15 years ago
Ronald S. Bultje 6526976f0c Remove FF_MM_SSE2/3 flags for CPUs where this is generally not faster than 15 years ago
Ronald S. Bultje 1878f685c0 Implement chroma (width=8) inner loopfilter MMX/MMX2/SSE2 functions. 15 years ago
Ronald S. Bultje fb9bdf048c Be more efficient with registers or stack memory. Saves 8/16 bytes stack 15 years ago
Ronald S. Bultje 3facfc99da Change function prototypes for width=8 inner and mbedge loopfilter functions 15 years ago
Loren Merritt 1ee076b1b1 more credits to D. J. Bernstein for fft 15 years ago
Ronald S. Bultje 819b2dd2b1 Attempt to fix x86-64 testsuite on fate. 15 years ago
Ronald S. Bultje 6f323f1251 Remove duplicate define. 15 years ago
Ronald S. Bultje 889b2c26ee Revert 24270, it contained some stuff that shouldn't have been in there. 15 years ago
Ronald S. Bultje 2356a7834b Remove duplicate define. 15 years ago
Ronald S. Bultje ede1b9665a Give x86 r%d registers names, this will simplify implementation of the chroma 15 years ago
Ronald S. Bultje 526e831a46 Change return statement, the REP_RET is a mistake since the else case (x86-64, 15 years ago
Ronald S. Bultje a711eb4829 VP8 H/V inner loopfilter MMX/MMXEXT/SSE2 optimizations. 15 years ago
David Conrad faa26db28b MMX/SSE VC1 loop filter 15 years ago
David Conrad 7af8fbd348 Make ff_pw_4 128 bits 15 years ago
Vitor Sessak 881fd7a62f Move SSE optimized 32-point DCT to its own file. Should fix breakage with YASM 15 years ago