70 Commits (ba4aa656ce1c4e530bec4ed1b0fcf67eb20283f0)

Author SHA1 Message Date
Christophe GISQUET f9888520cc vp8dsp x86: perform rounding shift with a single instruction 13 years ago
Ronald S. Bultje a928ed3751 vp8: convert mbedge loopfilter x86 assembly to use named arguments. 13 years ago
Ronald S. Bultje bee330e300 vp8: convert inner loopfilter x86 assembly to use named arguments. 13 years ago
Ronald S. Bultje b4188f0d46 vp8: convert simple loopfilter x86 assembly to use named arguments. 13 years ago
Ronald S. Bultje 8476ca3b4e vp8: convert idct x86 assembly to use named arguments. 13 years ago
Ronald S. Bultje 21ffc78fd7 vp8: convert mc x86 assembly to use named arguments. 13 years ago
Ronald S. Bultje 28170f1a39 vp8: convert loopfilter x86 assembly to use cpuflags(). 13 years ago
Ronald S. Bultje e25be47154 vp8: convert idct/mc x86 assembly to use cpuflags(). 13 years ago
Ronald S. Bultje 45549339bc vp8: disable mmx functions with sse/sse2 counterparts on x86-64. 13 years ago
Kieran Kunhya b1766c170c Move x264asm to libavutil. 13 years ago
Dave Yeo cc73511e8e Fix NASM include directive 13 years ago
Ronald S. Bultje b2c087871d Move x86util.asm from libavcodec/ to libavutil/. 13 years ago
Ronald S. Bultje 3a39195b1d Move x86inc.asm to libavutil/. 13 years ago
Daniel Kang d0005d347d Modify x86util.asm to ease transitioning to 10-bit H.264 assembly. 14 years ago
Diego Biurrun 888fa31eca Fix FSF address copy paste error in some license headers. 14 years ago
Mans Rullgard 2912e87a6c Replace FFmpeg with Libav in licence headers 14 years ago
Reimar Döffinger b1c32fb5e5 Use "d" suffix for general-purpose registers used with movd. 14 years ago
Ronald S. Bultje 3611c45ab7 Mark xmm registers as clobbered in simple loopfilter. Should fix the last 14 years ago
Ronald S. Bultje 684d608bde Fix segfaults in VP8 SIMD code on Win64 (and FATE/win64 failures). 14 years ago
Jason Garrett-Glaser 827d43bb9d VP8: move zeroing of luma DC block into the WHT 15 years ago
Ronald S. Bultje 6341838f3c Use word-writing instead of dword-writing (with two cached but otherwise 15 years ago
Ronald S. Bultje ab4d031889 Use pmaddubsw for the mbedge_filter (>=ssse3), 6-10 cycles faster. 15 years ago
Jason Garrett-Glaser e25dee602f VP8: Much faster SSE2 MC 15 years ago
Ronald S. Bultje 48adb7e7a4 Enable no-loop memory/register saving for ssse3/sse4 also. 15 years ago
Ronald S. Bultje 2a180c69ea Save a register (or regsize of stackspace for x86-32) for the no-loop 15 years ago
Ronald S. Bultje bcd4aa6498 Use nested ifs instead of &&, which appears to not work with %ifidn (i.e. this 15 years ago
Ronald S. Bultje 2208053bd3 Split pextrw macro-spaghetti into several opt-specific macros, this will make 15 years ago
Ronald S. Bultje 6de5b7c6b8 Fix obvious bug in assignment. Somehow, the test vectors don't test this... 15 years ago
Ronald S. Bultje e3f7bf774c Fix SPLATB_REG mess. Used to be a if/elseif/elseif/elseif spaghetti, so this 15 years ago
Jason Garrett-Glaser 3ae079a3c8 VP8: optimize DC-only chroma case in the same way as luma. 15 years ago
Jason Garrett-Glaser 51c9156438 VP8 asm: cosmetics (spacing) 15 years ago
Jason Garrett-Glaser 8a467b2d44 VP8: 30% faster idct_mb 15 years ago
Jason Garrett-Glaser c25c776708 VP8: clear DCT blocks in iDCT instead of using clear_blocks. 15 years ago
Ronald S. Bultje dc5eec8085 Use pextrw for SSE4 mbedge filter result writing, speedup 5-10cycles on 15 years ago
Ronald S. Bultje 003243c3c2 Fix and enable horizontal >=SSE2 mbedge loopfilter. 15 years ago
Jason Garrett-Glaser 8731dbd890 Eliminate one instruction in VP8 dc_add_sse4 15 years ago
Jason Garrett-Glaser 7dd224a42d Various VP8 x86 deblocking speedups 15 years ago
Jason Garrett-Glaser b8b231b5dc Make mmx VP8 WHT faster 15 years ago
Ronald S. Bultje e9e456d850 VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16) 15 years ago
Ronald S. Bultje 268821e76e Chroma (width=8) inner loopfilter MMX/MMX2/SSE2 for VP8 decoder. 15 years ago
Ronald S. Bultje c60ed66dbe Revert r24339 (it causes fate failures on x86-64) - I'll figure out what's 15 years ago
Ronald S. Bultje 1878f685c0 Implement chroma (width=8) inner loopfilter MMX/MMX2/SSE2 functions. 15 years ago
Ronald S. Bultje fb9bdf048c Be more efficient with registers or stack memory. Saves 8/16 bytes stack 15 years ago
Ronald S. Bultje 3facfc99da Change function prototypes for width=8 inner and mbedge loopfilter functions 15 years ago
Ronald S. Bultje 819b2dd2b1 Attempt to fix x86-64 testsuite on fate. 15 years ago
Ronald S. Bultje 6f323f1251 Remove duplicate define. 15 years ago
Ronald S. Bultje 889b2c26ee Revert 24270, it contained some stuff that shouldn't have been in there. 15 years ago
Ronald S. Bultje 2356a7834b Remove duplicate define. 15 years ago
Ronald S. Bultje ede1b9665a Give x86 r%d registers names, this will simplify implementation of the chroma 15 years ago
Ronald S. Bultje 526e831a46 Change return statement, the REP_RET is a mistake since the else case (x86-64, 15 years ago