80 Commits (b7d77f8e64d5e30982671e861f63654709111a8e)

Author SHA1 Message Date
Ronald S. Bultje ce58642ed0 x86inc: support stack mem allocation and re-alignment in PROLOGUE. 12 years ago
Ronald S. Bultje 6f40e9f070 x86inc: support stack mem allocation and re-alignment in PROLOGUE 12 years ago
Diego Biurrun 26301caaa1 x86: mmx2 ---> mmxext in asm constructs 12 years ago
Diego Biurrun 04581c8c77 x86: yasm: Use complete source path for macro helper %includes 12 years ago
Diego Biurrun 6860b4081d x86: include x86inc.asm in x86util.asm 12 years ago
Mans Rullgard a3df4781f4 x86: add colons after labels 13 years ago
Loren Merritt 4d4752366f x86inc: add SPLATB_LOAD, SPLATB_REG, PSHUFLW macros 13 years ago
Christophe GISQUET f9888520cc vp8dsp x86: perform rounding shift with a single instruction 13 years ago
Ronald S. Bultje a928ed3751 vp8: convert mbedge loopfilter x86 assembly to use named arguments. 13 years ago
Ronald S. Bultje bee330e300 vp8: convert inner loopfilter x86 assembly to use named arguments. 13 years ago
Ronald S. Bultje b4188f0d46 vp8: convert simple loopfilter x86 assembly to use named arguments. 13 years ago
Ronald S. Bultje 8476ca3b4e vp8: convert idct x86 assembly to use named arguments. 13 years ago
Ronald S. Bultje 21ffc78fd7 vp8: convert mc x86 assembly to use named arguments. 13 years ago
Ronald S. Bultje 28170f1a39 vp8: convert loopfilter x86 assembly to use cpuflags(). 13 years ago
Ronald S. Bultje e25be47154 vp8: convert idct/mc x86 assembly to use cpuflags(). 13 years ago
Ronald S. Bultje 45549339bc vp8: disable mmx functions with sse/sse2 counterparts on x86-64. 13 years ago
Kieran Kunhya b1766c170c Move x264asm to libavutil. 13 years ago
Dave Yeo cc73511e8e Fix NASM include directive 14 years ago
Ronald S. Bultje b2c087871d Move x86util.asm from libavcodec/ to libavutil/. 14 years ago
Ronald S. Bultje 3a39195b1d Move x86inc.asm to libavutil/. 14 years ago
Daniel Kang d0005d347d Modify x86util.asm to ease transitioning to 10-bit H.264 assembly. 14 years ago
Diego Biurrun 888fa31eca Fix FSF address copy paste error in some license headers. 14 years ago
Mans Rullgard 2912e87a6c Replace FFmpeg with Libav in licence headers 14 years ago
Reimar Döffinger b1c32fb5e5 Use "d" suffix for general-purpose registers used with movd. 15 years ago
Ronald S. Bultje 3611c45ab7 Mark xmm registers as clobbered in simple loopfilter. Should fix the last 15 years ago
Ronald S. Bultje 684d608bde Fix segfaults in VP8 SIMD code on Win64 (and FATE/win64 failures). 15 years ago
Jason Garrett-Glaser 827d43bb9d VP8: move zeroing of luma DC block into the WHT 15 years ago
Ronald S. Bultje 6341838f3c Use word-writing instead of dword-writing (with two cached but otherwise 15 years ago
Ronald S. Bultje ab4d031889 Use pmaddubsw for the mbedge_filter (>=ssse3), 6-10 cycles faster. 15 years ago
Jason Garrett-Glaser e25dee602f VP8: Much faster SSE2 MC 15 years ago
Ronald S. Bultje 48adb7e7a4 Enable no-loop memory/register saving for ssse3/sse4 also. 15 years ago
Ronald S. Bultje 2a180c69ea Save a register (or regsize of stackspace for x86-32) for the no-loop 15 years ago
Ronald S. Bultje bcd4aa6498 Use nested ifs instead of &&, which appears to not work with %ifidn (i.e. this 15 years ago
Ronald S. Bultje 2208053bd3 Split pextrw macro-spaghetti into several opt-specific macros, this will make 15 years ago
Ronald S. Bultje 6de5b7c6b8 Fix obvious bug in assignment. Somehow, the test vectors don't test this... 15 years ago
Ronald S. Bultje e3f7bf774c Fix SPLATB_REG mess. Used to be a if/elseif/elseif/elseif spaghetti, so this 15 years ago
Jason Garrett-Glaser 3ae079a3c8 VP8: optimize DC-only chroma case in the same way as luma. 15 years ago
Jason Garrett-Glaser 51c9156438 VP8 asm: cosmetics (spacing) 15 years ago
Jason Garrett-Glaser 8a467b2d44 VP8: 30% faster idct_mb 15 years ago
Jason Garrett-Glaser c25c776708 VP8: clear DCT blocks in iDCT instead of using clear_blocks. 15 years ago
Ronald S. Bultje dc5eec8085 Use pextrw for SSE4 mbedge filter result writing, speedup 5-10cycles on 15 years ago
Ronald S. Bultje 003243c3c2 Fix and enable horizontal >=SSE2 mbedge loopfilter. 15 years ago
Jason Garrett-Glaser 8731dbd890 Eliminate one instruction in VP8 dc_add_sse4 15 years ago
Jason Garrett-Glaser 7dd224a42d Various VP8 x86 deblocking speedups 15 years ago
Jason Garrett-Glaser b8b231b5dc Make mmx VP8 WHT faster 15 years ago
Ronald S. Bultje e9e456d850 VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16) 15 years ago
Ronald S. Bultje 268821e76e Chroma (width=8) inner loopfilter MMX/MMX2/SSE2 for VP8 decoder. 15 years ago
Ronald S. Bultje c60ed66dbe Revert r24339 (it causes fate failures on x86-64) - I'll figure out what's 15 years ago
Ronald S. Bultje 1878f685c0 Implement chroma (width=8) inner loopfilter MMX/MMX2/SSE2 functions. 15 years ago
Ronald S. Bultje fb9bdf048c Be more efficient with registers or stack memory. Saves 8/16 bytes stack 15 years ago