FFmpeg

Commit Graph

Author	SHA1	Message	Date
Ronald S. Bultje	8db00081a3	x86: hpeldsp: Move half-pel assembly from dsputil to hpeldsp Signed-off-by: Martin Storsjö <martin@martin.st>	12 years ago
Ronald S. Bultje	b93b27edb0	dsputil: Make dsputil selectable Signed-off-by: Martin Storsjö <martin@martin.st>	12 years ago
Ronald S. Bultje	610b18e2e3	x86: qpel: Move fullpel and l2 functions to a separate file This way, they can be shared between mpeg4qpel and h264qpel without requiring either one to be compiled unconditionally. Signed-off-by: Martin Storsjö <martin@martin.st>	12 years ago
Daniel Kang	9acd23d655	x86: dsputil: Fix h263 loop filter link error in some configurations This was caused by unconditionally referencing a conditionally compiled table. Now the code is also compiled conditionally. Signed-off-by: Diego Biurrun <diego@biurrun.de>	12 years ago
Martin Storsjö	a846dccb29	h264chroma: x86: Fix building with yasm disabled Signed-off-by: Martin Storsjö <martin@martin.st>	12 years ago
Diego Biurrun	79dad2a932	dsputil: Separate h264chroma	12 years ago
Daniel Kang	71155d7b41	dsputil: x86: Convert mpeg4 qpel and dsputil avg to yasm Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	12 years ago
Mans Rullgard	e9d817351b	dsputil: Separate h264 qpel The sh4 optimizations are removed, because the code is 100% identical to the C code, so it is unlikely to provide any real practical benefit. Signed-off-by: Diego Biurrun <diego@biurrun.de> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	12 years ago
Ronald S. Bultje	2e4bb99f4d	vorbisdsp: convert x86 simd functions from inline asm to yasm.	12 years ago
Ronald S. Bultje	fef906c77c	Move vorbis_inverse_coupling from dsputil to vorbisdspcontext. Conveniently (together with Justin's earlier patches), this makes our vorbis decoder entirely independent of dsputil.	12 years ago
Diego Biurrun	a0c5917f86	Drop Snow codec Snow is a toy codec with no real-world use and horrible code.	12 years ago
Ronald S. Bultje	8c53d39e7f	lavc: introduce VideoDSPContext Move some functions from dsputil. The idea is that videodsp contains functions that are useful for a large and varied set of video decoders. Currently, it contains emulated_edge_mc() and prefetch(). Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	12 years ago
Daniel Kang	610e00b359	x86: h264: Convert 8-bit QPEL inline assembly to YASM Signed-off-by: Diego Biurrun <diego@biurrun.de>	12 years ago
Janne Grunau	7e522859fc	x86: vc1: call ff_vc1dsp_init_x86() under if (ARCH_X86)	12 years ago
Janne Grunau	cb36febcbc	x86: cavs: call ff_cavsdsp_init_x86() under if (ARCH_X86)	12 years ago
Janne Grunau	f101eab1be	x86: call most of the x86 dsp init functions under if (ARCH_X86) Rename the called dsp init functions to *_init_x86.	12 years ago
Diego Biurrun	a84edbacaf	x86: dsputil: Only compile motion_est code when encoders are enabled	12 years ago
Diego Biurrun	2e6f93a284	x86: Always compile files with functions that are called unconditionally	12 years ago
Diego Biurrun	bcc45d6348	x86: avcodec: Drop silly "_mmx" suffixes from filenames	12 years ago
Diego Biurrun	efbd04c332	x86: avcodec: Drop silly "_sse" suffixes from filenames	12 years ago
Diego Biurrun	3f02c533f3	build: fft: x86: Drop unused YASM-OBJS-FFT- variable	12 years ago
Diego Biurrun	dc40285427	x86: mpegvideo: more sensible names for optimization file and init function	12 years ago
Diego Biurrun	d211547ddd	x86: mpegvideoenc: Split optimizations off into a separate file	12 years ago
Diego Biurrun	26ce9aec03	dnxhdenc: x86: more sensible names for optimization file and init function	12 years ago
Diego Biurrun	6fa488678f	build: x86: Only compile mpegvideo optimizations when necessary	12 years ago
Diego Biurrun	6961bdface	x86: avcodec: Consistently name all init files	12 years ago
Diego Biurrun	29cfdd3767	x86: avcodec: Appropriately name files containing only init functions	12 years ago
Diego Biurrun	3b9e832e17	x86: Drop silly "_yasm" suffixes from filenames	12 years ago
Mans Rullgard	ec7c501ed5	x86: remove libmpeg2 mmx(ext) idct functions These functions are not faster than other mmx implementations on any hardware I have been able to test on, and they are horribly inaccurate. There is thus no reason to ever use them. Signed-off-by: Mans Rullgard <mans@mansr.com>	12 years ago
Ronald S. Bultje	b6a3849adb	fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64. 64-bit CPUs always have SSE available, thus there is no need to compile in the 3dnow functions. This results in smaller binaries.	12 years ago
Mans Rullgard	28f9ab7029	vp3: move idct and loop filter pointers to new vp3dsp context This moves all VP3-specific function pointers from dsputil to a new vp3dsp context. There is no reason to ever use the VP3 IDCT where an MPEG2 IDCT is expected or vice versa. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	ab9f987661	build: add CONFIG_VP3DSP, reduce repetition in OBJS lists Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	8299260470	x86: fft: convert sse inline asm to yasm	13 years ago
Diego Biurrun	7bb3a302fe	build: Consistently handle conditional compilation for all optimization OBJS.	13 years ago
Diego Biurrun	ad0e31f134	build: prettyprinting cosmetics	13 years ago
Diego Biurrun	915a2a0a65	x86: conditionally compile H.264 QPEL optimizations	13 years ago
Christophe GISQUET	34454c761f	SBR DSP x86: implement SSE sbr_sum_square_sse The 32bits targets have been compiled with -mfpmath=sse for proper reference. sbr_sum_square: C /32bits: 82c (unrolled)/102c; C /64bits: 69c (unrolled)/82c; SSE/32bits: 42c; SSE/64bits: 31c. Use of SSE4.1 dpps to perform the final sum is slower. Not unrolling to perform 8 operations in a loop yields 10 more cycles. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Ronald S. Bultje	7e4d9d5d45	win64: add a XMM clobber test configure option. This will be useful to test more aggressively for failures to mark XMM registers as clobbered in Win64 builds, and prevent regressions thereof. Based on a patch by Ramiro Polla <ramiro.polla@gmail.com>	13 years ago
Christophe Gisquet	e5c9de2ab7	rv40: x86 SIMD for biweight Provide MMX, SSE2 and SSSE3 versions, with a fast-path when the weights are multiples of 512 (which is often the case when the values round up nicely). *_TIMER report for the 16x16 and 8x8 cases: C: 9015 decicycles in 16, 524257 runs, 31 skips; 2656 decicycles in 8, 524271 runs, 17 skips. MMX: 4156 decicycles in 16, 262090 runs, 54 skips; 1206 decicycles in 8, 262131 runs, 13 skips. MMX on fast-path: 2760 decicycles in 16, 524222 runs, 66 skips; 995 decicycles in 8, 524252 runs, 36 skips. SSE2: 2163 decicycles in 16, 262131 runs, 13 skips; 832 decicycles in 8, 262137 runs, 7 skips. SSE2 with fast path: 1783 decicycles in 16, 524276 runs, 12 skips; 711 decicycles in 8, 524283 runs, 5 skips. SSSE3: 2117 decicycles in 16, 262136 runs, 8 skips; 814 decicycles in 8, 262143 runs, 1 skips. SSSE3 with fast path: 1315 decicycles in 16, 524285 runs, 3 skips; 578 decicycles in 8, 524286 runs, 2 skips. This means around a 4% speedup for some sequences. Signed-off-by: Diego Biurrun <diego@biurrun.de>	13 years ago
Diego Biurrun	91bafb52ae	x86: Give RV40 init file a more suitable name.	13 years ago
Ronald S. Bultje	59f474b49d	png: convert DSP functions to yasm.	13 years ago
Ronald S. Bultje	e92003514d	png: move DSP functions to their own DSP context.	13 years ago
Christophe GISQUET	3faa303a47	rv34: DC-only inverse transform When decoding coefficients, detect whether the block is DC-only, and take advantage of this knowledge to perform DC-only inverse transform. This is achieved by: - first, changing the 108x4 element modulo_three_table into a 108 element table (kind of base4), and accessing each value using mask and shifts. - then, checking low bits for 0 (as they represent the presence of higher frequency coefficients) Also provide x86 SIMD code for the DC-only inverse transform. Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>	13 years ago
Vitor Sessak	39df0c434c	mpegaudiodec: optimized iMDCT transform Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Diego Biurrun	30bbd5cbc0	x86: conditionally compile dnxhd encoder optimizations	13 years ago
Diego Biurrun	88b9735753	build: conditionally compile x86 H.264 chroma optimizations	13 years ago
Ronald S. Bultje	e3f530feca	prores: idct sse2/sse4 optimizations. ~3.0-3.5x as fast as original C version, 1.6x as fast overall.	13 years ago
Kostya Shishkov	d241f51e0f	Move RV3/4-specific DSP functions into their own context Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Daniel Kang	9bfa5363da	H.264: Add x86 assembly for 10-bit H.264 qpel functions. Mainly ported from 8-bit H.264 qpel. Some code ported from x264. LGPL ok by author. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	14 years ago
Daniel Kang	84e70ef004	h264: Add x86 assembly for 10-bit weight/biweight H.264 functions. Mainly ported from 8-bit H.264 weight/biweight. Signed-off-by: Diego Biurrun <diego@biurrun.de>	14 years ago

135 Commits (1c6bb813284732d9a1acacfe99522d9f66ebf73e)