FFmpeg

Commit Graph

Author	SHA1	Message	Date
Diego Biurrun	a84edbacaf	x86: dsputil: Only compile motion_est code when encoders are enabled	12 years ago
Diego Biurrun	2e6f93a284	x86: Always compile files with functions that are called unconditionally	12 years ago
Diego Biurrun	bcc45d6348	x86: avcodec: Drop silly "_mmx" suffixes from filenames	12 years ago
Diego Biurrun	efbd04c332	x86: avcodec: Drop silly "_sse" suffixes from filenames	12 years ago
Diego Biurrun	3f02c533f3	build: fft: x86: Drop unused YASM-OBJS-FFT- variable	12 years ago
Diego Biurrun	dc40285427	x86: mpegvideo: more sensible names for optimization file and init function	12 years ago
Diego Biurrun	d211547ddd	x86: mpegvideoenc: Split optimizations off into a separate file	12 years ago
Diego Biurrun	26ce9aec03	dnxhdenc: x86: more sensible names for optimization file and init function	12 years ago
Diego Biurrun	6fa488678f	build: x86: Only compile mpegvideo optimizations when necessary	12 years ago
Diego Biurrun	6961bdface	x86: avcodec: Consistently name all init files	12 years ago
Diego Biurrun	29cfdd3767	x86: avcodec: Appropriately name files containing only init functions	12 years ago
Diego Biurrun	3b9e832e17	x86: Drop silly "_yasm" suffixes from filenames	12 years ago
Ronald S. Bultje	9f14cd91b5	fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64. 64-bit CPUs always have SSE available, thus there is no need to compile in the 3dnow functions. This results in smaller binaries.	12 years ago
Mans Rullgard	ec7c501ed5	x86: remove libmpeg2 mmx(ext) idct functions These functions are not faster than other mmx implementations on any hardware I have been able to test on, and they are horribly inaccurate. There is thus no reason to ever use them. Signed-off-by: Mans Rullgard <mans@mansr.com>	12 years ago
Ronald S. Bultje	b6a3849adb	fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64. 64-bit CPUs always have SSE available, thus there is no need to compile in the 3dnow functions. This results in smaller binaries.	12 years ago
Mans Rullgard	28f9ab7029	vp3: move idct and loop filter pointers to new vp3dsp context This moves all VP3-specific function pointers from dsputil to a new vp3dsp context. There is no reason to ever use the VP3 IDCT where an MPEG2 IDCT is expected or vice versa. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	ab9f987661	build: add CONFIG_VP3DSP, reduce repetition in OBJS lists Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Nicolas George	91765594dd	Revert "Revert "x86: fft: convert sse inline asm to yasm"" This reverts commit `fd91a3ec44`. The bug it introduced has been fixed.	13 years ago
Nicolas George	fd91a3ec44	Revert "x86: fft: convert sse inline asm to yasm" This reverts commit `8299260470`. It breaks shared builds on x86_64.	13 years ago
Mans Rullgard	8299260470	x86: fft: convert sse inline asm to yasm	13 years ago
Diego Biurrun	7bb3a302fe	build: Consistently handle conditional compilation for all optimization OBJS.	13 years ago
Diego Biurrun	ad0e31f134	build: prettyprinting cosmetics	13 years ago
Diego Biurrun	915a2a0a65	x86: conditionally compile H.264 QPEL optimizations	13 years ago
Christophe GISQUET	34454c761f	SBR DSP x86: implement SSE sbr_sum_square_sse The 32bits targets have been compiled with -mfpmath=sse for proper reference. sbr_sum_square C /32bits: 82c (unrolled)/102c C /64bits: 69c (unrolled)/82c SSE/32bits: 42c SSE/64bits: 31c Use of SSE4.1 dpps to perform the final sum is slower. Not unrolling to perform 8 operations in a loop yields 10 more cycles. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Ronald S. Bultje	7e4d9d5d45	win64: add a XMM clobber test configure option. This will be useful to test more aggressively for failures to mark XMM registers as clobbered in Win64 builds, and prevent regressions thereof. Based on a patch by Ramiro Polla <ramiro.polla@gmail.com>	13 years ago
Christophe Gisquet	e5c9de2ab7	rv40: x86 SIMD for biweight Provide MMX, SSE2 and SSSE3 versions, with a fast-path when the weights are multiples of 512 (which is often the case when the values round up nicely). *_TIMER report for the 16x16 and 8x8 cases: C: 9015 decicycles in 16, 524257 runs, 31 skips 2656 decicycles in 8, 524271 runs, 17 skips MMX: 4156 decicycles in 16, 262090 runs, 54 skips 1206 decicycles in 8, 262131 runs, 13 skips MMX on fast-path: 2760 decicycles in 16, 524222 runs, 66 skips 995 decicycles in 8, 524252 runs, 36 skips SSE2: 2163 decicycles in 16, 262131 runs, 13 skips 832 decicycles in 8, 262137 runs, 7 skips SSE2 with fast path: 1783 decicycles in 16, 524276 runs, 12 skips 711 decicycles in 8, 524283 runs, 5 skips SSSE3: 2117 decicycles in 16, 262136 runs, 8 skips 814 decicycles in 8, 262143 runs, 1 skips SSSE3 with fast path: 1315 decicycles in 16, 524285 runs, 3 skips 578 decicycles in 8, 524286 runs, 2 skips This means around a 4% speedup for some sequences. Signed-off-by: Diego Biurrun <diego@biurrun.de>	13 years ago
Diego Biurrun	91bafb52ae	x86: Give RV40 init file a more suitable name.	13 years ago
Ronald S. Bultje	59f474b49d	png: convert DSP functions to yasm.	13 years ago
Ronald S. Bultje	e92003514d	png: move DSP functions to their own DSP context.	13 years ago
Christophe GISQUET	3faa303a47	rv34: DC-only inverse transform When decoding coefficients, detect whether the block is DC-only, and take advantage of this knowledge to perform DC-only inverse transform. This is achieved by: - first, changing the 108x4 element modulo_three_table into a 108 element table (kind of base4), and accessing each value using mask and shifts. - then, checking low bits for 0 (as they represent the presence of higher frequency coefficients) Also provide x86 SIMD code for the DC-only inverse transform. Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>	13 years ago
Vitor Sessak	39df0c434c	mpegaudiodec: optimized iMDCT transform Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Diego Biurrun	30bbd5cbc0	x86: conditionally compile dnxhd encoder optimizations	13 years ago
Diego Biurrun	88b9735753	build: conditionally compile x86 H.264 chroma optimizations	13 years ago
Vitor Sessak	22e25c002e	mpegaudiodec: add SSE-optimized imdct36() Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	e8b891b7f0	dirac: enable diracdsp_mmx Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
multiple authors	5d50fcc549	DIRAC Decoder stable version, MMX support removed. Look for MMX_DISABLED to find the disabled functions. Authors of this code are Marco Gerards <marco@gnu.org> and David Conrad <lessen42@gmail.com> With changes from Jordi Ortiz <nenjordi@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Kieran Kunhya	44d27736fc	Add V210 SIMD Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Elvis Presley	bebaf4ea1f	prores: change license to LGPL, merge some parts. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Ronald S. Bultje	e3f530feca	prores: idct sse2/sse4 optimizations. ~3.0-3.5x as fast as original C version, 1.6x as fast overall.	13 years ago
Kostya Shishkov	d241f51e0f	Move RV3/4-specific DSP functions into their own context Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Daniel Kang	9bfa5363da	H.264: Add x86 assembly for 10-bit H.264 qpel functions. Mainly ported from 8-bit H.264 qpel. Some code ported from x264. LGPL ok by author. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	14 years ago
Daniel Kang	84e70ef004	h264: Add x86 assembly for 10-bit weight/biweight H.264 functions. Mainly ported from 8-bit H.264 weight/biweight. Signed-off-by: Diego Biurrun <diego@biurrun.de>	14 years ago
Daniel Kang	f188a1e0ca	H.264: Add x86 assembly for 10-bit MC Chroma H.264 functions. Mainly ported from 8-bit H.264 MC Chroma. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	14 years ago
Daniel Kang	a8d44f9dd5	Add x86 assembly for some 10-bit H.264 intra predict functions. Parts are inspired from the 8-bit H.264 predict code in Libav. Other parts ported from x264 with relicensing permission from author. Signed-off-by: Diego Biurrun <diego@biurrun.de>	14 years ago
Daniel Kang	836f47d34b	Add IDCT functions for 10-bit H.264. Ports the majority of IDCT functions for 10-bit H.264. Parts are inspired from 8-bit IDCT code in Libav; other parts ported from x264 with relicensing permission from author. Signed-off-by: Ronald S. Bultje <rbultje@google.com>	14 years ago
Vitor Sessak	3758eb0eb9	dct32: port SSE 32-point DCT to YASM	14 years ago
Mans Rullgard	0b5e44ed29	mpegaudiodsp: fix x86 and ppc makefiles Signed-off-by: Mans Rullgard <mans@mansr.com>	14 years ago
Jason Garrett-Glaser	9f3d6ca4f1	Port x86 10-bit H.264 deblock asm from x264	14 years ago
Baptiste Coudurier	6d4c49a2af	Move png mmx functions into x86/png_mmx.c, remove them from DSPContext. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	14 years ago
Mans Rullgard	a5444fee06	Add CONFIG_AC3DSP symbol to simplify makefiles Signed-off-by: Mans Rullgard <mans@mansr.com>	14 years ago

110 Commits (89715a3cf187c271f7cf4c230b23cd6f6d638e32)