FFmpeg

Commit Graph

Author	SHA1	Message	Date
Diego Biurrun	efc7290eb6	x86: hpeldsp: Keep all rnd_template instantiations in hpeldsp_init There is no point in having a separate file just for the instantiation that provides the public functions.	11 years ago
Diego Biurrun	aba70bb538	Add missing headers to make template files compile (more) standalone	11 years ago
Diego Biurrun	d0aabeab23	x86: h264_qpel: Fix typo in CALL_2X_PIXELS macro invocation This fixes FATE with mmxext CPUFLAGS set.	11 years ago
Peter Ross	a490970af2	libavcodec/*/vp8dsp_init: indent Signed-off-by: Peter Ross <pross@xvid.org> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Peter Ross	89f2f5dbd7	On2 VP7 decoder Signed-off-by: Peter Ross <pross@xvid.org> Reviewed-by: BBB previous patch reviewed by jason Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	c25d2cd20b	avcodec/x86/mpegvideoenc_template: fix integer overflow Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	c8246d3766	avcodec/x86/h264_qpel: Fix typo introduced by `322a1dda97` Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	82dd1026cf	x86: dsputil: Move hpeldsp-related declarations to a separate header	11 years ago
Diego Biurrun	6655c933a8	x86: dsputil: Move fpel declarations to a separate header	11 years ago
Diego Biurrun	322a1dda97	dsputil: Refactor duplicated CALL_2X_PIXELS / PIXELS16 macros	11 years ago
Diego Biurrun	600b854ad8	imgconvert: Move ff_deinterlace_line_*_mmx declarations out of dsputil	11 years ago
Diego Biurrun	1a8d0cf77e	x86: dsputil: Move inline assembly macros to a separate header	11 years ago
Matt Oliver	cd5cf395f6	Additional icl inline asm fix. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	1cd107f637	avcodec/x86/snowdsp: add missing clobbers to inner_add_yblock_bw_8_obmc_16_bh_even_sse2() and inner_add_yblock_bw_16_obmc_32_sse2() Note, these functions are currently disabled Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	82bb304801	dsputil: Use correct type in me_cmp_func function pointer	11 years ago
Diego Biurrun	0e083d7e43	build: Group general components separate from de/encoders in arch Makefiles This is in line with how the top-level libavcodec Makefile is structured.	11 years ago
Diego Biurrun	5169e68895	dsputil: Propagate bit depth information to all (sub)init functions This avoids recalculating the value over and over again.	11 years ago
Carl Eugen Hoyos	57fdc74c34	Add one forgotten named inline asm operand in libavcodec/x86/motion_est.c.	11 years ago
Matt Oliver	8236747511	Automatically change MANGLE() into named inline asm operands when direct symbol reference in inline asm are not supported. This is part of the patch-set for intel C inline asm on windows support Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Matt Oliver	b2d3a45598	avcodec/x86/mlpdsp: Only use asm when non-local inline asm lables are supported This is part of the patch-set for intel C inline asm on windows support Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	aa1f38015c	x86/synth_filter: improve FMA version Replace mulps+subps with fnmaddps, resulting in two less instructions inside the inner loops. About 1% faster FMA3 performance. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Matt Oliver	b73aae6fe9	avcodec/x86/idct_sse2_xvid: move offsets out of MANGLE() Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Matt Oliver	9eb3f11c55	Add missing external declarations. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Matt Oliver	590805b7c3	Fixed 64bit conformance with mvzbl. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	db3f61a04f	x86: dsputil_init: Drop some unnecessary parentheses	11 years ago
Diego Biurrun	441b093915	x86: dsputil_init: K&R formatting cosmetics	11 years ago
Diego Biurrun	4cb4680c10	x86: dsputil_x86.h: K&R formatting cosmetics	11 years ago
Diego Biurrun	f8bbebecfd	x86: motion_est: K&R formatting cosmetics	11 years ago
Diego Biurrun	a36947c167	dsputilenc_mmx: K&R formatting cosmetics	11 years ago
Diego Biurrun	38675229a8	dsputil_mmx: K&R formatting cosmetics	11 years ago
Diego Biurrun	6a8b35dc88	dsputilenc_mmx: Merge two assignment blocks with identical conditions	11 years ago
Diego Biurrun	55519926ef	x86: Make function prototype comments in assembly code consistent This helps grepping for functions, among other things.	11 years ago
Diego Biurrun	edd1f833fa	x86: h264_idct_10_bit: Use proper type in function prototype comments	11 years ago
Diego Biurrun	831a118078	Update dsputil- and SIMD-related comments to match reality more closely	11 years ago
Diego Biurrun	17608f6ee3	x86: Add some more missing headers	11 years ago
Diego Biurrun	08dba0e1c3	x86: mpegvideoenc: Remove some remnants of the long-gone libmpeg2 IDCT	11 years ago
James Almer	9e0e1f9067	x86/dsputil: add emms to ff_scalarproduct_int16_mmxext() Also undo the changes to ra144enc.c from previous commits. Should fix ticket #3429 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	3bfdee00cd	x86: dcadsp: Fix linking with yasm and optimizations disabled Some optimized functions reference optimized symbols, so the functions must be explicitly disabled when those symbols are unavailable.	11 years ago
Diego Biurrun	3741aa37c2	x86: cabac: Use correct #includes to make header compile standalone	11 years ago
James Almer	7fd64e3e36	x86/synth_filter: add synth_filter_fma3 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	206167a295	x86/synth_filter: add missing HAVE_YASM guard Should fix compilation failures with --disable-yasm on some compilers Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	884e085d1e	x86/synth_filter: Revert the switch to float ops with SSE2 This reverts the changes `6467209836` and `68c3ed936a` did to the SSE2 version, which generated a hit of about 5 cycles. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	68c3ed936a	x86/synth_filter: add synth_filter_avx Sandy Bridge Win64: 180 cycles on ff_synth_filter_inner_sse2 150 cycles on ff_synth_filter_inner_avx Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	6467209836	x86/synth_filter: add synth_filter_sse Build only on x86_32 targets. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	2cdbcc0048	x86: synth filter float: implement SSE2 version Timings for Arrandale: C SSE win32: 2108 334 win64: 1152 322 Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with the jmp destination being aligned. Unrolling for ARCH_X86_64 is a 20 cycles gain. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	169243112c	x86: dcadsp: implement SSE lfe_dir Results for Arrandale/Windows: 32: 1670 -> 316 64: 728 -> 298 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	4cb6964244	dcadec: simplify decoding of VQ high frequencies The vector dequantization has a test in a loop preventing effective SIMD implementation. By moving it out of the loop, this loop can be DSPized. Therefore, modify the current DSP implementation. In particular, the DSP implementation no longer has to handle null loop sizes. The decode_hf implementations have following timings: For x86 Arrandale: C SSE SSE2 SSE4 win32: 260 162 119 104 win64: 242 N/A 89 72 The arm NEON optimizations follow in a later patch as external asm. The now unused check for the y modifier in arm inline asm is removed from configure.	11 years ago
Christophe Gisquet	08e3ea60ff	x86: synth filter float: implement SSE2 version Timings for Arrandale: C SSE win32: 2108 334 win64: 1152 322 Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with the jmp destination being aligned. Unrolling for ARCH_X86_64 is a 20 cycles gain. Signed-off-by: Janne Grunau <janne-libav@jannau.net>	11 years ago
Christophe Gisquet	ad507d7907	x86: dcadsp: implement SSE lfe_dir Results for Arrandale/Windows: 32: 1670 -> 316 64: 728 -> 298 Signed-off-by: Janne Grunau <janne-libav@jannau.net>	11 years ago
Diego Biurrun	b23650491f	prores: Use consistent names for DSP arch initialization functions	11 years ago

1641 Commits (98a6806fddc2a0e8f402c9ebd7497f4a8d20f536)