FFmpeg

Commit Graph

Author	SHA1	Message	Date
Loren Merritt	7a1944b907	vf_hqdn3d: x86 asm 13% faster on penryn, 16% on sandybridge, 15% on bulldozer Not simd; a compiler should have generated this, but gcc didn't.	13 years ago
Justin Ruggles	6092dafb5a	lavr: x86: optimized 6-channel s16 to fltp conversion	13 years ago
Mans Rullgard	5b170c0bea	x86: remove FASTDIV inline asm GCC 4.3 and later do the right thing with the plain C code. Earlier versions in 32-bit mode generate one extra instruction, needlessly zeroing what would be the high half of the shifted value. At least two gcc configurations miscompile the inline asm in some situations. In 64-bit mode, all gcc versions generate imul r64, r64 followed by shr. On Intel i7 and later, this imul is faster than 32-bit mul. On older Intel and all AMD, it is slightly slower. On Atom it is much slower. Considering where the FASTDIV macro is used, any overall negative performance impact of this change should be negligible. If anyone cares, they should file a bug against gcc and get the instruction selection fixed. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Martin Storsjö	33e112847d	Add more missing includes after removing the implicit common.h Signed-off-by: Martin Storsjö <martin@martin.st>	13 years ago
Martin Storsjö	70766c2182	Add some more missing includes after removing the implicit common.h Signed-off-by: Martin Storsjö <martin@martin.st>	13 years ago
Mans Rullgard	070a402b60	x86: move MANGLE() and related macros to libavutil/x86/asm.h These x86-specific macros do not belong in generic code. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	c318626ce2	x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h This puts x86-specific things in the x86/ subdirectory where they belong. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Michael Niedermayer	c794acc44e	x86inc.asm: remove redundant ifdef __YASM_VER__ Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Mans Rullgard	edd8226795	x86: fix build with nasm 2.08 It appears that something goes wrong in old nasm versions when the %+ operator is used in the last argument of a macro invocation and this argument is tested with %ifdef within the macro. This patch rearranges the macro arguments such that the %+ operator is never used in the last argument.	13 years ago
Mans Rullgard	180d43bc67	x86: use nop cpu directives only if supported nasm does not support 'CPU foonop' directives. This adds a configure test for the directive and uses it only if supported. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	7238265052	x86: fix rNmp macros with nasm For some reason, nasm requires this. No harm done to yasm. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	a3df4781f4	x86: add colons after labels nasm prints a warning if the colon is missing. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Diego Biurrun	239fdf1b4a	x86: build: replace mmx2 by mmxext Refactoring mmx2/mmxext YASM code with cpuflags will force renames. So switching to a consistent naming scheme beforehand is sensible. The name "mmxext" is more official and widespread and also the name of the CPU flag, as reported e.g. by the Linux kernel.	13 years ago
Diego Biurrun	ca844b7be9	x86: Use consistent 3dnowext function and macro name suffixes Currently there is a wild mix of 3dn2/3dnow2/3dnowext. Switching to "3dnowext", which is a more common name of the CPU flag, as reported e.g. by the Linux kernel, unifies this.	13 years ago
Loren Merritt	f8d8fe255d	x86inc: clip num_args to 7 on x86-32. This allows us to unconditionally set the cglobal num_args parameter to a bigger value, thus making writing yasm code even easier than before. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Ronald S. Bultje	96c9cc1094	x86inc: sync to latest version from x264.	13 years ago
Justin Ruggles	79687079a9	x86: add support for fmaddps fma4 instruction with abstraction to avx/sse	13 years ago
Ronald S. Bultje	30b45d9c38	x86inc: automatically insert vzeroupper for YMM functions.	13 years ago
Jason Garrett-Glaser	85a3c19ed1	dsputil: x86: add SHUFFLE_MASK_W macro Simplifies pshufb masks that operate on words.	13 years ago
Mans Rullgard	e346176de9	x86: cpu: clean up check for cpuid instruction support This adds macros for accessing the EFLAGS register and uses these instead of coding the entire check in inline asm. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Ronald S. Bultje	358d854df8	x86/cpu: implement get/set_eflags using intrinsics Signed-off-by: Diego Biurrun <diego@biurrun.de> Signed-off-by: Martin Storsjö <martin@martin.st>	13 years ago
Ronald S. Bultje	c0ee695bd7	x86/cpu: implement support for cpuid through intrinsics Signed-off-by: Martin Storsjö <martin@martin.st>	13 years ago
Ronald S. Bultje	3f150ffba3	x86/cpu: implement support for xgetbv through intrinsics Signed-off-by: Martin Storsjö <martin@martin.st>	13 years ago
Clément Bœsch	7073174551	x86inc: put basicnop under ifdef to prevent compile failure. This should fix the NASM box. Reviewed-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Ronald S. Bultje	07b287020c	x86/timer: implement an intrinsic-based version for rdtsc (AV_READ_TIME).	13 years ago
Michael Niedermayer	dc12f7d4ec	x86inc: try to put amdnop under ifdef to prevent compile failure based on similar amdnop usage in ffmpeg Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Loren Merritt	4d4752366f	x86inc: add SPLATB_LOAD, SPLATB_REG, PSHUFLW macros Signed-off-by: Diego Biurrun <diego@biurrun.de>	13 years ago
Loren Merritt	2cd1f5cadc	x86inc: modify ALIGN to not generate long nops on i586 Signed-off-by: Diego Biurrun <diego@biurrun.de>	13 years ago
Mans Rullgard	889c1ec4cc	x86: cpu: clean up check for cpuid instruction support This adds macros for accessing the EFLAGS register and uses these instead of coding the entire check in inline asm. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
yang	9b72041f80	x86/intmath.h: Fix mull operand constraints Fixes Ticket1466 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Mans Rullgard	963cdf39b4	x86: cpu: whitespace (mostly) cosmetics This adds whitespace around operators, aligns line continuation backslashes, and breaks long lines. Also fixes an ifdef halfway through a statement. The one line of duplication this saved is not worth the ugliness. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Ronald S. Bultje	8123e0901f	x86: place some inline asm under #if HAVE_INLINE_ASM Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Diego Biurrun	65345a5a30	x86: Add CPU flag for the i686 cmov instruction	13 years ago
Michael Niedermayer	97726e86be	x86/intmath: fix type of FASTDIV Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Justin Ruggles	82b2df9790	float_dsp: add x86-optimized functions for vector_fmac_scalar()	13 years ago
Michael Niedermayer	f0313e9022	x86/float_dsp.asm: restore author attribution The attribution was removed by libav while moving the code to libavutil The original code is from commit `eb4825b5d4` Author: Loren Merritt <lorenm@u.washington.edu> Date: Thu Aug 10 19:06:25 2006 +0000 sse and 3dnow implementations of float->int conversion and mdct windowing. 15% faster vorbis. and commit `069720565c` Author: Loren Merritt <lorenm@u.washington.edu> Date: Fri Aug 11 18:19:37 2006 +0000 vorbis simd tweaks Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Justin Ruggles	d5a7229ba4	Add a float DSP framework to libavutil Move vector_fmul() from DSPContext to AVFloatDSPContext.	13 years ago
Vitor Sessak	4a301706fd	x86: Avoid movs on BUTTERFLYPS when in AVX mode Signed-off-by: Janne Grunau <janne-libav@jannau.net>	13 years ago
Justin Ruggles	5cc6d5244d	lavr: replace the SSE version of ff_conv_fltp_to_flt_6ch() with SSE4 and AVX The current SSE version is slower than the MMX version on Athlon64 and Sandy Bridge, but the SSE4 and AVX versions are faster on Sandy Bridge.	13 years ago
Justin Ruggles	c8af852b97	Add libavresample This is a new library for audio sample format, channel layout, and sample rate conversion.	13 years ago
Reimar Döffinger	9b1f776d75	Fix compilation with NASM. Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>	13 years ago
Nico Weber	a4a88fd42c	Remove .rodata alignment kludge for Mach-O if a recent enough yasm is used. Yasm was fixed in its r2161 and yasm 0.8.0 (Apr 2010) contained this fix. Nasm was fixed in 2.06 (Jun 2009): https://groups.google.com/group/alt.lang.asm/browse_thread/thread/fcc85bbc3745d893 I tested with yasm 0.7.99 and yasm 1.2.0.7, where this works fine. I also tested with nasm. The nasm shipping with Xcode is too old to understand ffmpeg's assembly, before and after the patch. Nasm 2.10 fails to compile fft_mmx.asm on trunk with libavcodec/x86/fft_mmx.asm:88: panic: section ".text" has already been specified with alignment 32, conflicts with new alignment of 16 but builds fine if I change the two alignment "16"s in x86inc.asm to "32". With this patch, nasm 2.10 fails with libavcodec/x86/fft_mmx.asm:39: panic: section ".rodata" has already been specified with alignment 32, conflicts with new alignment of 16 instead, but again builds fine with s/16/32/. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Loren Merritt	705f3d4759	x86inc: support AVX abstraction for 2-operand instructions Add cvtdq2ps and cvtps2dq to the AVX instruction list. Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>	13 years ago
Diego Biurrun	baaab6069a	build: Move all arch OBJS declarations into arch subdirectory Makefiles.	13 years ago
Henrik Gramner	729f90e268	x86inc improvements for 64-bit Add support for all x86-64 registers Prefer caller-saved register over callee-saved on WIN64 Support up to 15 function arguments Also (by Ronald S. Bultje) Fix up our asm to work with new x86inc.asm. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>	13 years ago
Ronald S. Bultje	98b9da2ac7	x86inc: add *mp named argument support to DEFINE_ARGS.	13 years ago
Loren Merritt	0f53d0cf4b	x86inc: don't "bake" stack_offset in named arguments. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Reimar Döffinger	b223035511	Detect and check for CMOV. Some MMX-only CPUs do not have support for CMOV. All SSE/MMX2 CPUs should be fine, thus no check was added to those functions. See also https://sourceforge.net/tracker/?func=detail&aid=3358347&group_id=205275&atid=992986 Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>	13 years ago
Haruhiko Yamagata	166f399377	x86inc: support yasm -f win64 flag also. This sets __OUTPUT_FORMAT__ to win64 instead of win32, even though both (through -m amd64) produce 64-bit binary code. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Henrik Gramner	9cf7385309	x86inc: allow manual use of WIN64_SPILL_XMM. Functions using INIT_MMX may still access XMM registers through direct means (xmm0-15). Therefore, they still need to be marked for clobber so they can be properly saved/restored. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago

303 Commits (927696aab258c7184ceac9765a305b5f91eef8dc)