FFmpeg

Commit Graph

Author	SHA1	Message	Date
Ronald S. Bultje	bde73f28af	mpegaudio: bury inline asm under HAVE_INLINE_ASM.	12 years ago
Ronald S. Bultje	30b45d9c38	x86inc: automatically insert vzeroupper for YMM functions.	12 years ago
Ronald S. Bultje	a1878a88a1	vp3: don't use calls to inline asm in yasm code. Mixing yasm and inline asm is a bad idea, since if either yasm or inline asm is not supported by your toolchain, all of the asm stops working. Thus, better to use either one or the other alone. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	12 years ago
Ronald S. Bultje	79195ce565	x86/dsputil: put inline asm under HAVE_INLINE_ASM. This allows compiling with compilers that don't support gcc-style inline assembly. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	12 years ago
Yang Wang	845e92fd6a	dsputil_mmx: fix incorrect assembly code In ff_put_pixels_clamped_mmx(), there are two assembly code blocks. In the first block (in the unrolled loop), the instructions "movq 8%3, %%mm1 \n\t", and so forth, have problems. From the above instruction, it is clear what the programmer wants: a load from p + 8. But this assembly code doesn’t guarantee that. It only works if the compiler puts p in a register to produce an instruction like this: "movq 8(%edi), %mm1". During compiler optimization, it is possible that the compiler will be able to constant propagate into p. Suppose p = &x[10000]. Then operand 3 can become 10000(%edi), where %edi holds &x. And the instruction becomes "movq 810000(%edx)". That is, it will stride by 810000 instead of 8. This will cause a segmentation fault. This error was fixed in the second block of the assembly code, but not in the unrolled loop. How to reproduce: This error is exposed when we build using Intel C++ Compiler, with IPO+PGO optimization enabled. Crashed when decoding an MJPEG video. Signed-off-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	12 years ago
yang	6a2bad2c4f	dsputil_mmx: fix incorrect assembly code In file libavcodec/x86/dsputil_mmx.c, function ff_put_pixels_clamped_mmx(), there are two assembly code blocks. In the first block (in the unrolled loop), the instructions "movq 8%3, %%mm1 \n\t" etc. have a problem. For the above instruction, it is clear what the programmer wants: a load from p + 8. But this assembly code doesn’t guarantee that. It only works if the compiler puts p in a register to produce an instruction like this: “movq 8(%edi), %mm1”. During compiler optimization, it is possible that the compiler will be able to constant propagate into p. Suppose p = &x[10000]. Then operand 3 can become 10000(%edi), where %edi holds &x. And the instruction becomes “movq 810000(%edx)”. That is, it will stride by 810000 instead of 8. This will cause a segmentation fault. This error was fixed in the second block of the assembly code, but not in the unrolled loop. How to reproduce: This error is exposed when we build ffmpeg using the Intel C++ Compiler with IPO+PGO optimization. ffmpeg crashed when decoding an MJPEG video. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Jason Garrett-Glaser	85a3c19ed1	dsputil: x86: add SHUFFLE_MASK_W macro Simplifies pshufb masks that operate on words.	12 years ago
Diego Biurrun	9f97af2688	x86: dsputil: drop some unused CPU flag debug code	13 years ago
Mans Rullgard	28f9ab7029	vp3: move idct and loop filter pointers to new vp3dsp context This moves all VP3-specific function pointers from dsputil to a new vp3dsp context. There is no reason to ever use the VP3 IDCT where an MPEG2 IDCT is expected or vice versa. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	ab9f987661	build: add CONFIG_VP3DSP, reduce repetition in OBJS lists Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Loren Merritt	e14052dbc8	x86: h264_intrapred: use newly introduced SPLAT* and PSHUFLW macros Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Martin Storsjö	f27386cdc7	x86: h264_intrapred: Don't add the 'd' suffix to the SPLATB_REG macro The SPLATB_REG macro already adds the 'd' suffix internally. This fixes building on Win64, which has been broken since `878e66902`. This worked for unix, where r2 happened to be rdx in this case, which with the first suffix rdxd was mapped to eax, and eaxd is defined back to eax. On win64 however, r2 happened to be R8 in this case, and R8d maps to R8D just fine, but there's no mapping for R8Dd to anything. Signed-off-by: Martin Storsjö <martin@martin.st>	13 years ago
Diego Biurrun	878e669029	x86: h264_intrapred: use newly introduced SPLAT* and PSHUFLW macros	13 years ago
Loren Merritt	4d4752366f	x86inc: add SPLATB_LOAD, SPLATB_REG, PSHUFLW macros Signed-off-by: Diego Biurrun <diego@biurrun.de>	13 years ago
Diego Biurrun	d20f133ef9	x86: h264_intrapred: port to cpuflag macros	13 years ago
Martin Storsjö	07eeeb1d4f	vp8: Add ifdef guards around the sse2 loopfilter in the sse2slow branch too This was missed in the previous commit in `70a1c800`. Signed-off-by: Martin Storsjö <martin@martin.st>	13 years ago
Martin Storsjö	70a1c8000f	vp8: loopfilter >=sse2 functions need aligned stack on x86-32. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Ronald S. Bultje	723b266d72	dsputilenc: group yasm and inline asm function pointer assignment.	13 years ago
Ronald S. Bultje	ceabc13f12	dsputilenc_mmx: split assignment of ff_sse16_sse2 to SSE2 section.	13 years ago
Ronald S. Bultje	66a02159ea	x86: fmtconvert: add special asm for float_to_int16_interleave_misc_* This gets rid of a variable-length array and a for loop in C code. Signed-off-by: Martin Storsjö <martin@martin.st>	13 years ago
Mans Rullgard	f2fd167835	x86: vc1: fix and enable optimised loop filter The problem is that the ssse3 psign instruction does the wrong thing here. Commit `ea60dfe` incorrectly removed a macro emulating this instruction for pre-ssse3 code. However, the emulation is incorrect, and the code relies on the behaviour of the macro. Specifically, the psign sets destination elements to zero where the corresponding source element is zero, whereas the emulation only negates destination elements where the source is negative. Furthermore, the PSIGNW_MMX macro in x86util.asm is totally bogus, which is why the original VC-1 code had an additional right shift when using it. Since the psign instruction cannot be used here, skip all the macro hell and use the working instruction sequence directly. None of this was noticed due to a stray return statement in ff_vc1dsp_init_mmx() which meant that only the mmx version of the loop filter was ever used (before being removed in `ea60dfe`). Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Christophe Gisquet	a5bfa66df5	x86: fft: replace call to memcpy by a loop The function call was a mess to handle, and memcpy cannot make the assumptions we do in the new code. Tested on an IMC sample: 430c -> 370c. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	37c3864ef7	x86: fft: elf64: fix PIC build In a 64-bit PIC build, external functions must be called through the PLT. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Nicolas George	d4c45b8adf	Revert "Revert "x86: fft: win64: fix stack alignment for memcpy() call"" This reverts commit `f767658414`. The bug it introduces has been fixed.	13 years ago
Nicolas George	91765594dd	Revert "Revert "x86: fft: convert sse inline asm to yasm"" This reverts commit `fd91a3ec44`. The bug it introduced has been fixed.	13 years ago
Nicolas George	fd91a3ec44	Revert "x86: fft: convert sse inline asm to yasm" This reverts commit `8299260470`. It breaks shared builds on x86_64.	13 years ago
Nicolas George	f767658414	Revert "x86: fft: win64: fix stack alignment for memcpy() call" This reverts commit `8725da49a2`. Necessary to revert `8299260470`.	13 years ago
Mans Rullgard	0595334892	x86: fft: elf64: fix PIC build In a 64-bit PIC build, external functions must be called through the PLT. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	8725da49a2	x86: fft: win64: fix stack alignment for memcpy() call	13 years ago
Mans Rullgard	8299260470	x86: fft: convert sse inline asm to yasm	13 years ago
Ronald S. Bultje	8123e0901f	x86: place some inline asm under #if HAVE_INLINE_ASM Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	0b6f973635	h264: use asm cabac reader under a generic condition This removes a dependency on implementation details from generic code and allows easy addition of the equivalent optimisation for other architectures than x86. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Diego Biurrun	fe07c9c6b5	x86: Only use optimizations with cmov if the CPU supports the instruction	13 years ago
Mans Rullgard	29686d6ea3	x86: remove unused inline asm macros from dsputil_mmx.h Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Mans Rullgard	685f5438bb	x86: move some inline asm macros to the only places they are used Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Michael Niedermayer	fba18ef8cc	x86/dsputil_mmx: support 4 sample edges Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Diego Biurrun	a5a93fa8f5	cosmetics: do not use full path for local headers	13 years ago
Ronald S. Bultje	d9669eab0b	dwt: remove variable-length arrays Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Michael Niedermayer	9946a6aa55	diracdsp: try to fix segfault This might fix Ticket1412 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	3b196bb737	libavcodec/x86/rv40dsp_init.c: add missing HAVE_YASM Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	915ec91e6b	libavcodec/x86/h264dsp_mmx.c: add forgotten HAVE_YASM Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	63bfee8796	libavcodec/x86/dwt.c: move some missed things under HAVE_YASM Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Justin Ruggles	d5a7229ba4	Add a float DSP framework to libavutil Move vector_fmul() from DSPContext to AVFloatDSPContext.	13 years ago
Vitor Sessak	bac0729d9e	x86: use new schema for ASM macros Signed-off-by: Janne Grunau <janne-libav@jannau.net>	13 years ago
Vitor Sessak	2fd5e70869	x86: use new schema for ASM macros Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Carl Eugen Hoyos	001d9d5e93	Fix compilation with --disable-everything.	13 years ago
Justin Ruggles	713548cbad	x86: lavc: use %if HAVE_AVX guards around AVX functions in yasm code. This is needed for older versions of yasm/nasm that do not support AVX. Signed-off-by: Diego Biurrun <diego@biurrun.de>	13 years ago
Kieran Kunhya	5ff01259a8	Convert vector_fmul range of functions to YASM and add AVX versions Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>	13 years ago
Michael Kostylev	6797d1948b	x86: rv40: Mark rv40_weight functions as MMX2; they use MMX2 instructions.	13 years ago
Justin Ruggles	95a98ab3f0	ac3dsp: simplify x86 versions of ac3_max_msb_abs_int16 Simplifies the code by using cpuflags and a new macro. Also fixes the invalid use of the MMX2 pshufw operation in the MMX-only function.	13 years ago

925 Commits (ca28cb5f8388e87751644999cb6350068987549b)