FFmpeg

Author	SHA1	Message	Date
Lynne	bbe95f7353	x86: replace explicit REP_RETs with RETs From x86inc: > On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either > a branch or a branch target. So switch to a 2-byte form of ret in that case. > We can automatically detect "follows a branch", but not a branch target. > (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.) x86inc can automatically determine whether to use REP_RET rather than RET in most of these cases, so impact is minimal. Additionally, a few REP_RETs were used unnecessarily, despite the return being nowhere near a branch. The only CPUs affected were AMD K10s, made between 2007 and 2011, 16 years ago and 12 years ago, respectively. In the future, everyone involved with x86inc should consider dropping REP_RETs altogether.	2 years ago
Andreas Rheinhardt	abb85429f3	avcodec/me_cmp: Constify me_cmp_func buffer parameters Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	542765ce3e	avcodec/x86/me_cmp: Remove obsolete MMX(EXT) functions x64 always has MMX, MMXEXT, SSE and SSE2 and this means that some functions for MMX, MMXEXT and 3dnow are always overridden by other functions (unless one e.g. explicitly disables SSE2) for x64. So given that the only systems that benefit from these functions are truly ancient 32-bit x86s, they are removed. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
James Almer	844bef578e	avcodec/x86: add missing colon to labels Silences warnings with Nasm Signed-off-by: James Almer <jamrial@gmail.com>	10 years ago
James Almer	33c752be51	x86/me_cmp: port mmxext vsad functions to yasm Also add mmxext versions of vsad8 and vsad_intra8, and sse2 versions of vsad16 and vsad_intra16. Since vsad8 and vsad16 are not bitexact, they are accordingly marked as approximate. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	11 years ago
James Almer	77f9a81cca	x86/me_cmp: combine sad functions into a single macro No point in having the sad8 functions separate now that the loop is no longer unrolled. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	11 years ago
Michael Niedermayer	85f2c0124d	avcodec/x86/me_cmp: fix sad8xh This adds back support for 8x4 and 8x16. It does not support 8x2; I think nothing uses that. Found-by: ubitux Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	0456d169c4	x86/me_cmp: port mmxext and sse2 sad functions to yasm Also add a missing c->pix_abs[0][0] initialization, and sse2 versions of sad16_x2, sad16_y2 and sad16_xy2 (15% to 20% faster than mmxext). Since the _xy2 versions are not bitexact, they are accordingly marked as approximate. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	2d60444331	dsputil: Split motion estimation compare bits off into their own context	11 years ago
Diego Biurrun	f46bb608d9	dsputil: Split off pixel block routines into their own context	11 years ago
Diego Biurrun	c166148409	dsputil: Move pix_sum, pix_norm1, shrink function pointers to mpegvideoenc	11 years ago
Timothy Gu	108dec3055	x86: dsputilenc: convert hf_noise_mmx to yasm Signed-off-by: Timothy Gu <timothygu99@gmail.com> Several bugfixes by: Christophe Gisquet <christophe.gisquet@gmail.com> See: [FFmpeg-devel] [WIP] [PATCH 4/4] x86: dsputilenc: convert hf_noise_mmx to yasm Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Timothy Gu	154cee9292	x86: dsputilenc: convert ff_sse{8, 16}_mmx() to yasm Signed-off-by: Timothy Gu <timothygu99@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	02a3e327f1	x86/dsputilenc: add missing guards to ff_pix_sum16_xop XOP support was added in Yasm 1.0.0 and Nasm 2.06, and we still support older versions. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	05de4d3011	x86/dsputilenc: implement XOP version of pix_sum16 SSE2: 137 cycles XOP: 87 cycles Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	b50559fc0b	libavcodec/x86/dsputilenc: drop and 0xffff that should have become redundant Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	561bfc85eb	x86/dsputilenc: implement SSE2 versions of pix_{sum16, norm1} Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	5863207086	x86/dsputilenc: use HADDD in ff_sse16_sse2 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	e64e079ece	x86/dsputilenc: implement SSE2 version of diff_pixels Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	a0c5cd3475	avcodec/x86/dsputilenc: set the count of SSE registers correctly for get_pixels Found-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	a3950a90f6	Revert "x86: dsputilenc: convert ff_sse{8, 16}_mmx() to yasm" This reverts commit `ad733089b0`. It breaks with --disable-yasm. Revert requested by: Christophe Gisquet <christophe.gisquet@gmail.com>	11 years ago
Timothy Gu	ad733089b0	x86: dsputilenc: convert ff_sse{8, 16}_mmx() to yasm Signed-off-by: Timothy Gu <timothygu99@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	d94e255dd1	x86/dsputilenc: make the SUM_ABS_DCTELEM macro more readable Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	61eea421b2	x86/dsputilenc: port sum_abs_dctelem functions to yasm Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	82bb304801	dsputil: Use correct type in me_cmp_func function pointer	11 years ago
Diego Biurrun	55519926ef	x86: Make function prototype comments in assembly code consistent This helps grepping for functions, among other things.	11 years ago
Diego Biurrun	88bd7fdc82	Drop DCTELEM typedef It does not help as an abstraction and adds dsputil dependencies. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	12 years ago
Daniel Kang	9f00b1cbab	dsputilenc: x86: Convert pixel inline asm to yasm Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	12 years ago
Diego Biurrun	51969a652c	x86: ABS2: port to cpuflags	12 years ago
Diego Biurrun	5b4dfbffc2	x86: ABS1: port to cpuflags	12 years ago
Diego Biurrun	9b15c0a9b3	x86: dsputilenc: port to cpuflags	12 years ago
Diego Biurrun	26301caaa1	x86: mmx2 ---> mmxext in asm constructs	12 years ago
Diego Biurrun	588fafe7f3	x86: MMX2 ---> MMXEXT in macro names	12 years ago
Diego Biurrun	04581c8c77	x86: yasm: Use complete source path for macro helper %includes This is more consistent with the way we handle C #includes and it simplifies the build system.	12 years ago
Diego Biurrun	6860b4081d	x86: include x86inc.asm in x86util.asm This is necessary to allow refactoring some x86util macros with cpuflags.	12 years ago
Diego Biurrun	3b9e832e17	x86: Drop silly "_yasm" suffixes from filenames	13 years ago
Mans Rullgard	a3df4781f4	x86: add colons after labels nasm prints a warning if the colon is missing. Signed-off-by: Mans Rullgard <mans@mansr.com>	13 years ago
Ronald S. Bultje	3b15a6d742	config.asm: change %ifdef directives to %if directives. This allows combining multiple conditionals in a single statement.	13 years ago
Kieran Kunhya	b1766c170c	Move x264asm to libavutil. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	14 years ago
Dave Yeo	cc73511e8e	Fix NASM include directive Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	14 years ago
Ronald S. Bultje	b2c087871d	Move x86util.asm from libavcodec/ to libavutil/. This allows using it in swscale also.	14 years ago
Ronald S. Bultje	3a39195b1d	Move x86inc.asm to libavutil/. This allows using it in libswscale/ also.	14 years ago
Daniel Kang	d0005d347d	Modify x86util.asm to ease transitioning to 10-bit H.264 assembly. Arguments for variable size instructions are added to many macros, along with other various changes. The x86util.asm code was ported from x264. Signed-off-by: Diego Biurrun <diego@biurrun.de>	14 years ago
Diego Biurrun	888fa31eca	Fix FSF address copy paste error in some license headers.	14 years ago
Mans Rullgard	2912e87a6c	Replace FFmpeg with Libav in licence headers Signed-off-by: Mans Rullgard <mans@mansr.com>	14 years ago
Ronald S. Bultje	ada65af9d1	Don't access upper 32 bits of a 32-bit int on 64-bit systems. Originally committed as revision 25140 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Ronald S. Bultje	e2e341048e	Move hadamard_diff{,16}_{mmx,mmx2,sse2,ssse3}() from inline asm to yasm, which will hopefully solve the Win64/FATE failures caused by these functions. Originally committed as revision 25137 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago
Ronald S. Bultje	d0acc2d2e9	Move sse16_sse2() from inline asm to yasm. It is one of the functions causing Win64/FATE issues. Originally committed as revision 25136 to svn://svn.ffmpeg.org/ffmpeg/trunk	15 years ago

11 Commits (0cb733d276477185d2983005851e92c0bf9946e0)