FFmpeg

Commit Graph

Author	SHA1	Message	Date
Diego Biurrun	fd502f4f5f	build: Generalize yasm/nasm-related variable names None of them are specific to the YASM assembler. (Cherry-picked from libav commit `39e208f4d4`) Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
Diego Biurrun	3cba09e522	x86: Drop stray semicolons after function definitions libavcodec/x86/rv40dsp_init.c:97:2: warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic] libavcodec/x86/vp9dsp_init.c:94:40: warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]	8 years ago
Diego Biurrun	e4a94d8b36	h264chroma: Change type of stride parameters to ptrdiff_t This avoids SIMD-optimized functions having to sign-extend their stride argument manually to be able to do pointer arithmetic.	8 years ago
Ganesh Ajjanagadde	38f4e973ef	all: fix -Wextra-semi reported on clang This fixes extra semicolons that clang 3.7 on GNU/Linux warns about. These were triggered when built under -Wpedantic, which essentially checks for strict ISO compliance in numerous ways. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>	9 years ago
Ganesh Ajjanagadde	4f90818ea1	avcodec/x86/rv40dsp_init: silence -Wunused-variable on --disable-mmx This silences -Wunused-variable when compiled with --disable-mmx, e.g. http://fate.ffmpeg.org/log.cgi?time=20150919094617&log=compile&slot=x86_64-archlinux-gcc-disable-mmx. The alternative of header guards will make it far too ugly. Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Christophe Gisquet	238db7cc56	x86: lavc: use LOCAL_ALIGNED instead of DECLARE_ALIGNED The latter may yield incorrect code for on-stack variables. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Diego Biurrun	7fb993d338	qpeldsp: Mark source pointer in qpel_mc_func function pointer const	10 years ago
Michael Niedermayer	33f83a2157	avcodec/x86/rv40dsp_init: fix () in macros Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	86ae0da60c	x86: hpeldsp: propagate changes across codecs Some codecs still use mmx versions, so have them use the versions with newer instruction sets. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Diego Biurrun	82dd1026cf	x86: dsputil: Move hpeldsp-related declarations to a separate header	11 years ago
Diego Biurrun	e998b56362	x86: avcodec: Consistently structure CPU extension initialization	11 years ago
Diego Biurrun	cd52917237	x86: rv40dsp: Move inline assembly optimizations out of YASM init section	11 years ago
Diego Biurrun	3ac7fa81b2	Consistently use "cpu_flags" as variable/parameter name for CPU flags	11 years ago
Diego Biurrun	1399931d07	x86: dsputil: Rename dsputil_mmx.h --> dsputil_x86.h The header is not (anymore) MMX-specific.	12 years ago
Diego Biurrun	63bac48f73	x86: dsputil: Move rv40-specific functions where they belong	12 years ago
Luca Barbato	a8b6015823	dsputil: convert remaining functions to use ptrdiff_t strides Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	12 years ago
Diego Biurrun	82bd04b170	rv34: Drop now unnecessary dsputil dependencies	12 years ago
Diego Biurrun	c9f933b5b6	Add av_cold attributes to arch-specific init functions	12 years ago
Diego Biurrun	26301caaa1	x86: mmx2 ---> mmxext in asm constructs	12 years ago
Janne Grunau	f101eab1be	x86: call most of the x86 dsp init functions under if (ARCH_X86) Rename the called dsp init functions to *_init_x86.	12 years ago
Diego Biurrun	e0c6cce447	x86: Replace checks for CPU extensions and flags by convenience macros This separates code relying on inline from that relying on external assembly and fixes instances where the coalesced check was incorrect.	12 years ago
Diego Biurrun	ec36aa6944	x86: Fix linking with some or all of yasm, mmx, optimizations disabled Some optimized template functions reference optimized symbols, so they must be explicitly disabled when those symbols are unavailable.	12 years ago
Diego Biurrun	a886b279a0	x86: cosmetics: Comment some #endifs for better readability	12 years ago
Martin Storsjö	1d9c2dc89a	Don't include common.h from avutil.h Signed-off-by: Martin Storsjö <martin@martin.st>	12 years ago
Diego Biurrun	239fdf1b4a	x86: build: replace mmx2 by mmxext Refactoring mmx2/mmxext YASM code with cpuflags will force renames. So switching to a consistent naming scheme beforehand is sensible. The name "mmxext" is more official and widespread and also the name of the CPU flag, as reported e.g. by the Linux kernel.	12 years ago
Ronald S. Bultje	79195ce565	x86/dsputil: put inline asm under HAVE_INLINE_ASM. This allows compiling with compilers that don't support gcc-style inline assembly. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	12 years ago
Diego Biurrun	a5a93fa8f5	cosmetics: do not use full path for local headers	13 years ago
Michael Niedermayer	3b196bb737	libavcodec/x86/rv40dsp_init.c: add missing HAVE_YASM Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Kostylev	6797d1948b	x86: rv40: Mark rv40_weight functions as MMX2; they use MMX2 instructions.	13 years ago
Christophe Gisquet	110d0cdc9d	rv40dsp x86: MMX/MMX2/3DNow/SSE2/SSSE3 implementations of MC Code mostly inspired by vp8's MC, however: - its MMX2 horizontal filter is worse because it can't take advantage of the coefficient redundancy; - that same coefficient redundancy allows better code for non-SSSE3 versions. Benchmark (rounded to tens of units), columns V8x8 / H8x8 / 2D8x8 / V16x16 / H16x16 / 2D16x16: C 445 / 358 / 985 / 1785 / 1559 / 3280; MMX* 219 / 271 / 478 / 714 / 929 / 1443; SSE2 131 / 158 / 294 / 425 / 515 / 892; SSSE3 120 / 122 / 248 / 387 / 390 / 763. End result is overall around a 15% speedup for SSSE3 version (on 6 sequences); all loop filter functions now take around 55% of decoding time, while luma MC dsp functions are around 6%, chroma ones are 1.3% and biweight around 2.3%. Signed-off-by: Diego Biurrun <diego@biurrun.de>	13 years ago
Christophe GISQUET	272b252c01	rv40dsp: implement prescaled versions for biweight. Quite often, the original weights are multiples of 512. By prescaling them by 1/512 when they are computed (once per frame), no intermediate shifting is needed, and no prescaling on each call either. The x86 code already used that trick. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago
Ronald S. Bultje	3ab9a2a557	rv34: change most "int stride" into "ptrdiff_t stride". This prevents having to sign-extend on 64-bit systems with 32-bit ints, such as x86-64. Also fixes crashes on systems where we don't do it and arguments are not in registers, such as Win64 for all weight functions.	13 years ago
Christophe Gisquet	e5c9de2ab7	rv40: x86 SIMD for biweight Provide MMX, SSE2 and SSSE3 versions, with a fast-path when the weights are multiples of 512 (which is often the case when the values round up nicely). *_TIMER report for the 16x16 and 8x8 cases: C: 9015 decicycles in 16, 524257 runs, 31 skips; 2656 decicycles in 8, 524271 runs, 17 skips. MMX: 4156 decicycles in 16, 262090 runs, 54 skips; 1206 decicycles in 8, 262131 runs, 13 skips. MMX on fast-path: 2760 decicycles in 16, 524222 runs, 66 skips; 995 decicycles in 8, 524252 runs, 36 skips. SSE2: 2163 decicycles in 16, 262131 runs, 13 skips; 832 decicycles in 8, 262137 runs, 7 skips. SSE2 with fast path: 1783 decicycles in 16, 524276 runs, 12 skips; 711 decicycles in 8, 524283 runs, 5 skips. SSSE3: 2117 decicycles in 16, 262136 runs, 8 skips; 814 decicycles in 8, 262143 runs, 1 skips. SSSE3 with fast path: 1315 decicycles in 16, 524285 runs, 3 skips; 578 decicycles in 8, 524286 runs, 2 skips. This means around a 4% speedup for some sequences. Signed-off-by: Diego Biurrun <diego@biurrun.de>	13 years ago
Diego Biurrun	91bafb52ae	x86: Give RV40 init file a more suitable name.	13 years ago
Diego Biurrun	c30b198381	x86: Place mm_flags variable declaration below the appropriate #ifdef. This fixes some unused variable warnings with YASM disabled.	13 years ago
Kostya Shishkov	d241f51e0f	Move RV3/4-specific DSP functions into their own context Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>	13 years ago

52 Commits (5659f7404731415c7e1cfdf4d8b0afeb6b1132de)