FFmpeg

Commit Graph

Author	SHA1	Message	Date
Christopher Degawa	182663a58a	get_cabac_inline_x86: Don't inline the assembly function on 32 bit While the inline cabac assembly has worked correctly in i386 builds historically, modern compiler updates have started showing issues with it when the function gets inlined into larger contexts that fail to provide the number of free registers this function requires. This was an issue with Clang on Windows on i386, which was fixed in c6d284b945324a7bc70ea8b9056040c8148aa835. However, recently the same issues have also started showing up with GCC (both for Windows and Linux). Whether the issue appears seems dependent on a lot of optimizer tuning (e.g. the issue appears or goes away depending on the combination of -march= and -mtune= options), potentially due to the compiler making different decisions on how much to inline. Fixes: https://trac.ffmpeg.org/ticket/8903 Signed-off-by: Martin Storsjö <martin@martin.st>	2 years ago
Lynne	bbe95f7353	x86: replace explicit REP_RETs with RETs From x86inc: > On AMD cpus <=K10, an ordinary ret is slow if it immediately follows either > a branch or a branch target. So switch to a 2-byte form of ret in that case. > We can automatically detect "follows a branch", but not a branch target. > (SSSE3 is a sufficient condition to know that your cpu doesn't have this problem.) x86inc can automatically determine whether to use REP_RET rather than RET in most of these cases, so impact is minimal. Additionally, a few REP_RETs were used unnecessarily, despite the return being nowhere near a branch. The only CPUs affected were AMD K10s, made between 2007 and 2011, 16 years ago and 12 years ago, respectively. In the future, everyone involved with x86inc should consider dropping REP_RETs altogether.	2 years ago
James Darnley	6af453ca38	avcodec/x86: add avx512icl function for v210dec Ice Lake (Xeon Silver 4316): 2.01x faster (1147±36.8 vs. 571±38.2 decicycles) compared with avx2	2 years ago
James Darnley	f30b4c2f47	avcodec/x86/v210: add some comments to the improved avx2 function	2 years ago
Andreas Rheinhardt	262e7439c6	avcodec/x86/Makefile: Don't build empty files simple_idct.asm is 32 bit-only since `bfb28b5ce8`, whereas simple_idct10.asm is x64-only. So don't build the ultimately unneeded and empty files, as some linkers complain about this: "ranlib: file: libavcodec/libavcodec.a(simple_idct.o) has no symbols" (this is from an Xcode toolchain as reported by Ronald S. Bultje). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
James Darnley	5dfb4f9690	avcodec/x86/v210enc: change '0b' binary constant prefix to 'b' suffix For compatibility with yasm from 0.7.0	2 years ago
James Darnley	690b7890f0	avcodec/x86/v210enc: remove unneeded instruction	2 years ago
James Darnley	c67a2b14a2	avcodec/x86/v210enc: expand and correct comments	2 years ago
James Darnley	651cb867b1	avcodec/v210enc: add new 10-bit function for avx512 avx512icl avx512 on Skylake-X (Xeon D-2123IT): 1.19x faster (970±91.2 vs. 817±104.4 decicycles) compared with avx2 avx512icl on Ice Lake (Xeon Silver 4316): 2.52x faster (1350±5.3 vs. 535±9.5 decicycles) compared with avx2	2 years ago
James Darnley	bda53d2dde	avcodec/x86/v210enc: replace register use with named register	2 years ago
Andreas Rheinhardt	4228f8ad49	avcodec/x86/cavsdsp: Remove unused 3DNow-macro Forgotten in `3221aba879`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Lynne	b85e106d5f	libavcodec: remove mdct15 It's not needed nor used by anything anymore, lavu/tx is faster, and better in every way. RIP.	2 years ago
Lynne	e0661fc805	dca_core: convert to lavu/tx Thanks to Martin Storsjö <martin@martin.st> for fixing and testing the arm32 and aarch64 changes.	2 years ago
James Darnley	c3d36e1b3d	avcodec/v210enc: add new function for avx2 avx512 avx512icl Negligible speed difference for avx2 on Zen 2 (Ryzen 5700X) and Broadwell (Xeon E5-2620 v4): 1690±4.3 decicycles vs. 1693±78.4; 1439±31.1 decicycles vs. 1429±16.7. Moderate speedup with avx512 on Skylake-X (Xeon D-2123IT): 1.22x faster (793±0.8 vs. 649±5.5 decicycles) compared with avx2. Better speedup with avx512icl on Ice Lake (Xeon Silver 4316): 1.77x faster (784±1.8 vs. 442±11.6 decicycles) compared with avx2. Co-authors: Henrik Gramner <henrik@gramner.com> Kieran Kunhya <kierank@obe.tv>	2 years ago
Andreas Rheinhardt	4209216ee8	avcodec/mpegvideodsp: Make MpegVideoDSP MPEG-4 only It is only used by gmc/gmc1 which is only used by the MPEG-4 decoder, so move it to Mpeg4DecContext and rename it to Mpeg4VideoDSP. Also compile it iff the MPEG-4 decoder is compiled. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Andreas Rheinhardt	e84348a8ab	avcodec/svq1enc: Add SVQ1EncDSPContext, make codec context private Currently, SVQ1EncContext is defined in a header that is also included by the arch-specific code that initializes the one and only dsp function that this encoder uses directly. But the arch-specific functions to set this dsp function do not need anything from SVQ1EncContext. This commit therefore adds a small SVQ1EncDSPContext whose only member is said function pointer and renames svq1enc.h to svq1encdsp.h to avoid exposing unnecessary internals to these init functions (and the whole mpegvideo with it). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Carl Eugen Hoyos	60e87faf7f	lavc/x86/simple_idct: Fix linking shared libavcodec with MS link.exe link.exe hangs on empty simple_idct.o Fixes ticket #9909.	2 years ago
Andreas Rheinhardt	1741adb1c7	avcodec/huffyuvencdsp: Pass pix_fmt directly when initing dsp It is the only thing that is actually used. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Andreas Rheinhardt	76d8f0dd14	avcodec/ac3dsp: Remove unused parameter Forgotten in `fd98594a88`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Andreas Rheinhardt	4393331250	avcodec/dirac_dwt: Avoid conversions between function pointers and void* Pointers to void can be converted to any pointer to incomplete or object type and back; but they are nevertheless not completely generic pointers: There is no provision in the C standard that guarantees their convertibility with function pointers. C90 lacks a generic function pointer, C99 made every function pointer a generic function pointer and still disallows the convertibility with void *. Both GCC as well as Clang warn about this when using -pedantic. Therefore use unions to avoid these conversions. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
James Almer	0922c6b01b	x86/lpc: use fused negative multiply-add instructions where useful Signed-off-by: James Almer <jamrial@gmail.com>	2 years ago
James Almer	0627e6d74c	avcodec/lpc: zero the middle odd sample in the output Signed-off-by: James Almer <jamrial@gmail.com>	2 years ago
James Almer	c8c4a162fc	avcodec/lpc: use ptrdiff_t for length parameters Signed-off-by: James Almer <jamrial@gmail.com>	2 years ago
James Almer	48615f0a78	x86/aacpsdsp: add ps_hybrid_analysis_fma3 This replaces the sse3 version, which was not really faster than the sse one. Signed-off-by: James Almer <jamrial@gmail.com>	2 years ago
James Almer	2bcf86d53d	x86/aacpsdsp: precompute constant factors Inspired by the optimization done to the C version by Rémi Denis-Courmont. Signed-off-by: James Almer <jamrial@gmail.com>	2 years ago
Martin Storsjö	c9aa6164d4	x86/lpc: Fix parameter sign extension, unbreaking checkasm-lpc on x86_64 windows Signed-off-by: Martin Storsjö <martin@martin.st>	2 years ago
Lynne	b67776e12f	x86/lpc: fix even scalar loop overreads/writes Passes checkasm with valgrind, tested to sizes of more than 4000 samples.	2 years ago
Lynne	dea944b838	x86/lpc: fix odd scalar loop overreads/writes	2 years ago
Andreas Rheinhardt	9beba05311	avcodec/fmtconvert: Remove unused AVCodecContext parameter Unused since `d74a8cb7e4`. Reviewed-by: Rémi Denis-Courmont <remi@remlab.net> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Andreas Rheinhardt	fd72d8aea3	avcodec/blockdsp: Remove unused AVCodecContext parameter Possible since `be95df12bb`. Reviewed-by: Rémi Denis-Courmont <remi@remlab.net> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Andreas Rheinhardt	57f3ca20dc	avcodec/cavsdsp: Remove unused function parameter Reviewed-by: Rémi Denis-Courmont <remi@remlab.net> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Lynne	3ade6a8644	x86/lpc: implement a new Welch windowing function Old one was written with the assumption only even inputs would be given. This very messy replacement supports even and odd inputs, and supports AVX2 for extra speed. The buffers given are usually quite big (4k samples), so the speedup is worth it. The new SSE version is still faster than the old inline asm version by 33%. Also checkasm is provided to make sure this monstrosity works. This fixes some FATE tests.	2 years ago
Rémi Denis-Courmont	b52034270a	lavc/vorbisdsp: use ptrdiff_t rather than intptr_t ... for a difference between pointers.	2 years ago
Paul B Mahol	37a503ac87	avcodec/x86/audiodsp: add scalarproduct avx2	2 years ago
Andreas Rheinhardt	a54e53a1c4	avcodec/vp8dsp: Constify src in vp8_mc_func Reviewed-by: Peter Ross <pross@xvid.org> Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Andreas Rheinhardt	ad12e31b03	avcodec/x86/flacdsp_init: Remove double ';' Inside a function, the second ';' in ";;" is just a null statement, but it is actually illegal outside of functions. Compilers nevertheless accept it without warning, except when in -pedantic mode when e.g. Clang emits a -Wextra-semi warning. Therefore remove the unnecessary ';'. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Paul B Mahol	1e202d89c9	avcodec/x86/flacdsp: fix bug in decorrelation Fixes #9297	2 years ago
Andreas Rheinhardt	0bb0c26799	avutil/mem_internal: Fix headers Including avassert.h is unnecessary since commit `786be70e28`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Martin Storsjö	dc55e63578	x86: Don't hardcode the height to 8 in sad8_xy2_mmx The height is hardcoded in some of the me_cmp functions, but not in all of them. But in the case of all other functions, it's hardcoded in the same place in SIMD functions as in the C reference functions, while this one function differs from the behaviour of the C code. (Before `542765ce3e`, there were a couple other sad8_*_mmx functions with similar hardcoded height.) Signed-off-by: Martin Storsjö <martin@martin.st>	3 years ago
Andreas Rheinhardt	6c4595190e	avcodec/flacdsp: Split encoder-only parts into a ctx of its own Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	3a869cd5cd	avcodec/flacdsp: Remove unused function parameter Forgotten in `e609cfd697`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	333b32af8e	avcodec/h264chroma: Constify src in h264_chroma_mc_func Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	b3bbbb14d0	avcodec/hevcdsp: Constify src pointers Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	209a11053f	avcodec/mpegvideodsp: Constify src pointers Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	966fc1230a	avcodec/mpegvideoencdsp: Allow pointers to const where possible Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	abb85429f3	avcodec/me_cmp: Constify me_cmp_func buffer parameters Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	e7cb7c762a	avcodec/cfhdencdsp: Constify input pointers Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	dc3e25e4d3	avcodec/lossless_videoencdsp: Constify src sub_left_predict Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	af43da3e4d	avcodec/videodsp: Constify buf in VideoDSPContext.prefetch Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	7ab9b30800	avcodec/vp56: Move VP5-9 range coder functions to a header of their own Also use a vpx prefix for them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago

2593 Commits (dd7e30724b739af9642917b1d04ba56d12e5e13f)