FFmpeg


Author	SHA1	Message	Date
Andreas Rheinhardt	262e7439c6	avcodec/x86/Makefile: Don't build empty files simple_idct.asm is 32 bit-only since `bfb28b5ce8`, whereas simple_idct10.asm is x64-only. So don't build the ultimately unneeded and empty files, as some linkers complain about this: "ranlib: file: libavcodec/libavcodec.a(simple_idct.o) has no symbols" (this is from an Xcode toolchain as reported by Ronald S. Bultje). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Lynne	b85e106d5f	libavcodec: remove mdct15 It's not needed nor used by anything anymore, lavu/tx is faster, and better in every way. RIP.	2 years ago
Andreas Rheinhardt	4209216ee8	avcodec/mpegvideodsp: Make MpegVideoDSP MPEG-4 only It is only used by gmc/gmc1 which is only used by the MPEG-4 decoder, so move it to Mpeg4DecContext and rename it to Mpeg4VideoDSP. Also compile it iff the MPEG-4 decoder is compiled. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2 years ago
Lynne	3ade6a8644	x86/lpc: implement a new Welch windowing function Old one was written with the assumption only even inputs would be given. This very messy replacement supports even and odd inputs, and supports AVX2 for extra speed. The buffers given are usually quite big (4k samples), so the speedup is worth it. The new SSE version is still faster than the old inline asm version by 33%. Also checkasm is provided to make sure this monstrosity works. This fixes some FATE tests.	2 years ago
Andreas Rheinhardt	6c4595190e	avcodec/flacdsp: Split encoder-only parts into a ctx of its own Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Paul B Mahol	b69c91bbee	avcodec/x86: add cfhdenc SIMD	4 years ago
Paul B Mahol	389cc142fb	avcodec/cfhd: add x86 SIMD Overall speed changes for 1920x1080, yuv422p10le, 60fps from: 0.19x to 0.343x	5 years ago
James Almer	58d167bcd5	avcodec/Makefile: add missing pngdsp dependency to the lscr decoder Signed-off-by: James Almer <jamrial@gmail.com>	6 years ago
Lynne	605e330310	x86/opusdsp: implement FMA3 accelerated postfilter and deemphasis 58893 decicycles in deemphasis_c, 130548 runs, 524 skips 9475 decicycles in deemphasis_fma3, 130686 runs, 386 skips -> 6.21x speedup 24866 decicycles in postfilter_c, 65386 runs, 150 skips 5268 decicycles in postfilter_fma3, 65505 runs, 31 skips -> 4.72x speedup Total decoder speedup: ~14% Deemphasis SIMD based on the following unrolling: const float c1 = CELT_EMPH_COEFF, c2 = c1*c1, c3 = c2*c1, c4 = c3*c1; float state = coeff; for (int i = 0; i < len; i += 4) { y[0] = x[0] + c1*state; y[1] = x[1] + c2*state + c1*x[0]; y[2] = x[2] + c3*state + c1*x[1] + c2*x[0]; y[3] = x[3] + c4*state + c1*x[2] + c2*x[1] + c3*x[0]; state = y[3]; y += 4; x += 4; }	6 years ago
Lynne	5468c1d075	celt_pvq_init: only build when CONFIG_OPUS_ENCODER is enabled The entire function was defined away before.	6 years ago
Lynne	4a2c651620	x86/opus_dsp: rename to celt_pvq Its only used in the encoder and in CELT's PVQ.	6 years ago
Aurelien Jacobs	f1e490b1ad	sbcenc: add MMX optimizations This was originally based on libsbc, and was fully integrated into ffmpeg. Rough speed test: C version: speed= 592x MMX version: speed= 785x	7 years ago
Martin Vignali	9b8c1224d7	libavcodec/exr : add X86 SIMD for reorder_pixels Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
Ivan Kalvachev	7205513f8f	SIMD opus pvq_search implementation Explanation on the workings and methods used by the Pyramid Vector Quantization Search function could be found in the following Work-In-Progress mail threads: http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212146.html http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212816.html http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213030.html http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213436.html Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com>	8 years ago
Paul B Mahol	4ed7c2bbc3	avcodec/utvideodec: add SIMD for restore_rgb_planes Signed-off-by: Paul B Mahol <onemda@gmail.com>	8 years ago
Rostislav Pehlivanov	e1120b1c54	mdct15: add assembly optimizations for the 15-point FFT c: 1802 decicycles in fft15,16774635 runs, 2581 skips avx: 865 decicycles in fft15,16776378 runs, 838 skips Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>	8 years ago
Diego Biurrun	fd502f4f5f	build: Generalize yasm/nasm-related variable names None of them are specific to the YASM assembler. (Cherry-picked from libav commit `39e208f4d4`) Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
James Darnley	8e89f6fd37	avcodec/x86: move simple_idct to external assembly	8 years ago
Ronald S. Bultje	c9d98c5649	cavs: convert idct from inline asm to yasm.	8 years ago
Clément Bœsch	40ac226014	lavc/x86/hevc: rename hevc_res_add to hevc_add_res This will simplify incoming merge.	8 years ago
Diego Biurrun	39e208f4d4	build: Generalize yasm/nasm-related variable names None of them are specific to the YASM assembler.	8 years ago
James Almer	cf9ef83960	huffyuvencdsp: move shared functions to a new lossless_videoencdsp context Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
Pierre Edouard Lepere	6d5636ad9a	hevc: x86: Add add_residual() SIMD optimizations Initially written by Pierre Edouard Lepere <Pierre-Edouard.Lepere@insa-rennes.fr>, extended by James Almer <jamrial@gmail.com>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>	8 years ago
Rostislav Pehlivanov	d2ae5f77c6	aacenc: add SIMD optimizations for abs_pow34 and quantization Performance improvements: quant_bands: with: 681 decicycles in quant_bands, 8388453 runs, 155 skips without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips Around 42% for the function Twoloop coder: abs_pow34: with/without: 7.82s/8.17s Around 4% for the entire encoder Both: with/without: 7.15s/8.17s Around 12% for the entire encoder Fast coder: abs_pow34: with/without: 3.40s/3.77s Around 10% for the entire encoder Both: with/without: 3.02s/3.77s Around 20% faster for the entire encoder Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com> Tested-by: Michael Niedermayer <michael@niedermayer.cc> Reviewed-by: James Almer <jamrial@gmail.com>	8 years ago
Clément Bœsch	a692724c58	vp9lpf/x86: add x86 SSSE3/AVX SIMD for vp9_loop_filter_[vh]_16_16. Signed-off-by: Anton Khirnov <anton@khirnov.net>	9 years ago
Justin Ruggles	b57e38f52c	ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm Adds a wrapper function for downmixing which detects channel count changes and updates the selected downmix function accordingly. Simplification and porting to current x86inc infrastructure by Diego Biurrun. Signed-off-by: Diego Biurrun <diego@biurrun.de>	9 years ago
Anton Khirnov	12004a9a7f	audiodsp/x86: yasmify vector_clipf_sse	9 years ago
Anton Khirnov	89466de4ae	vp9/x86: rename vp9dsp to vp9mc It only contains the MC SIMD, other SIMD will go into different files.	9 years ago
James Almer	efc9d5c4bc	x86/ttaenc: add ff_ttaenc_filter_process_{ssse3,sse4} Signed-off-by: James Almer <jamrial@gmail.com>	9 years ago
Diego Biurrun	1dfc3cf89d	x86: hpeldsp: Split off VP3-specific bits into a separate file	9 years ago
James Almer	fca3c3b619	hevc: Add AVX2 DC IDCT Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>. Integrated to Libav by Josh de Kock <josh@itanimul.li>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>	9 years ago
Diego Biurrun	01621202aa	build: miscellaneous cosmetics Restore alphabetical order in lists, break overly long lines, do some prettyprinting, add some explanatory section comments, group parts together that belong together logically.	9 years ago
Diego Biurrun	1a094af638	fft: Split MDCT bits off from FFT	9 years ago
Timothy Gu	e3461197b1	x86/vc1dsp: Split the file into MC and loopfilter	9 years ago
Diego Biurrun	15a24614ae	build: Add vc1dsp component for more fine-grained dependencies	9 years ago
James Almer	8ae7447941	x86/dcadec: add ff_lfe_fir0_float_{sse,sse2,avx,fma3} Up to ~4 times faster on x86_64, ~8 times on x86_32 if compiling using x87 fp math. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	9 years ago
Timothy Gu	9fd6ea933f	dirac_dwt: Make x86 files/functions names consistent	9 years ago
Timothy Gu	17ab8f7e68	diracdsp: Make x86 files/functions names consistent	9 years ago
foo86	ae5b2c5250	avcodec/dca: add new decoder based on libdcadec	9 years ago
foo86	4608996772	avcodec/dca: remove old decoder Remove all files and functions which are not going to be reused, and disable all functions and FATE tests temporarily which will be.	9 years ago
James Almer	209f50e16b	avcodec/synth_filter: split off remaining code from dcadec files Signed-off-by: James Almer <jamrial@gmail.com>	9 years ago
Diego Biurrun	03ef89faf2	x86: build: Group all encoder objects together	9 years ago
Anton Khirnov	e7078e842d	hevcdsp: add x86 SIMD for MC	9 years ago
James Almer	73353af6e5	x86/Makefile: move decoder/encoder objects out of the subsystems section Signed-off-by: James Almer <jamrial@gmail.com>	9 years ago
Timothy Gu	6b41b44149	huffyuvencdsp: Convert ff_diff_bytes_mmx to yasm Heavily based upon ff_add_bytes by Christophe Gisquet. Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Timothy Gu <timothygu99@gmail.com>	9 years ago
Ronald S. Bultje	1c3be32533	vp9: add 10/12bpp mmxext-optimized iwht_iwht_4x4 function.	9 years ago
Christophe Gisquet	4369b9dc7b	x86: simple_idct(_put): 10bits versions Modeled from the prores version. Clips to [0;1023] and is bitexact. Bitexactness requires to add offsets in different places compared to prores or C, and makes the function approximately 2% slower. For 16 frames of a DNxHD 4:2:2 10bits test sequence: C: 60861 decicycles in idct, 1048205 runs, 371 skips sse2: 27567 decicycles in idct, 1048216 runs, 360 skips avx: 26272 decicycles in idct, 1048171 runs, 405 skips The add version is not implemented, so the corresponding dsp function is set to NULL to make it clear in a code executing it. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Paul B Mahol	35af7add6f	avcodec/takdec: add x86 SIMD for rest of decorrelation modes Signed-off-by: Paul B Mahol <onemda@gmail.com>	10 years ago
James Almer	72254b19b8	x86/alacdsp: add simd optimized functions Signed-off-by: James Almer <jamrial@gmail.com>	10 years ago
Ronald S. Bultje	26ece7a511	vp9: 16bpp tm/dc/h/v intra pred simd (mostly sse2) functions.	10 years ago


332 Commits (5d97ba5d9c01e478fce2159af49552e9fffde1c0)