FFmpeg

Commit Graph

Author	SHA1	Message	Date
Michael Niedermayer	65e33d8e23	swresample/resample_template: Add filter values in parallel This is faster 2871 -> 2189 cycles for int16 matrixbench -> 23456hz Fixes a integer overflow in a artificial corner case Fixes part of 668007-media Found-by: Matt Wolenetz <wolenetz@google.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	8 years ago
Michael Niedermayer	34db650784	swresample/resample_template: Reorder operations to avoid one addition Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	8 years ago
Muhammad Faiz	b8c6e5a661	swresample: add exact_rational option give high quality resampling as good as with linear_interp=on as fast as without linear_interp=on tested visually with ffplay ffplay -f lavfi "aevalsrc='sin(10000*t*t)', aresample=osr=48000, showcqt=gamma=5" ffplay -f lavfi "aevalsrc='sin(10000*t*t)', aresample=osr=48000:linear_interp=on, showcqt=gamma=5" ffplay -f lavfi "aevalsrc='sin(10000*t*t)', aresample=osr=48000:exact_rational=on, showcqt=gamma=5" slightly speed improvement for fair comparison with -cpuflags 0 audio.wav is ~ 1 hour 44100 stereo 16bit wav file ffmpeg -i audio.wav -af aresample=osr=48000 -f null - old new real 13.498s 13.121s user 13.364s 12.987s sys 0.131s 0.129s linear_interp=on old new real 23.035s 23.050s user 22.907s 22.917s sys 0.119s 0.125s exact_rational=on real 12.418s user 12.298s sys 0.114s possibility to decrease memory usage if soft compensation is ignored Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>	9 years ago
James Almer	43482bd1a5	swr/resample: use av_clip functions Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Michael Niedermayer	0cb95f9082	swresample/resample_template: Add () to protect the arguments of the OUT() macro Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
James Almer	857cd1f33b	swr: initialize only the necessary resample dsp functions Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	23a9edf531	Partially revert "swr: add prototypes for resample dsp functions" Prototypes are not needed anymore now that the x86 functions don't include resample_template.c The DO_RESAMPLE_ONE macro is removed for that same reason as well. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	dd2c9034b1	x86/swr: convert resample_{common, linear}_double_sse2 to yasm Signed-off-by: James Almer <jamrial@gmail.com> 312531 -> 311528 dezicycles Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	847bb638c0	swr: convert resample_common/linear_int16_mmx2/sse2 to yasm. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	418e5768c6	swresample/resample_template: move division out of loop for float/double swri_resample_linear() Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	c5a405c4f0	swresample/resample_template: flip order of operations in swri_resample_linear() for 32bit Fixes integer overflow Found-by: BBB Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	faa1471ffc	swr: rewrite resample_common/linear_float_sse/avx in yasm. Linear interpolation goes from 63 (llvm) or 58 (gcc) to 48 (yasm) cycles/sample on 64bit, or from 66 (llvm/gcc) to 52 (yasm) cycles/sample on 32bit. Non-linear goes from 43 (llvm) or 38 (gcc) to 32 (yasm) cycles/sample on 64bit, or from 46 (llvm) or 44 (gcc) to 38 (yasm) cycles/sample on 32bit (all testing on OSX 10.9.2, llvm 5.1 and gcc 4.8/9). Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	0dae193d3e	swr: remove another forgotten division in DSP function. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	cbf21628a5	swr: remove div/mod from DSP functions. Also fix a bug with resample_compensation resetting dst_incr. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	edf930472b	swr: reindent. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	7f4dfbd080	swr: add prototypes for resample dsp functions Should fix compilation failures with MSVC and any other compiler without inline asm support. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	7128a35f8c	swr: split out DSP functions. DSP bits of swri_resample go into their own mini-DSP functions; DSP init goes from a per-call branch in multiple_resample to a proper DSP init routine; x86 bits go into x86/; swri_resample() moves out of resample_template.c into resample.c because it's independent of DSP code or sample type; multiple_resample() is simplified. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	b785c62681	swr: handle initial negative sample index outside DSP function. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	6b9685de3a	swr: remove unnecessary assignment. I don't see dst_incr/dst_incr_frac ever being changed from their initial value (which is the inverse of this operation), so it seems to me that this is a no-op. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	f341340552	swr: handle 64bit overflow check in multiple_resample(). Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	cdfd9717ed	swr: move compensation_distance handling to swri_resample caller. I think there's an off-by-one in terms of the switchpoint where we switch from dst_incr to ideal_dst_incr, I don't think that's a massive issue, but just be aware of that. It's probably trivial to prevent but I don't care. Signed-off-by: Michael Niedermayer <michaelni@gmx.at> I could not reproduce any off by 1 error, results are bit exact (michael)	11 years ago
Michael Niedermayer	2c23f87c85	swr/resample_template: prevent end_index from overflowing and add check for delta_frac overflow Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Ronald S. Bultje	9b53853756	Rewrite main resampling loop (common and linear). This removes a branch at a performance-sensitive point (in the middle of the loop). In fate-swr-resample-s32p-8000-2626, this makes the code about 10% faster. It also simplifies the loops, allowing us to rewrite it in yasm at some later point. The compensation_distance != 0 code and index < 0 code are still kind of hairy. For compensation_distance != 0, this should likely be handled in the caller, so that it calls swri_resample twice (once until the dst_incr switch-point, and once with the remainder of the samples). For index < 0, the code should probably be rewritten to break out of the loop once sample_index >= 0, and then resume (e.g. as a tail-call) to the common or linear resampling loops. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	a9bf713d35	swresample: add swri_resample_float_avx Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	cdac3ab59f	swresample: add swri_resample_double_sse2 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	2b58c9c945	swresample/resample_template: try to consider src_size more exactly This should avoid slight differences in the output causes by input size alignment differences between archs Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	5e379cd3ee	swresample/resample: simplify index/consumed calculation for the filter = 1 case Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	6c8ee74af2	swresample/resample: Fix fractional part of index in the filter_size = 1 filters = 1 case Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	63dbba655e	swresample/resample: sse float linear interpolation About two times faster Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	fa25c4c400	swresample/resample: mmx2/sse2 int16 linear interpolation About three times faster Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	32291ba6ea	swresample: add swri_resample_float_sse At least two times faster than the C version. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	3d48cbc56c	swresample: reuse COMMON_CORE asm where possible Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	7c8bf09edd	swresample: change COMMON_CORE_INT16 asm from SSSE3 to SSE2 pshuf+paddd is slightly faster than phaddd. The real gain is in pre-ssse3 processors like AMD K8 and K10, which get a big boost in performance compared to the mmxext version Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	b8c55590d5	swr/resample: fix integer overflow, add missing cast The effects of this are limited to numeric errors in the output Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Michael Niedermayer	b6a7f66f93	resample: remove disabled debug code Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Clément Bœsch	8ea8833979	swr/resample: move templating parameters to template itself. It has various benefits such as allowing some refactoring, clarifying the code in the inclusion part, and making the template understandable in standalone. This commit is based on the templating method used by Justin Ruggles for libavresample.	12 years ago
Michael Niedermayer	d53f447130	swr: move if() block into the only branch where it can be true. This should make the code a tiny tiny bit faster. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Michael Niedermayer	17da2d9eee	swr: reorder/redesign operations to avoid integer overflow. This fixes a out of array read. Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	12 years ago
Michael Niedermayer	4ccf6e3971	swr: MMX2 & SSSE3 int16 resample core about 4 times faster Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	0c142e4cda	swr: introduce filter_alloc in preparation of SIMD resample optimisations Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	80e857c967	swr/resample: optimize C code for the most common case 15% speedup Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	6e6dd9995b	resample_template: use av_assert Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago
Michael Niedermayer	7f1ae79d38	swr: support float & int32 in the resampler Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	13 years ago

43 Commits (65892516d52c268bd66ef825c4b1c8050a69d732)