FFmpeg

Commit Graph

Author	SHA1	Message	Date
Diego Biurrun	b89804da9b	x86: videodsp: Add parentheses to expression to work around warning libavcodec/x86/videodsp.asm:128: warning: signed dword value exceeds bounds	8 years ago
Rostislav Pehlivanov	d2ae5f77c6	aacenc: add SIMD optimizations for abs_pow34 and quantization Performance improvements: quant_bands: with: 681 decicycles in quant_bands, 8388453 runs, 155 skips without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips Around 42% for the function Twoloop coder: abs_pow34: with/without: 7.82s/8.17s Around 4% for the entire encoder Both: with/without: 7.15s/8.17s Around 12% for the entire encoder Fast coder: abs_pow34: with/without: 3.40s/3.77s Around 10% for the entire encoder Both: with/without: 3.02s/3.77s Around 20% faster for the entire encoder Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com> Tested-by: Michael Niedermayer <michael@niedermayer.cc> Reviewed-by: James Almer <jamrial@gmail.com>	8 years ago
Diego Biurrun	6be7944ee2	x86: Add missing colons after assembly labels This fixes many warnings of the sort warning: label alone on a line without a colon might be in error	8 years ago
Alexandra Hájková	112cee0241	hevc: Add SSE2 and AVX IDCT Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Anton Khirnov	e4128c08d7	Revert "hevc: x86: Refactor IDCT macro declarations" This reverts commit `d9dccc0389`. There were outstanding objections to this commit.	8 years ago
Diego Biurrun	5801f9ed24	h264_intrapred: x86: Update comments left behind in `95c89da36e`	8 years ago
Diego Biurrun	d9dccc0389	hevc: x86: Refactor IDCT macro declarations	8 years ago
Ronald S. Bultje	715f139c9b	vp9lpf/x86: make filter_16_h work on 32-bit. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	8915320db9	vp9lpf/x86: make filter_48/84/88_h work on 32-bit. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	725a216481	vp9lpf/x86: make filter_44_h work on 32-bit. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	5bfa96c4b3	vp9lpf/x86: make filter_16_v work on 32-bit. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	b905e8d2fe	vp9lpf/x86: make filter_48/84_v work on 32-bit. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	37637e6590	vp9lpf/x86: make filter_88_v work on 32-bit. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	be10834bd9	vp9lpf/x86: make filter_44_v work on 32-bit. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	7c62891efe	vp9lpf/x86: save one register in SIGN_ADD/SUB. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	c6375a83d1	vp9lpf/x86: store unpacked intermediates for filter6/14 on stack. filter16 goes from 508 to 482 (h) or 346 to 314 (v) cycles; filter88 goes from 240 to 238 (h) or 174 to 165 (v) cycles, measured on TOS. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	4ce8ba72f9	vp9lpf/x86: move variable assigned inside macro branch. The value is not used outside the branch. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	e4961035b2	vp9lpf/x86: simplify ABSSUM_CMP by inverting the comparison meaning. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	683da2788e	vp9lpf/x86: remove unused register from ABSSUB_CMP macro. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	6e74e9636b	vp9lpf/x86: slightly simplify 44/48/84/88 h stores. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	6411c328a2	vp9lpf/x86: make cglobal statement more conservative in register allocation. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Ronald S. Bultje	a6e288d624	vp9lpf/x86: save one register in loopfilter surface coverage. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Clément Bœsch	0ed21bdc9e	vp9lpf/x86: add ff_vp9_loop_filter_[vh]_44_16_{sse2,ssse3,avx}. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Clément Bœsch	f2e3d706a1	vp9lpf/x86: add ff_vp9_loop_filter_h_{48,84}_16_{sse2,ssse3,avx}(). Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
James Almer	92d47550ea	vp9lpf/x86: add an SSE2 version of vp9_loop_filter_[vh]_88_16 Similar gains as the ssse3 version once again Additional improvements by Clément Bœsch <u@pkh.me>. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Clément Bœsch	6bea478158	vp9lpf/x86: add ff_vp9_loop_filter_[vh]_88_16_{ssse3,avx}. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
James Almer	1f451eed60	vp9lpf/x86: add ff_vp9_loop_filter_[vh]_16_16_sse2(). Similar gains in performance as the SSSE3 version Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
Clément Bœsch	a692724c58	vp9lpf/x86: add x86 SSSE3/AVX SIMD for vp9_loop_filter_[vh]_16_16. Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
James Almer	42111e8543	avcodec: fix arguments on xmm/neon clobber test wrappers Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
James Almer	449f263f9f	avcodec: add missing xmm/neon clobber test wrappers for the new encode API Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
Justin Ruggles	b57e38f52c	ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm Adds a wrapper function for downmixing which detects channel count changes and updates the selected downmix function accordingly. Simplification and porting to current x86inc infrastructure by Diego Biurrun. Signed-off-by: Diego Biurrun <diego@biurrun.de>	8 years ago
Justin Ruggles	43717469f9	ac3dsp: Reverse matrix in/out order in downmix() Also use (float *) instead of (float ()[2]). This matches the matrix layout in libavresample so we can reuse assembly code between the two. Signed-off-by: Diego Biurrun <diego@biurrun.de>	8 years ago
Hendrik Leppkes	8d1267932c	x86/h264_weight: use appropriate register size for weight parameters This fixes decoding corruption on 64 bit windows. Signed-off-by: Martin Storsjö <martin@martin.st>	8 years ago
Diego Biurrun	2caa93b813	mpegaudiodsp: Change type of array stride parameters to ptrdiff_t This avoids SIMD-optimized functions having to sign-extend their stride argument manually to be able to do pointer arithmetic.	8 years ago
Diego Biurrun	e4a94d8b36	h264chroma: Change type of stride parameters to ptrdiff_t This avoids SIMD-optimized functions having to sign-extend their stride argument manually to be able to do pointer arithmetic.	8 years ago
Diego Biurrun	2ec9fa5ec6	idct: Change type of array stride parameters to ptrdiff_t ptrdiff_t is the correct type for array strides and similar.	8 years ago
Diego Biurrun	009adfd4fb	x86: fpel: Remove unnecessary sign extend	8 years ago
Anton Khirnov	de2ae3c1fa	lavc: add clobber tests for the new encoding/decoding API	9 years ago
Hendrik Leppkes	5ae0ad001a	x86/h264_weight: use appropriate register size for weight parameters Fixes trac 5579 Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Acked-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Michael Niedermayer	bc26fe8927	avcodec/h264: Use ptrdiff_t for (bi)weight functions Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Anton Khirnov	12004a9a7f	audiodsp/x86: yasmify vector_clipf_sse	9 years ago
Anton Khirnov	eea9857bfd	blockdsp: drop the high_bit_depth parameter It has no effect, since the code is supposed to operate the same way for any bit depth.	9 years ago
Anton Khirnov	683da86aab	audiodsp: reorder arguments for vector_clipf This will make the x86 asm simpler. ARM conversion by Martin Storsjö <martin@martin.st> and Janne Grunau <janne-libav@jannau.net>	9 years ago
Anton Khirnov	75d98e30af	audiodsp/x86: clear the high bits of the order parameter on 64bit Also change shl to add, since it can be faster on some CPUs. CC: libav-stable@libav.org	9 years ago
Anton Khirnov	1d6c76e11f	audiodsp/x86: fix ff_vector_clip_int32_sse2 This version, which is the only one doing two processing cycles per loop iteration, computes the load/store indices incorrectly for the second cycle. CC: libav-stable@libav.org	9 years ago
Diego Biurrun	de452e5037	pixblockdsp: Change type of stride parameters to ptrdiff_t This avoids SIMD-optimized functions having to sign-extend their line size argument manually to be able to do pointer arithmetic. Also adjust parameter names to be "stride" everywhere.	9 years ago
Diego Biurrun	721d57e608	vp56: Separate VP5 and VP6 dsp initialization VP5 has no arch-specific optimizations (nor will it get some in the future), so it makes no sense to try to share dsp init code with VP6.	9 years ago
Diego Biurrun	3fd22538bc	prores: Change type of stride parameters to ptrdiff_t This avoids SIMD-optimized functions having to sign-extend their line size argument manually to be able to do pointer arithmetic. Also adjust parameter names to be "linesize" everywhere.	9 years ago
Diego Biurrun	f81be06cf6	cavs: Change type of stride parameters to ptrdiff_t ptrdiff_t is the correct type for array strides and similar.	9 years ago
Diego Biurrun	802727b538	vp8: Update some assembly comments left unchanged in `bd66f073fe`	9 years ago

2469 Commits (b0e2e938c31f0dc46d905cb2ea7e904645ca0c19)