In some cases, unusual widths are handled by performing 2 or 3 calls
to a function for a single smaller width. Instead, perform 2 calls to
functions for different widths to split the workload.
The 8+16 and 4+8 splits (for 8-bit and for more-than-8-bit depths,
respectively) can't be processed that way without modifications: some
of the calls would use unaligned buffers, and adding branches to
handle this showed no micro-benchmark benefit.
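As a hedged illustration of the split (not FFmpeg's actual code; the
helper names are hypothetical stand-ins for the width-specific
assembly functions), a C sketch of handling block_w == 12 as one
8-wide call plus one 4-wide call:

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-ins for the width-specific functions; the
     * real code dispatches to assembly. Names are illustrative only. */
    static void put_pixels_w4(uint8_t *dst, ptrdiff_t dststride,
                              const uint8_t *src, ptrdiff_t srcstride, int h)
    {
        for (int y = 0; y < h; y++)
            for (int x = 0; x < 4; x++)
                dst[y * dststride + x] = src[y * srcstride + x];
    }

    static void put_pixels_w8(uint8_t *dst, ptrdiff_t dststride,
                              const uint8_t *src, ptrdiff_t srcstride, int h)
    {
        for (int y = 0; y < h; y++)
            for (int x = 0; x < 8; x++)
                dst[y * dststride + x] = src[y * srcstride + x];
    }

    /* block_w == 12: one 8-wide call plus one 4-wide call on the
     * rightmost part, instead of three 4-wide calls. */
    static void put_pixels_w12(uint8_t *dst, ptrdiff_t dststride,
                               const uint8_t *src, ptrdiff_t srcstride, int h)
    {
        put_pixels_w8(dst,     dststride, src,     srcstride, h);
        put_pixels_w4(dst + 8, dststride, src + 8, srcstride, h);
    }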
For block_w == 12 (around 1% of the pixels of the sequence):
Before:
12758 decicycles in epel_uni, 4093 runs, 3 skips
19389 decicycles in qpel_uni, 8187 runs, 5 skips
22699 decicycles in epel_bi, 32743 runs, 25 skips
34736 decicycles in qpel_bi, 32733 runs, 35 skips
After:
11929 decicycles in epel_uni, 4096 runs, 0 skips
18131 decicycles in qpel_uni, 8184 runs, 8 skips
20065 decicycles in epel_bi, 32750 runs, 18 skips
31458 decicycles in qpel_bi, 32753 runs, 15 skips
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* Reduced the xmm register count to 7 (as such, these functions are now enabled for x86_32).
* Removed four movdqa (affects the sse2 version only).
* pxor is now used to clear m0 only once.
~5% faster.
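The pxor point amounts to hoisting the zero register's initialization
out of the per-row work. A hedged C/SSE2-intrinsics sketch of that
pattern, with a hypothetical add_residual_w8 helper (the actual change
is in the assembly, where m0 is cleared by a single pxor up front):

    #include <emmintrin.h> /* SSE2 */
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical helper: add an 8-wide block of 16-bit residuals to
     * 8-bit pixels. The zero vector (the asm's m0) is initialized once
     * before the loop rather than once per row. */
    static void add_residual_w8(uint8_t *dst, ptrdiff_t stride,
                                const int16_t *res, int h)
    {
        const __m128i zero = _mm_setzero_si128(); /* one pxor, hoisted */

        for (int y = 0; y < h; y++) {
            __m128i p = _mm_loadl_epi64((const __m128i *)dst);
            __m128i r = _mm_loadu_si128((const __m128i *)(res + y * 8));
            p = _mm_unpacklo_epi8(p, zero); /* widen u8 -> s16 */
            p = _mm_add_epi16(p, r);        /* add the residual */
            p = _mm_packus_epi16(p, p);     /* clamp back to u8 */
            _mm_storel_epi64((__m128i *)dst, p);
            dst += stride;
        }
    }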
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Only 8-bit and 10-bit idct_dc() functions are included (adding others should be trivial); a sketch of the idea follows the benchmarks.
Benchmarks on an Intel Core i5-4200U:
idct8x8_dc
          SSE2  MMXEXT     C
cycles      22      26    57

idct16x16_dc
          AVX2    SSE2     C
cycles      27      32   249

idct32x32_dc
          AVX2    SSE2     C
cycles      62     126  1375
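The large speedups follow from what a DC-only IDCT does: compute one
final DC value and fill the whole coefficient block with it, which
vectorizes as a broadcast plus wide stores. A hedged C/SSE2 sketch for
the 8-bit 8x8 case; the rounding below is an assumption modeled on the
C template in hevcdsp_template.c, not copied from it:

    #include <emmintrin.h> /* SSE2 */
    #include <stdint.h>

    /* 8-bit sketch; shift = 14 - bit_depth = 6. Treat the exact
     * rounding as an assumption. */
    static void idct_8x8_dc_8(int16_t *coeffs)
    {
        const int shift  = 6;
        const int add    = 1 << (shift - 1);
        const int16_t dc = (((coeffs[0] + 1) >> 1) + add) >> shift;

        /* Broadcast the DC value and fill all 64 coefficients. */
        const __m128i v = _mm_set1_epi16(dc);
        for (int i = 0; i < 8 * 8; i += 8)
            _mm_storeu_si128((__m128i *)(coeffs + i), v);
    }

For 16x16 and 32x32 the loop simply covers more stores, which is why
the gap to the C version widens with block size.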
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Additional contributions by James Almer <jamrial@gmail.com>,
Carl Eugen Hoyos <cehoyos@ag.or.at>, Fiona Glaser <fiona@x264.com> and
Anton Khirnov <anton@khirnov.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>