FFmpeg

Commit Graph

Author	SHA1	Message	Date
Anton Khirnov	e4601cc339	lavc/hevc*: move to hevc/ subdir	9 months ago
Andreas Rheinhardt	39b4b5aad7	avcodec: Remove superfluous ';' outside of functions Inside a function an extra ';' is a null statement; but outside of it it simply must not happen. So remove them. Reviewed-by: Nuo Mi <nuomi2021@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	1 year ago
Andreas Rheinhardt	6106fb2b4c	avcodec/hevcdsp: Offset ff_hevc_.pel_filters to simplify addressing Besides simplifying address computations (it saves 432B of .text in hevcdsp.o alone here) it also fixes undefined behaviour that occurs if mx or my are 0 (happens when the filters are unused) because they lead to an array index of -1 in the old code. This happens in the checkasm-hevc_pel FATE-test. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Reviewed-by: Nuo Mi <nuomi2021@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	1 year ago
James Almer	2dc8221e66	x86/hevcdsp_init.c: fix preprocessor check HAVE_AVX2_EXTERNAL has a value, so check for it. Signed-off-by: James Almer <jamrial@gmail.com>	1 year ago
Wu Jianhua	fc5ff6b0b8	avcodec/x86/h26x/h2656_inter: add dststride to put Signed-off-by: Wu Jianhua <toqsxw@outlook.com>	1 year ago
Wu Jianhua	7d9f1f5485	avcodec/x86/hevc_mc: move put/put_uni to h26x/h2656_inter.asm This enables the asm optimization to be reused by VVC Signed-off-by: Wu Jianhua <toqsxw@outlook.com>	1 year ago
Andreas Rheinhardt	b3bbbb14d0	avcodec/hevcdsp: Constify src pointers Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	338f8fd232	avcodec/x86/hevcdsp_init: Remove obsolete MMXEXT functions x64 always has MMX, MMXEXT, SSE and SSE2 and this means that some functions for MMX, MMXEXT and 3dnow are always overridden by other functions (unless one e.g. explicitly disables SSE2) for x64. So given that the only systems that benefit from these functions are truly ancient 32bit x86s they are removed. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Andreas Rheinhardt	5eee930726	avcodec/x86/hevcdsp_init: Remove unnecessary inclusion of get_bits.h This file does not use anything from get_bits.h at all; furthermore hevcdsp.h now includes get_bits.h itself. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	3 years ago
Wu Jianhua	037fa0437d	avcodec/x86/hevc_mc: add qpel_h64_8_avx512icl ff_hevc_put_hevc_qpel_h64_8_sse4 56782981 ff_hevc_put_hevc_qpel_h64_8_avx2 40097816 ff_hevc_put_hevc_qpel_h64_8_avx512icl 25488576 Reviewed-by: Henrik Gramner <henrik@gramner.com> Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>	3 years ago
Wu Jianhua	68437bf169	avcodec/x86/hevc_mc: add qpel_h32_8_avx512icl ff_hevc_put_hevc_qpel_h32_8_sse4 14122151 ff_hevc_put_hevc_qpel_h32_8_avx2 9337675 ff_hevc_put_hevc_qpel_h32_8_avx512icl 6424654 Reviewed-by: Henrik Gramner <henrik@gramner.com> Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>	3 years ago
Wu Jianhua	6fbb8cc8ad	avcodec/x86/hevc_mc: add qpel_h4_8_avx512icl ff_hevc_put_hevc_qpel_h4_8_sse4 993694 ff_hevc_put_hevc_qpel_h4_8_avx512icl 686647 Reviewed-by: Henrik Gramner <henrik@gramner.com> Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>	3 years ago
Wu Jianhua	c1790b60d6	avcodec/x86/hevc_mc: add qpel_h16_8_avx512icl ff_hevc_put_hevc_qpel_h16_8_sse4 3290870 ff_hevc_put_hevc_qpel_h16_8_avx512icl 1730033 Reviewed-by: Henrik Gramner <henrik@gramner.com> Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>	3 years ago
Wu Jianhua	d4cd8830bd	avcodec/x86/hevc_mc: add qpel_h8_8_avx512icl and qpel_hv8_8_avx512icl This commit uses the instruction `vpdpbusd` introduced by AVX512 VNNI to calculate the horizontal filter. ff_hevc_put_hevc_qpel_h8_8_sse4 1039169 ff_hevc_put_hevc_qpel_h8_8_avx512icl 677153 ff_hevc_put_hevc_qpel_hv8_8_sse4 3603511 ff_hevc_put_hevc_qpel_hv8_8_avx512icl 2995354 Reviewed-by: Henrik Gramner <henrik@gramner.com> Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>	3 years ago
Anton Khirnov	c8c2dfbc37	lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h That is a more appropriate place for it.	4 years ago
Clément Bœsch	7c300a8ed4	lavc/hevc: remove a few random spaces to reduce diff with libav	8 years ago
Pierre Edouard Lepere	6d5636ad9a	hevc: x86: Add add_residual() SIMD optimizations Initially written by Pierre Edouard Lepere <Pierre-Edouard.Lepere@insa-rennes.fr>, extended by James Almer <jamrial@gmail.com>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>	8 years ago
Alexandra Hájková	112cee0241	hevc: Add SSE2 and AVX IDCT Signed-off-by: Anton Khirnov <anton@khirnov.net>	8 years ago
James Almer	fca3c3b619	hevc: Add AVX2 DC IDCT Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>. Integrated to Libav by Josh de Kock <josh@itanimul.li>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>	9 years ago
Diego Biurrun	257b30af8e	x86: hevc: Fix linking with both yasm and optimizations disabled Some optimized functions reference optimized symbols, so the functions must be explicitly disabled when those symbols are unavailable.	9 years ago
James Almer	70d685a77f	x86: use the new helper macros where useful Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>	9 years ago
James Almer	d4c47333e1	x86/hevc_sao: add ff_hevc_sao_edge_filter_{8,16}_{10,12} Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	9 years ago
Anton Khirnov	e7078e842d	hevcdsp: add x86 SIMD for MC	9 years ago
Ganesh Ajjanagadde	38f4e973ef	all: fix -Wextra-semi reported on clang This fixes extra semicolons that clang 3.7 on GNU/Linux warns about. These were triggered when built under -Wpedantic, which essentially checks for strict ISO compliance in numerous ways. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>	9 years ago
Christophe Gisquet	b533949813	x86: hevc: remove a parameter to WP internals The second stride is always the internal buffer one, MAX_PB_SIZE (times 2 to get the value in bytes). Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
James Almer	14b44c1614	x86/hevc_sao: make sao_edge_filter_{10,12} work on x86_32 Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	10 years ago
James Almer	06fe6dfe12	x86/hevc_sao: make sao_band_filter work on x86_32 Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	10 years ago
Christophe Gisquet	5eedd36df1	x86: hevc_mc: use epel_hv 16-wide function The epel_hv functions were still relying on only epel_hv 8-wide being the maximum width instantiated. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
Pierre Edouard Lepere	a0d1300f71	x86: hevc_mc: add AVX2 optimizations before 33304 decicycles in luma_bi_1, 523066 runs, 1222 skips 38138 decicycles in luma_bi_2, 523427 runs, 861 skips 13490 decicycles in luma_uni, 516138 runs, 8150 skips after 20185 decicycles in luma_bi_1, 519970 runs, 4318 skips 24620 decicycles in luma_bi_2, 521024 runs, 3264 skips 10397 decicycles in luma_uni, 515715 runs, 8573 skips Conflicts: libavcodec/x86/hevc_mc.asm libavcodec/x86/hevcdsp_init.c Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	10 years ago
James Almer	15574c505b	x86/hevcdsp: add ff_hevc_sao_edge_filter_{10,12}_{sse2,avx2} Original x86 intrinsics code by Pierre-Edouard Lepere. Yasm port, refactoring and optimizations by James Almer. Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U Width 32 342694 decicycles in sao_edge_filter_10, 16384 runs, 0 skips 29476 decicycles in ff_hevc_sao_edge_filter_32_10_ssse3, 16384 runs, 0 skips 13996 decicycles in ff_hevc_sao_edge_filter_32_10_avx2, 16381 runs, 3 skips Width 64 581163 decicycles in sao_edge_filter_10, 8192 runs, 0 skips 59774 decicycles in ff_hevc_sao_edge_filter_64_10_ssse3, 8192 runs, 0 skips 28383 decicycles in ff_hevc_sao_edge_filter_64_10_avx2, 8191 runs, 1 skips Signed-off-by: James Almer <jamrial@gmail.com>	10 years ago
James Almer	042c1159fc	x86/hevcdsp: add ff_hevc_sao_edge_filter_8_{ssse3,avx2} Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere. Refactoring and optimizations by James Almer. Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U Width 32 158583 decicycles in edge, sao_edge_filter_8 runs, 0 skips 5205 decicycles in ff_hevc_sao_edge_filter_32_8_ssse3, 32767 runs, 1 skips 2942 decicycles in ff_hevc_sao_edge_filter_32_8_avx2, 32767 runs, 1 skips Width 64 705639 decicycles in sao_edge_filter_8, 262144 runs, 0 skips 19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips 10433 decicycles in ff_hevc_sao_edge_filter_64_8_avx2, 262115 runs, 29 skips Signed-off-by: James Almer <jamrial@gmail.com>	10 years ago
James Almer	fa3eccb4f9	x86/hevc: add ff_hevc_sao_band_filter_{8,10,12}_{sse2,avx,avx2} Original x86 intrinsics code and initial 8bit yasm port by Pierre-Edouard Lepere. 10/12bit yasm ports, refactoring and optimizations by James Almer Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U width 32 40338 decicycles in sao_band_filter_0_8, 2048 runs, 0 skips 8056 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 2048 runs, 0 skips 7458 decicycles in ff_hevc_sao_band_filter_8_32_avx, 2048 runs, 0 skips 4504 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 2048 runs, 0 skips width 64 136046 decicycles in sao_band_filter_0_8, 16384 runs, 0 skips 28576 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 16384 runs, 0 skips 26707 decicycles in ff_hevc_sao_band_filter_8_32_avx, 16384 runs, 0 skips 14387 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 16384 runs, 0 skips Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	10 years ago
James Almer	c3d2426cca	x86/hevc_res_add: add ff_hevc_transform_add32_8_avx2 ~20% faster than AVX. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>	11 years ago
Christophe Gisquet	3e892b2bcd	x86: hevc_mc: split differently calls In some cases, 2 or 3 calls are performed to functions for unusual widths. Instead, perform 2 calls for different widths to split the workload. The 8+16 and 4+8 widths for respectively 8 and more than 8 bits can't be processed that way without modifications: some calls use unaligned buffers, and having branches to handle this was resulting in no micro-benchmark benefit. For block_w == 12 (around 1% of the pixels of the sequence): Before: 12758 decicycles in epel_uni, 4093 runs, 3 skips 19389 decicycles in qpel_uni, 8187 runs, 5 skips 22699 decicycles in epel_bi, 32743 runs, 25 skips 34736 decicycles in qpel_bi, 32733 runs, 35 skips After: 11929 decicycles in epel_uni, 4096 runs, 0 skips 18131 decicycles in qpel_uni, 8184 runs, 8 skips 20065 decicycles in epel_bi, 32750 runs, 18 skips 31458 decicycles in qpel_bi, 32753 runs, 15 skips Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	dad7f15567	hevcdsp: remove more instances of compile-time-fixed parameters Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	d4f44b66d3	hevcdsp: remove compilation-time-fixed parameter The dststride parameter is always MAX_PB_SIZE. Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	54ca4dd43b	x86/hevc_res_add: refactor ff_hevc_transform_add{16,32}_8 * Reduced xmm register count to 7 (As such they are now enabled for x86_32). * Removed four movdqa (affects the sse2 version only). * pxor is now used to clear m0 only once. ~5% faster. Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	11 years ago
James Almer	76a99d467f	x86/hecv_res_add: add ff_hevc_transform_add{8,16,32}_8_avx ~15% faster than sse2 Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	11 years ago
Pierre Edouard Lepere	a6af4bf64d	x86: hevc: adding transform_add Reviewed-by: James Almer <jamrial@gmail.com> Approved-by: Ronald S. Bultje Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	73c4f63ba5	x86/hevc_deblock: add ff_hevc_[hv]_loop_filter_luma_{8, 10, 12}_avx ~5% faster than SSSE3 Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	bfb3b2b7a6	x86/hevc_idct: add 12bit idct_dc Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	d4a9e89b27	avcodec/x86/hevcdsp_init: make license header consistent Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
James Almer	1ace9573dc	x86/hevc_idct: replace old and unused idct functions Only 8-bit and 10-bit idct_dc() functions are included (adding others should be trivial). Benchmarks on an Intel Core i5-4200U: idct8x8_dc SSE2 MMXEXT C cycles 22 26 57 idct16x16_dc AVX2 SSE2 C cycles 27 32 249 idct32x32_dc AVX2 SSE2 C cycles 62 126 1375 Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Pierre Edouard Lepere	1a880b2fb8	hevc: SSE2 and SSSE3 loop filters Additional contributions by James Almer <jamrial@gmail.com>, Carl Eugen Hoyos <cehoyos@ag.or.at>, Fiona Glaser <fiona@x264.com> and Anton Khirnov <anton@khirnov.net> Signed-off-by: Anton Khirnov <anton@khirnov.net>	11 years ago
Mickaël Raulet	bd0f2d316f	x86/hevc: add 12bits support for MC cherry picked from commit 3fcb7a4595a6f40100a22110a5805e3b7510c0fd Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Mickaël Raulet	7bdcf5c934	x86/hevc: add 12bits support for deblocking filter cherry picked from commit 97d46afe320c7d61d7b9525e5f5588355cde4bb0 Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Christophe Gisquet	670b7f203a	x86: hevcdsp: align Reviewed-by: Mickaël Raulet <mraulet@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
Michael Niedermayer	ca6b33b8bd	avcodec/x86/hevcdsp_init: Fix "warning: assignment from incompatible pointer type"	11 years ago
James Almer	276bef5340	x86/hevc_deblock: add ff_hevc_[hv]_loop_filter_luma_{8, 10}_sse2 Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: Kieran Kunhya <kierank@obe.tv> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago
plepere	942e22c651	avcodec/x86/hevc: add avx2 dc idct Signed-off-by: Michael Niedermayer <michaelni@gmx.at>	11 years ago


63 Commits (ee419804da2a6a44a4af5d949869f0e98306d2fc)
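
The 2dc8221e66 entry above notes that HAVE_AVX2_EXTERNAL has a value, so the preprocessor check has to test that value. The following is a minimal C sketch of that distinction, not the actual change from the commit. The macro name HAVE_FEATURE_X and both helper functions are hypothetical, and the sketch assumes a configuration header that always defines feature macros, as either 0 or 1.

```c
/* Assumption: stands in for a generated config header in which every
 * feature macro is always defined, with the value 0 (off) or 1 (on).
 * HAVE_FEATURE_X is a hypothetical name used only for illustration. */
#define HAVE_FEATURE_X 0

/* Tests only whether the macro is defined. Because the macro is
 * defined even when the feature is off, this branch is always taken. */
int enabled_by_ifdef(void)
{
#ifdef HAVE_FEATURE_X
    return 1;
#else
    return 0;
#endif
}

/* Tests the macro's value, so a feature configured as 0 stays off. */
int enabled_by_if(void)
{
#if HAVE_FEATURE_X
    return 1;
#else
    return 0;
#endif
}
```

With the macro defined as 0, the definedness test still reports the feature as enabled while the value test does not, which is the kind of mismatch a value check avoids.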