FFmpeg

Mirror of https://git.ffmpeg.org/ffmpeg.git https://ffmpeg.org/

Martin Storsjö f872b19714 aarch64: hevc: Produce plain neon versions of qpel_bi_hv		8 months ago

As the plain neon qpel_h functions process two rows at a time,
we need to allocate storage for h+8 rows instead of h+7.

By allocating storage for h+8 rows, incrementing the stack
pointer won't end up at the right spot in the end. Store the
intended final stack pointer value in a register x14 which we
store on the stack.

AWS Graviton 3:
put_hevc_qpel_bi_hv4_8_c: 385.7
put_hevc_qpel_bi_hv4_8_neon: 131.0
put_hevc_qpel_bi_hv4_8_i8mm: 92.2
put_hevc_qpel_bi_hv6_8_c: 701.0
put_hevc_qpel_bi_hv6_8_neon: 239.5
put_hevc_qpel_bi_hv6_8_i8mm: 191.0
put_hevc_qpel_bi_hv8_8_c: 1162.0
put_hevc_qpel_bi_hv8_8_neon: 228.0
put_hevc_qpel_bi_hv8_8_i8mm: 225.2
put_hevc_qpel_bi_hv12_8_c: 2305.0
put_hevc_qpel_bi_hv12_8_neon: 558.0
put_hevc_qpel_bi_hv12_8_i8mm: 483.2
put_hevc_qpel_bi_hv16_8_c: 3965.2
put_hevc_qpel_bi_hv16_8_neon: 732.7
put_hevc_qpel_bi_hv16_8_i8mm: 656.5
put_hevc_qpel_bi_hv24_8_c: 8709.7
put_hevc_qpel_bi_hv24_8_neon: 1555.2
put_hevc_qpel_bi_hv24_8_i8mm: 1448.7
put_hevc_qpel_bi_hv32_8_c: 14818.0
put_hevc_qpel_bi_hv32_8_neon: 2763.7
put_hevc_qpel_bi_hv32_8_i8mm: 2468.0
put_hevc_qpel_bi_hv48_8_c: 32855.5
put_hevc_qpel_bi_hv48_8_neon: 6107.2
put_hevc_qpel_bi_hv48_8_i8mm: 5452.7
put_hevc_qpel_bi_hv64_8_c: 57591.5
put_hevc_qpel_bi_hv64_8_neon: 10660.2
put_hevc_qpel_bi_hv64_8_i8mm: 9580.0

Signed-off-by: Martin Storsjö <martin@martin.st>
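
The h+7 versus h+8 row count in the commit message can be illustrated with a small C sketch. It is not taken from the assembly in hevcdsp_qpel_neon.S. The helper names (rows_needed, rows_written, tmp_bytes) are hypothetical, and both the 7 extra rows for the vertical 8-tap filter and the 128-byte row stride (64 int16_t samples) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the vertical 8-tap pass reads 7 rows beyond the block height. */
#define QPEL_EXTRA_ROWS 7
/* Assumption: one intermediate row holds 64 int16_t samples (128 bytes). */
#define TMP_ROW_BYTES (64 * sizeof(int16_t))

/* Rows of horizontally filtered data the vertical pass consumes. */
static int rows_needed(int h)
{
    return h + QPEL_EXTRA_ROWS;
}

/*
 * Rows the horizontal pass actually writes when it emits rows_per_iter
 * rows per loop iteration: rows_needed() rounded up to a whole number
 * of iterations.
 */
static int rows_written(int h, int rows_per_iter)
{
    int n = rows_needed(h);
    return (n + rows_per_iter - 1) / rows_per_iter * rows_per_iter;
}

/* Bytes of temporary storage to reserve for the intermediate rows. */
static size_t tmp_bytes(int h, int rows_per_iter)
{
    return (size_t)rows_written(h, rows_per_iter) * TMP_ROW_BYTES;
}
```

Under these assumptions, a one-row-at-a-time horizontal pass writes h+7 rows, while a two-rows-at-a-time pass writes h+8 rows for any even h. If the vertical pass releases storage only for the h+7 rows it reads, the stack pointer ends one row short of its starting value, which is consistent with the commit saving the intended final stack pointer instead of relying on the increments.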
..
Makefile	avcodec: Remove DCT, FFT, MDCT and RDFT	1 year ago
aacpsdsp_init_aarch64.c	…
aacpsdsp_neon.S	aarch64: Reindent all assembly to 8/24 column indentation	1 year ago
cabac.h	…
fmtconvert_init.c	…
fmtconvert_neon.S	…
h264chroma_init_aarch64.c	…
h264cmc_neon.S	aarch64: Lowercase UXTW/SXTW and similar flags	1 year ago
h264dsp_init_aarch64.c	…
h264dsp_neon.S	aarch64: Make the indentation more consistent	1 year ago
h264idct_neon.S	aarch64: Lowercase UXTW/SXTW and similar flags	1 year ago
h264pred_init.c	…
h264pred_neon.S	…
h264qpel_init_aarch64.c	lavc/aarch64: h264qpel, add 10-bit lowpass_8_10 based functions	12 months ago
h264qpel_neon.S	lavc/aarch64: h264qpel, add 10-bit lowpass_8_10 based functions	12 months ago
hevcdsp_deblock_neon.S	avcodec/aarch64/hevc: add luma deblock NEON	9 months ago
hevcdsp_epel_neon.S	aarch64: hevc: Produce epel_bi_hv functions for both neon and i8mm	8 months ago
hevcdsp_idct_neon.S	aarch64: Make the indentation more consistent	1 year ago
hevcdsp_init_aarch64.c	aarch64: hevc: Produce plain neon versions of qpel_bi_hv	8 months ago
hevcdsp_qpel_neon.S	aarch64: hevc: Produce plain neon versions of qpel_bi_hv	8 months ago
hevcdsp_sao_neon.S	…
hpeldsp_init_aarch64.c	…
hpeldsp_neon.S	aarch64: Consistently use lowercase for vector element specifiers	1 year ago
idct.h	…
idctdsp_init_aarch64.c	…
idctdsp_neon.S	…
me_cmp_init_aarch64.c	…
me_cmp_neon.S	aarch64: Consistently use lowercase for vector element specifiers	1 year ago
mpegaudiodsp_init.c	…
mpegaudiodsp_neon.S	lavc/hevcdsp_qpel_neon: using movi.16b instead of movi.2d	12 months ago
neon.S	aarch64: Consistently use lowercase for vector element specifiers	1 year ago
neontest.c	…
opusdsp_init.c	…
opusdsp_neon.S	aarch64: Reindent all assembly to 8/24 column indentation	1 year ago
pixblockdsp_init_aarch64.c	…
pixblockdsp_neon.S	…
rv40dsp_init_aarch64.c	…
sbrdsp_init_aarch64.c	…
sbrdsp_neon.S	aarch64: Consistently use lowercase for vector element specifiers	1 year ago
simple_idct_neon.S	aarch64: Consistently use lowercase for vector element specifiers	1 year ago
synth_filter_init.c	avcodec: Remove DCT, FFT, MDCT and RDFT	1 year ago
synth_filter_neon.S	avcodec: Remove DCT, FFT, MDCT and RDFT	1 year ago
vc1dsp_init_aarch64.c	…
vc1dsp_neon.S	…
videodsp.S	…
videodsp_init.c	…
vorbisdsp_init.c	…
vorbisdsp_neon.S	…
vp8dsp.h	…
vp8dsp_init_aarch64.c	…
vp8dsp_neon.S	aarch64: Make the indentation more consistent	1 year ago
vp9dsp_init.h	…
vp9dsp_init_10bpp_aarch64.c	…
vp9dsp_init_12bpp_aarch64.c	…
vp9dsp_init_16bpp_aarch64_template.c	…
vp9dsp_init_aarch64.c	…
vp9itxfm_16bpp_neon.S	…
vp9itxfm_neon.S	…
vp9lpf_16bpp_neon.S	…
vp9lpf_neon.S	…
vp9mc_16bpp_neon.S	…
vp9mc_aarch64.S	…
vp9mc_neon.S	…