FFmpeg

Commit Graph

Author	SHA1	Message	Date
Rémi Denis-Courmont	f883746587	lavc/flacdsp: do not assume maximum R-V VL This loop correctly assumes that VLMAX=16 (4x128-bit vectors with 32-bit elements) and 32 >= pred_order > 16. We need to alternate between VL=16 and VL=t2=pred_order-16 elements to add up to pred_order. The current code requests AVL=a2=pred_order elements. In QEMU and on the K230 hardware, this sets VL=16 as we need. But the specification merely guarantees that we get: ceil(AVL / 2) <= VL <= VLMAX. For instance, if pred_order equals 27, we could end up with VL=14 or VL=15 instead of VL=16. So instead, request literally VLMAX=16.	8 months ago
Andreas Rheinhardt	aff24c1658	avcodec/flacdec: Remove unused variable Forgotten in `0380a03f1f`. Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	8 months ago
Rémi Denis-Courmont	ba38d0e328	lavc/pixblockdsp: add scalar get_pixels_unaligned The code is already there, we just need to use it. get_pixels_unaligned_c: 2.2 get_pixels_unaligned_misaligned: 1.7	8 months ago
James Almer	0380a03f1f	avcodec/flacdsp: split off lpc33 into a dsp function Signed-off-by: James Almer <jamrial@gmail.com>	8 months ago
Haihao Xiang	8155808ce6	libavcodec/x86/vvc/vvc_sad: fix assembler error X86ASM libavcodec/x86/vvc/vvc_sad.o libavcodec/x86/vvc/vvc_sad.asm:85: error: invalid number of operands libavcodec/x86/vvc/vvc_sad.asm:87: error: invalid number of operands Signed-off-by: Haihao Xiang <haihao.xiang@intel.com> Signed-off-by: James Almer <jamrial@gmail.com>	8 months ago
Stone Chen	0e52a4e434	libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC Implements AVX2 DMVR (decoder-side motion vector refinement) SAD functions. DMVR SAD is only calculated if w >= 8, h >= 8, and w * h > 128. To reduce complexity, SAD is only calculated on even rows. This is calculated for all video bitdepths, but the values passed to the function are always 16bit (even if the original video bitdepth is 8). The AVX2 implementation uses min/max/sub. Additionally this changes parameters dx and dy from int to intptr_t. This allows dx & dy to be used as pointer offsets without needing to use movsxd. Benchmarks ( AMD 7940HS ) Before: BQTerrace_1920x1080_60_10_420_22_RA.vvc \| 106.0 \| Chimera_8bit_1080P_1000_frames.vvc \| 204.3 \| NovosobornayaSquare_1920x1080.bin \| 197.3 \| RitualDance_1920x1080_60_10_420_37_RA.266 \| 174.0 \| After: BQTerrace_1920x1080_60_10_420_22_RA.vvc \| 109.3 \| Chimera_8bit_1080P_1000_frames.vvc \| 216.0 \| NovosobornayaSquare_1920x1080.bin \| 204.0\| RitualDance_1920x1080_60_10_420_37_RA.266 \| 181.7 \| Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	8 months ago
Rémi Denis-Courmont	910d281b21	lavc/h263dsp: R-V V {h,v}_loop_filter Since the horizontal and vertical filters are identical except for a transposition, this uses a common subprocedure with an ad-hoc ABI. To preserve return-address stack prediction, a link register has to be used (cf. the "Control Transfer Instructions" from the RISC-V ISA Manual). The alternate/temporary link register T0 is used here, so that the normal RA is preserved (something Arm cannot do!). To load the strength value based on `qscale`, the shortest possible and PIC-compatible sequence is used: AUIPC; ADD; LBU. The classic LLA; ADD; LBU sequence would add one more instruction since LLA is a convenience alias for AUIPC; ADDI. To ensure that this trick works, relocation relaxation is disabled. To implement the two signed divisions by a power of two toward zero: (x / (1 << SHIFT)) the code relies on the small range of integers involved, computing: (x + (x >> (16 - SHIFT))) >> SHIFT rather than the more general: (x + ((x >> (16 - 1)) & ((1 << SHIFT) - 1))) >> SHIFT Thus one ANDI instruction is avoided. T-Head C908: h263dsp.h_loop_filter_c: 228.2 h263dsp.h_loop_filter_rvv_i32: 144.0 h263dsp.v_loop_filter_c: 242.7 h263dsp.v_loop_filter_rvv_i32: 114.0 (C is probably worse in real use due to less predictable branches.)	8 months ago
James Almer	3d1597d3e2	x86/vvc_alf: use the x86inc instruction macros Let its magic figure out the correct mnemonic based on target instruction set. Signed-off-by: James Almer <jamrial@gmail.com>	8 months ago
sunyuechi	0c1304ae11	lavc/vp9dsp: R-V V mc avg C908: vp9_avg4_8bpp_c: 1.2 vp9_avg4_8bpp_rvv_i64: 1.0 vp9_avg8_8bpp_c: 3.7 vp9_avg8_8bpp_rvv_i64: 1.5 vp9_avg16_8bpp_c: 14.7 vp9_avg16_8bpp_rvv_i64: 3.5 vp9_avg32_8bpp_c: 57.7 vp9_avg32_8bpp_rvv_i64: 10.0 vp9_avg64_8bpp_c: 229.0 vp9_avg64_8bpp_rvv_i64: 31.7 Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>	9 months ago
Rémi Denis-Courmont	7591eb4055	Revert "lavc/sbrdsp: R-V V neg_odd_64" While this function can easily be written with vectors, it just fails to get any performance improvement. For reference, this is a simpler loop-free implementation that does get better performance than the current one depending on hardware, but still more or less the same metrics as the C code: func ff_sbr_neg_odd_64_rvv, zve64x li a1, 32 addi a0, a0, 7 li t0, 8 vsetvli zero, a1, e8, m2, ta, ma li t1, 0x80 vlse8.v v8, (a0), t0 vxor.vx v8, v8, t1 vsse8.v v8, (a0), t0 ret endfunc This reverts commit `d06fd18f8f`.	9 months ago
Rémi Denis-Courmont	d452db8410	lavc/vc1dsp: R-V V vc1_unescape_buffer Notes: - The loop is biased toward no unescaped bytes as that should be most common. - The input byte array is slid rather than the (8 times smaller) bit-mask, as RISC-V V does not provide a bit-mask (or bit-wise) slide instruction. - There are two comparisons with 0 per iteration, for the same reason. - In case of match, bytes are copied until the first match, and the loop is restarted after the escape byte. Vector compression (vcompress.vm) could discard all escape bytes but that is slower if escape bytes are rare. Further optimisations should be possible, e.g.: - processing 2 bytes fewer per iteration to get rid of 2 slides, - taking a short cut if the input vector contains fewer than 2 zeroes. But this is a good starting point: T-Head C908: vc1dsp.vc1_unescape_buffer_c: 12749.5 vc1dsp.vc1_unescape_buffer_rvv_i32: 6009.0 SpacemiT X60: vc1dsp.vc1_unescape_buffer_c: 11038.0 vc1dsp.vc1_unescape_buffer_rvv_i32: 2061.0	9 months ago
Nuo Mi	1b33c9a50a	avcodec/vvcdec: support Reference Picture Resampling passed clips: RPR_A_Alibaba_4.bit RPR_B_Alibaba_3.bit RPR_C_Alibaba_3.bit RPR_D_Qualcomm_1.bit VVC_HDR_UHDTV1_OpenGOP_Max3840x2160_50fps_HLG10_res_change_with_RPR.ts	9 months ago
Nuo Mi	cae0b01282	avcodec/vvcdec: increase edge_emu_buffer for RPR	9 months ago
Nuo Mi	7904ec2d34	avcodec/vvcdec: refact, remove hf_idx and vf_idx from mc_xxx's param list	9 months ago
Nuo Mi	77d971c348	avcodec/vvcdec: refact out luma_prof from luma_prof_bi	9 months ago
Nuo Mi	ac4575594f	avcodec/vvcdec: fix dmvr, bdof, cb_prof for RPR	9 months ago
Nuo Mi	77acd0a0dd	avcodec/vvcdec: inter, wait reference with a different resolution For RPR, the current frame may reference a frame with a different resolution. Therefore, we need to consider frame scaling when we wait for reference pixels.	9 months ago
Nuo Mi	deda59a996	avcodec/vvcdec: add RPR dsp	9 months ago
Nuo Mi	e70225e0a8	avcodec/vvcdec: emulated_edge, use reference frame's sps and pps a preparation for Reference Picture Resampling	9 months ago
Nuo Mi	aa8d5c6e7e	avcodec/vvcdec: add vvc inter filters for RPR	9 months ago
Nuo Mi	08ad51ece6	avcodec/vvcdec: refact, pred_get_refs return VVCRefPic instead of VVCFrame	9 months ago
Nuo Mi	66c6bee061	avcodec/vvcdec: refact out VVCRefPic from RefPicList	9 months ago
Nuo Mi	44bbafb69f	avcodec/vvcdec: refact, unify pred_regular_{luma, chroma} to pred_regular	9 months ago
Nuo Mi	875fa9692c	avcodec/vvcdec: misc, remove unused EMULATED_EDGE_{LUMA, CHROMA}, EMULATED_EDGE_DMVR_{LUMA, CHROMA}	9 months ago
Nuo Mi	84a93d91d1	avcodec/vvcdec: refact, unify {luma, chroma}_mc_bi to mc_bi	9 months ago
Nuo Mi	6769fe1614	avcodec/vvcdec: refact, unify {luma, chroma}_mc_uni to mc_uni	9 months ago
Nuo Mi	bc099afc8d	avcodec/vvcdec: refact, unify {luma, chroma}_mc to mc	9 months ago
Nuo Mi	1289da9244	avcodec/vvcdec: misc, inter, use is_chroma instead of is_luma	9 months ago
David Rosca	f7a1453f27	lavc/vaapi_decode: Reject decoding of frames with no slices Matches other hwaccels.	9 months ago
James Almer	b113050d96	avcodec/cbs_h266: read vps_ptl_max_tid before using it Reviewed-by: Nuo Mi <nuomi2021@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	9 months ago
Andreas Rheinhardt	2c94b1bbf1	avcodec/tiff: Fix leak on error Fixes Coverity issue #1516957. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	9 months ago
Andreas Rheinhardt	59b1838e09	avcodec/ac3enc: Move transient PutBitContext to stack Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	9 months ago
Andreas Rheinhardt	e863cbceae	avcodec/ac3enc_template: Avoid always-true check This might also help Coverity with issue #1596532. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	9 months ago
Andreas Rheinhardt	482afe8f3f	avcodec/lib*, avformat/tee: Simplify iterating over AVDictionary Reviewed-by: epirat07@gmail.com Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	9 months ago
Andreas Rheinhardt	a2874c5721	avcodec/aac_ac3_parser: Use ff_adts_header_parse_buf() instead of avpriv_adts_header_parse(). Using the former avoids an indirection. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	9 months ago
Andreas Rheinhardt	12ded9cd85	avcodec/adts_header: Add ff_adts_header_parse_buf() Most users of ff_adts_header_parse() don't already have an opened GetBitContext for the header, so add a convenience function for them. Also use a forward declaration of GetBitContext in adts_header.h as this avoids (implicit) inclusion of get_bits.h in some of the users that now no longer use a GetBitContext of their own. Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	9 months ago
Andreas Rheinhardt	ae937c4902	avcodec/aac_ac3_parser: Untangle AAC and AC3 parsing error codes Also remove the (unused) AAC_AC3_PARSE_ERROR_CHANNEL_CFG while at it; furthermore, fix the documentation of ff_ac3_parse_header() and (ff\|avpriv)_adts_header_parse(). Reviewed-by: Andrew Sayers <ffmpeg-devel@pileofstuff.org> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	9 months ago
Andreas Rheinhardt	6c812a80dd	avcodec/adts_parser: Don't presume buffer to be padded The documentation of av_adts_header_parse() does not require the buffer to be padded at all. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	9 months ago
Haihao Xiang	a00cfc6c24	lavc/qsvdec: require a dynamic frame pool if possible This allows a downstream element to store more frames from qsv decoders and fixes an error in get_buffer(). $ ffmpeg -hwaccel qsv -hwaccel_output_format qsv -i input.mp4 -vf reverse -f null - [vist#0:0/h264 @ 0x562248f12c50] Decoding error: Cannot allocate memory [h264_qsv @ 0x562248f66b10] get_buffer() failed Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>	9 months ago
Haihao Xiang	75015f9b0e	lavc/qsvenc: use the right info for encoding Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>	9 months ago
Haihao Xiang	cda721e01d	lavc/qsv: fix the mfx allocator to support dynamic frame pool When the external allocator is used for dynamic frame allocation, only video memory is supported, the SDK doesn't lock/unlock the memory block via Lock()/Unlock() calls. Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>	9 months ago
Michael Niedermayer	e35fe3d8b9	avcodec/mscc & mwsc: Check loop counts before use This could cause timeouts Fixes: CID1439568 Untrusted loop bound Sponsored-by: Sovereign Tech Fund Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 months ago
Michael Niedermayer	b6b2b01025	avcodec/mpegvideo_enc: Fix potential overflow in RD Fixes: CID1500285 Unintentional integer overflow Sponsored-by: Sovereign Tech Fund Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 months ago
Michael Niedermayer	8fc649b931	avcodec/mpeg4videodec: assert impossible wrap points Helps: CID1473517 Uninitialized scalar variable Helps: CID1473497 Uninitialized scalar variable Sponsored-by: Sovereign Tech Fund Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 months ago
Michael Niedermayer	4c725df059	avcodec/mpeg12dec: Use 64bit in bit computation I don't think this can actually overflow but 64bit seems reasonable to use Fixes: CID1521983 Unintentional integer overflow Sponsored-by: Sovereign Tech Fund Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 months ago
Michael Niedermayer	6a9302739f	avcodec/vqcdec: Check init_get_bits8() for failure Fixes: CID1516090 Unchecked return value Sponsored-by: Sovereign Tech Fund Reviewed-by: Peter Ross <pross@xvid.org> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 months ago
Michael Niedermayer	4a8506c794	avcodec/vvc/dec: Check init_get_bits8() for failure Fixes: CID1560042 Unchecked return value Sponsored-by: Sovereign Tech Fund Reviewed-by: Nuo Mi <nuomi2021@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 months ago
Michael Niedermayer	dd5379db5d	avcodec/vble: Check av_image_get_buffer_size() for failure Fixes: CID1461482 Improper use of negative value Sponsored-by: Sovereign Tech Fund Reviewed-by: "Xiang, Haihao" <haihao.xiang@intel.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 months ago
Michael Niedermayer	1b991e77b9	avcodec/vp3: Replace check by assert Fixes: CID1452425 Logically dead code Sponsored-by: Sovereign Tech Fund Reviewed-by: Peter Ross <pross@xvid.org> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 months ago
Michael Niedermayer	63feed1519	avcodec/vp8: Forward return of ff_vpx_init_range_decoder() Fixes: CID1507483 Unchecked return value Sponsored-by: Sovereign Tech Fund Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 months ago
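
Note on f883746587: the vector-length rule quoted in that commit message can be illustrated with a small sketch. This is not FFmpeg code. The function name `permitted_vl` is hypothetical, and the three cases follow the constraints the RISC-V V specification places on `vsetvli` as I understand them: VL equals AVL when AVL <= VLMAX, VL may be anything from ceil(AVL / 2) to VLMAX when VLMAX < AVL < 2 * VLMAX, and VL equals VLMAX otherwise.

```python
def permitted_vl(avl: int, vlmax: int) -> range:
    """Return the VL values a conforming implementation may choose.

    Hypothetical helper for illustration; it models the vsetvli
    constraints, not any particular hardware.
    """
    if avl <= vlmax:
        # The whole request fits: VL must equal AVL.
        return range(avl, avl + 1)
    if avl < 2 * vlmax:
        # Implementations may balance the last two iterations:
        # ceil(AVL / 2) <= VL <= VLMAX.
        return range((avl + 1) // 2, vlmax + 1)
    # At least two full vectors remain: VL must be VLMAX.
    return range(vlmax, vlmax + 1)


if __name__ == "__main__":
    # pred_order = 27 with VLMAX = 16, the case from the commit message.
    print(list(permitted_vl(27, 16)))
    # Requesting exactly VLMAX leaves no freedom.
    print(list(permitted_vl(16, 16)))
```

With AVL=27 and VLMAX=16 the sketch allows 14, 15 or 16, which is why requesting AVL=16 directly is the only way to be sure of VL=16.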

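Note on 910d281b21: the shortened division toward zero described in that commit message can be checked numerically. The sketch below is not taken from the FFmpeg source and all function names are hypothetical. It rests on two assumptions of mine that the message does not spell out: the inner shift `x >> (16 - SHIFT)` is a logical shift on 16-bit elements, and the "small range" means -2^(16 - SHIFT) <= x < 2^(16 - SHIFT).

```python
def div_trunc(x: int, shift: int) -> int:
    """Reference: signed division by 2**shift, rounding toward zero."""
    q = abs(x) >> shift
    return q if x >= 0 else -q


def div_general(x: int, shift: int) -> int:
    """General form: add (2**shift - 1) to negative inputs only."""
    sign = x >> 15                      # arithmetic shift: 0 or -1
    bias = sign & ((1 << shift) - 1)    # the masking step the short form avoids
    return (x + bias) >> shift


def div_short(x: int, shift: int) -> int:
    """Short form, assuming a logical inner shift on 16-bit values."""
    bias = (x & 0xFFFF) >> (16 - shift)  # top `shift` bits of the 16-bit pattern
    return (x + bias) >> shift


if __name__ == "__main__":
    for shift in (1, 2, 3):
        limit = 1 << (16 - shift)
        ok = all(
            div_short(x, shift) == div_general(x, shift) == div_trunc(x, shift)
            for x in range(-limit, limit)
        )
        print(shift, ok)
```

Inside the assumed range the top SHIFT bits of a negative value are all ones, so the logical shift already yields 2^SHIFT - 1 and no mask is needed. Outside that range the short form can give a different result.
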

50131 Commits (95faf45af16b15d55b3d9f8f3244d1437649d763)