This is more a style fix than a bugfix (CID1604392 Overflowed constant)
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Found while reviewing CID1608712 Explicit null dereferenced
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Found by code review related to CID1604563 Overflowed return value
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Found while reviewing code related to CID1604409 Overflowed return value
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Found by code review related to CID1604386 Overflowed constant
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: left shift of negative value -208
Fixes: 69073/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PRORES_KS_fuzzer-4745020002336768
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Otherwise, slice index will never update for hwaccel decode, and slice
RPL will be always overlap into first one which use slice index to construct.
Fixes hwaccel decoding after 47d34ba7fb
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Slice address tab only been updated in software decode slice data.
Fixes hwaccel decoding after d725c737fe.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Not a bugfix, but might fix CID1604361 Overflowed constant
Sponsored-by: Sovereign Tech Fund
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
There are two implementations here:
- a generic scalable one processing two columns at a time,
- a specialised processing one (fixed-size) row at a time.
Unsurprisingly, the generic one works out better with smaller widths.
With larger widths, the gains from filling vectors are outweighed by
the extra cost of strided loads and stores. In other words, memory
accesses become the bottleneck.
T-Head C908:
h264_weight2_8_c: 54.5
h264_weight2_8_rvv_i32: 13.7
h264_weight4_8_c: 101.7
h264_weight4_8_rvv_i32: 27.5
h264_weight8_8_c: 197.0
h264_weight8_8_rvv_i32: 75.5
h264_weight16_8_c: 385.0
h264_weight16_8_rvv_i32: 74.2
SpacemiT X60:
h264_weight2_8_c: 48.5
h264_weight2_8_rvv_i32: 8.2
h264_weight4_8_c: 90.7
h264_weight4_8_rvv_i32: 16.5
h264_weight8_8_c: 175.0
h264_weight8_8_rvv_i32: 37.7
h264_weight16_8_c: 342.2
h264_weight16_8_rvv_i32: 66.0
In vtenc_populate_extradata, the cleanup function vtenc_reset should not
be used when no error occurs, otherwise some color information is lost
(#11036).
This patch checks the status code and conducts the correct cleanup.
Signed-off-by: Hao Guan <hguandl@gmail.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
While this *tends* to be faster than plain C, the performance numbers
are all over the place, presuambly due to the conditional character of
the main loop.
Some additional micro-optimisations should be feasible after the
underlying h264_idct_add and h264_idct_dc_add functions are also
implemented. Then it will no longer be necesseray to stricly abide by
the C ABI.
Performance is (unfortunately) the same as with non-MBAFF, since the
hardware under test does not short-circuit vector tail calculations.
(IMO, a generic solution or work-around should be agreed on, rather
than bespoke approaches all over the place.)
T-Head C908 (cycles):
h264_h_loop_filter_luma_8bpp_c: 297.5
h264_h_loop_filter_luma_8bpp_rvv_i32: 369.2
h264_v_loop_filter_luma_8bpp_c: 862.7
h264_v_loop_filter_luma_8bpp_rvv_i32: 199.7
Performance in the horizontal scenario seems worse than scalar. x86
SSE2 and AVX optimisations are similarly affected. This is presumably
caused by unlucky inputs from checkasm, such that the C code
short-circuits almost all filter calculations.
When mDCv support was added, there was a typo in both variable names
and also the MKTAG itself, incorrectly listing it as mDVc. The tag name
stands for Mastering Display Color Volume so mDCv is correct.
Typo originally introduced in 7894904141.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reported-by: Ramiro Polla <ramiro.polla@gmail.com>
When mDCv support was added, there was a typo in both variable names
and also the MKTAG itself, incorrectly listing it as mDVc. The tag name
stands for Mastering Display Color Volume so mDCv is correct. See other
files such as av1dec.c which uses mdcv.
Typo originally introduced in c7a57b0f70.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reported-by: Ramiro Polla <ramiro.polla@gmail.com>
The array in ff_aac_usac_mdst_filt_cur that is passed to that has a size
of 7 elements, not 6 and the code in the function accesses the array at
index 6, which would be out of bounds if the size was actually 6.
Fixes: CID1603196
The checked entity should be alone on one side of the check, this avoids
complex considerations of overflows.
This fixes a issue of bad style in our code and a coverity issue.
Fixes: CID1439654 Untrusted pointer read
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 1107820800 + 1107820800 cannot be represented in type 'int'
Fixes: left shift of 1091059712 by 6 places cannot be represented in type 'int'
Fixes: 69910/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VVC_fuzzer-5162839971528704
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This patch is to make FFHWBaseEncodeContext a standalone component
and avoid getting FFHWBaseEncodeContext from avctx->priv_data.
This patch also removes some unnecessary AVCodecContext arguments.
For receive_packet call, a small wrapper is introduced.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
This implementation is based on D3D12 Video Encoding Spec:
https://microsoft.github.io/DirectX-Specs/d3d/D3D12VideoEncoding.html
Sample command line for transcoding:
ffmpeg.exe -hwaccel d3d12va -hwaccel_output_format d3d12 -i input.mp4
-c:v hevc_d3d12va output.mp4
Signed-off-by: Tong Wu <tong1.wu@intel.com>