Following b8c45bbcbc they contain allocated
unit arrays which will get leaked. These operations were inconsistently
applied and never actually needed (the old uninit left them in the correct
state), so just drop them entirely.
Currently, a fragment's unit array is constantly reallocated during
splitting of a packet. This commit changes this: One can keep the units
array by distinguishing between the number of allocated and the number
of valid units in the units array.
The more units a packet is split into, the bigger the benefit.
So MPEG-2 benefits the most; for a video coming from an NTSC-DVD
(usually 32 units per frame) the average cost of cbs_insert_unit (for a
single unit) went down from 6717 decicycles to 450 decicycles (based
upon 10 runs with 4194304 runs each); if each packet consists of only
one unit, it went down from 2425 to 448; for a H.264 video where most
packets contain nine units, it went from 4431 to 450.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com>
This is in preparation for another patch that will stop needless
reallocations of the unit array.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com>
Improves speed of the testcase by about a factor of 10
Fixes: Timeout
Fixes: 13132/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664190616829952
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Speeds up error cases
Fixes: 13132/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664190616829952
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This speeds up the testcase by a factor of 4
Fixes: Timeout
Fixes: 13100/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WMV2_fuzzer-5767533905313792
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Improves speed from 5.4 to 4.2 seconds
Fixes: 13149/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PGM_fuzzer-5760833622114304
Fixes: 13166/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PGMYUV_fuzzer-5763216322330624
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The frame is not needed that early so obtaining it later avoids
the costly operation in case other checks fail.
Fixes: Timeout (14sec -> 4sec)
Fixes: 13140/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ZMBV_fuzzer-5738330308739072
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Tomas Härdin <tjoppen@acc.umu.se>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: runtime error: signed integer overflow: 2147421862 - -33624063 cannot be represented in type 'int'
Fixes: 12885/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5733516975800320
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Even if NEON would be disabled, the init functions should be built
as they are called as long as ARCH_AARCH64 is set.
These functions are part of a generic DSP subsytem, not tied directly
to one decoder. (They should be built if the vp7 decoder is enabled,
even if the vp8 decoder is disabled.)
Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit b4b27dce95)
This also partially fixes assembling with MS armasm64 (via
gas-preprocessor).
The movrel macro invocations need to pass the offset via a separate
parameter. Mach-o and COFF relocations don't allow a negative
offset to a symbol, which is handled properly if the offset is passed
via the parameter. If no offset parameter is given, the macro
evaluates to something like "adrp x17, subpel_filters-16+(0)", which
older clang versions also fail to parse (the older clang versions
only support one single offset term, although it can be a parenthesis.
Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit 26d7af4c38)
- Clamp ME range to -64..63 (prevents corruption when me_range is too high)
- Allow MV's up to *and including* the positive range limit
- Allow out-of-edge ME by padding the prev buffer with a border of 0's
- Try previous MV before checking the rest (improves speed in some cases)
- More robust logic in code - ensure *mx,*my,*xored are updated together
- Improve block choices by counting 0-bytes in the entropy score
- Make histogram use uint16_t type, to allow byte counts from 16*16
(current block size) up to 255*255 (maximum allowed 8bpp block size)
- Make sure score table is big enough for a full block's worth of bytes
- Calculate *xored without using code in inner loop
Fixes: Out of array access
Fixes: 13090/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MPEG4_fuzzer-5408668986638336
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Kieran Kunhya <kierank@obe.tv>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is the equivalent change for cuviddec after the previous change
for nvdec. I made similar changes to the copying routines to handle
pixel formats in a more generic way.
Note that unlike with nvdec, there is no confusion about the ability
of a codec to output 444 formats. This is because the cuvid parser is
used, meaning that 444 JPEG content is still indicated as using a 420
output format.
With the introduction of HEVC 444 support, we technically have two
codecs that can handle 444 - HEVC and MJPEG. In the case of MJPEG,
it can decode, but can only output one of the semi-planar formats.
That means we need additional logic to decide whether to use a
444 output format or not.
The latest generation video decoder on the Turing chips supports
decoding HEVC 4:4:4. Supporting this is relatively straight-forward;
we need to account for the different chroma format and pick the
right output and sw formats at the right times.
There was one bug which was the hard-coded assumption that the
first chroma plane would be half-height; I fixed this to use the
actual shift value on the plane.
We also need to pass the SPS and PPS range extension flags.
We need all the flags to be exposed to be able to pass them on to
HW decoders. I did not attempt to nuance any of the warnings about
flags being unsupported as there's no way, at the point we extract
flags, to say whether an HW decoder is being used.