Current code is written around the "simple" decode API's limitation that
a single input packet (AU/coded frame) triggers the output of at most
one output frame. However the spec contains two cases where a coded
frame may cause multiple frames to be output (cf. C.5.2.2.2):
* start of a new sequence
* overflowing sps_max_dec_pic_buffering
The decoder currently contains rather convoluted logic to handle these
cases:
* decode/output/per-frame sequence counters,
* HEVC_FRAME_FLAG_BUMPING
* ff_hevc_bump_frame()
* special clauses in ff_hevc_output_frame()
However, with the receive_frame() API none of that is necessary, as we
can just output multiple frames at once. Previously added ContainerFifo
allows that to be done in a straightforward and efficient manner.
It provides a FIFO for "container" objects like AVFrame/AVPacket and
features an integrated FFRefStructPool-based pool to avoid allocating an
freeing them repeatedly.
Fixes: use of uninitialized value
Fixes: 70929/clusterfuzz-testcase-minimized-ffmpeg_dem_IAMF_fuzzer-5931276639469568
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: left shift of 1 by 31 places cannot be represented in type 'int'
Fixes: 70726/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-6149928703819776
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Found by fuzzer.
Bug: https://crbug.com/356720789
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This results in an unnecessary ~800k allocation with H.264. A
nearby callsite uses avcodec_is_open() to avoid this, so do the
same when exiting avformat_find_stream_info().
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Usually the MediaCodec context will be released immediately, or it needs to stay
alive due to existing hardware buffers.
However we can free resources early in the case of
hw_buffer_count == 0 && refcount > 1, which can be reproduced by keeping frames
referenced after flushing and closing. mpv currently behaves like this.
Signed-off-by: sfan5 <sfan5@live.de>
Signed-off-by: Matthieu Bouron <matthieu.bouron@gmail.com>
Without resetting it, if there was a previous set of varstreams with
subtitles, it would subtract from all the streams, leading to chaos and
segfaults when trying to access for example stream -1.
In the absence of an RPU header, we can consult the colorspace tags to
make a more informed guess about whether we're looking at profile 5 or
profile 8.
This implements limited metadata compression. To be a bit more lenient,
we try and re-order the static extension blocks when testing for an
exact match.
For sanity, and to avoid producing bitstreams we couldn't ourselves
decode, we don't accept partial matches - if some extension blocks
change while others remain static, compression is disabled for the
entire frame.
This shouldn't be an issue in practice because static extension blocks
are stated to remain constant throughout the entire sequence.
This implements the limited DM metadata compression scheme described in
chapter 9 of the dolby vision bitstream specification.
The spec is a bit unclear about how to handle the presence of static
metadata inside compressed frames; in that it doesn't explicitly forbid
an encoder from repeating redundant metadata. In theory, we would need
to detect this case and then strip the corresponding duplicate metadata
from the existing set of static metadata. However, this is difficult to
implement - esspecially for the case of metadata blocks which may be
internally repeated (e.g. level 10).
That said, the spec states outright that static metadata should be
constant throughout the entire sequence, so a sane bitstream should not
have any static metadata values changing from one frame to the next (at
least up to a keyframe boundary), and therefore they should never be
present in compressed frames. As a consequence, it makes sense to treat
this as an error state regardless. (Ignoring them by default, or
erroring if either AV_EF_EXPLODE or AV_EF_AGGRESSIVE are set)
I was not able to find such samples in the wild (outside of artificially
produced test cases for this exact scenario), so I don't think we need
to worry about it until somebody produces one.
Keyframes must reset the metadata compression state, so we need to
also signal this at rpu generation time.
Default to uncompressed, because encoders cannot generally know if
a given frame will be a keyframe before they finish encoding, but also
cannot retroactively attach the RPU. (Within the confines of current
APIs)
And move the choice of desired container to `flags`. This is needed to
handle differing API requirements (e.g. libx265 requires the NAL RBSP,
but CBS BSF requires the unescaped bytes).
Limited mode can only ever maintain a single VDR RPU reference, and
furthermore requires vdr_rpu_id == 0. So in practice, it will only ever
use VDR RPU slot 0. All remaining slots get flushed in this case, to
avoid leaking partial state.
As the comment implies, DOVIContext.ext_blocks should also reflect the
current state after ff_dovi_rpu_generate().
Fluff for now, but will be needed once we start implementing metadata
compression for extension blocks as well.
Replace the manually specified chroma location by one using standard
notation, arbitrarily "bottomleft" as it is a less common path.
Required if we want to phase out the use of manual chroma locations.
The current logic hard-coded a check for v_sub == 1. We can extend this
logic slightly to cover the case of interlaced 4:1:0 (which has v_sub ==
2).
Here is a diagram explaining this scenario (with center-siting):
a a a a a a a a
b b b b b b b b
X X
a a a a a a a a
b b b b b b b b
a a a a a a a a
b b b b b b b b
Y Y
a a a a a a a a
b b b b b b b b
a = even luma rows
b = odd luma rows
X = even chroma sample
Y = odd chroma sample
In progressive mode, the chroma samples sit at (384, 384) respectively.
Relative to the 8x4 grid of even luma samples (a), the X sample sits at:
h_chr_pos = 384
v_chr_pos = 192
Relative to the 8x4 grid of odd luma samples (b), the Y sample sits at:
h_chr_pos = 384
v_chr_pos = 576
The new code calculates the correct values in all circumstances.
Currently, this just functions as a more principled and user-friendly
replacement for the (undocumented and hard to use) *_chr_pos fields.
However, the goal is to automatically infer these values from the input
frames' chroma location, and deprecate the manual use of *_chr_pos
altogether. (Indeed, my plans for an swscale replacement will most
likely also end up limiting the set of legal chroma locations to those
permissible by AVFrame properties)