In the absence of an RPU header, we can consult the colorspace tags to
make a more informed guess about whether we're looking at profile 5 or
profile 8.
This implements limited metadata compression. To be a bit more lenient,
we try to reorder the static extension blocks when testing for an
exact match.
For sanity, and to avoid producing bitstreams we couldn't ourselves
decode, we don't accept partial matches - if some extension blocks
change while others remain static, compression is disabled for the
entire frame.
This shouldn't be an issue in practice because static extension blocks
are stated to remain constant throughout the entire sequence.
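A minimal sketch of the all-or-nothing, order-insensitive comparison
described above. `ExtBlock`, `MAX_BLOCKS` and both function names are
hypothetical stand-ins for illustration, not the types or helpers used in
the actual code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_BLOCKS 32 /* assumed upper bound, chosen for this sketch only */

/* Hypothetical stand-in for an extension block: level plus raw payload. */
typedef struct ExtBlock {
    uint8_t level;
    uint8_t payload[8];
} ExtBlock;

static bool ext_block_equal(const ExtBlock *a, const ExtBlock *b)
{
    return a->level == b->level &&
           !memcmp(a->payload, b->payload, sizeof(a->payload));
}

/*
 * Returns true only if `cur` holds exactly the same blocks as `prev`,
 * in any order. Any difference in count or content is a mismatch;
 * there is deliberately no notion of a partial match.
 */
static bool ext_blocks_match(const ExtBlock *prev, size_t nb_prev,
                             const ExtBlock *cur, size_t nb_cur)
{
    bool used[MAX_BLOCKS] = { false };

    if (nb_prev != nb_cur || nb_cur > MAX_BLOCKS)
        return false;

    for (size_t i = 0; i < nb_cur; i++) {
        size_t j;
        for (j = 0; j < nb_prev; j++) {
            if (!used[j] && ext_block_equal(&cur[i], &prev[j])) {
                used[j] = true; /* each previous block matches only once */
                break;
            }
        }
        if (j == nb_prev)
            return false; /* no counterpart: no compression for this frame */
    }
    return true;
}
```

A caller would enable compression for the frame only when
`ext_blocks_match()` returns true, and code every block otherwise.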
Keyframes must reset the metadata compression state, so we also need to
signal this at RPU generation time.
Default to uncompressed, because encoders generally cannot know whether
a given frame will be a keyframe before they finish encoding, but also
cannot retroactively attach the RPU (within the confines of current
APIs).
Also move the choice of desired container to `flags`. This is needed to
handle differing API requirements (e.g. libx265 requires the NAL RBSP,
but CBS BSF requires the unescaped bytes).
Limited mode can only ever maintain a single VDR RPU reference, and
furthermore requires vdr_rpu_id == 0. So in practice, it will only ever
use VDR RPU slot 0. All remaining slots get flushed in this case, to
avoid leaking partial state.
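A minimal sketch of that flush, assuming a hypothetical `VdrState` with a
fixed array of heap-allocated slots. The slot count, the type and the use
of plain `free()` are assumptions made for illustration; the real code
manages its references differently:

```c
#include <stdlib.h>

#define NB_VDR_SLOTS 16 /* assumed slot count for this sketch */

/* Hypothetical container for the per-slot VDR RPU references. */
typedef struct VdrState {
    void *vdr[NB_VDR_SLOTS];
} VdrState;

/* In limited mode only slot 0 may stay populated; drop everything else. */
static void flush_unused_slots(VdrState *s)
{
    for (int i = 1; i < NB_VDR_SLOTS; i++) {
        free(s->vdr[i]); /* free(NULL) is a no-op, so empty slots are fine */
        s->vdr[i] = NULL;
    }
}
```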
As the comment implies, DOVIContext.ext_blocks should also reflect the
current state after ff_dovi_rpu_generate().
Fluff for now, but will be needed once we start implementing metadata
compression for extension blocks as well.
These two fields are coded together into a single 16-bit integer, with the
upper 8 bits for ext_mapping_idc and the lower 8 bits for el_bit_depth_minus8.
Furthermore, ext_mapping_idc has two components: the upper 3 bits and the
lower 5 bits.
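The bit layout can be sketched as follows. The function and parameter
names are hypothetical and chosen for illustration; only the bit
positions come from the description above:

```c
#include <stdint.h>

/* Split the coded 16-bit value into its three components. */
static void unpack_el_field(uint16_t coded, uint8_t *idc_hi3,
                            uint8_t *idc_lo5, uint8_t *el_bit_depth_minus8)
{
    uint8_t ext_mapping_idc = (uint8_t)(coded >> 8); /* upper 8 bits */

    *idc_hi3             = ext_mapping_idc >> 5;     /* upper 3 bits of idc */
    *idc_lo5             = ext_mapping_idc & 0x1f;   /* lower 5 bits of idc */
    *el_bit_depth_minus8 = (uint8_t)(coded & 0xff);  /* lower 8 bits */
}

/* Inverse operation: rebuild the coded 16-bit value. */
static uint16_t pack_el_field(uint8_t idc_hi3, uint8_t idc_lo5,
                              uint8_t el_bit_depth_minus8)
{
    uint8_t ext_mapping_idc = (uint8_t)(((idc_hi3 & 0x7) << 5) |
                                        (idc_lo5 & 0x1f));

    return (uint16_t)((ext_mapping_idc << 8) | el_bit_depth_minus8);
}
```

For example, a coded value of 0xA302 splits into an upper component of 5,
a lower component of 3 and an el_bit_depth_minus8 of 2.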
Co-authored-by: Niklas Haas <git@haasn.dev>
Signed-off-by: Niklas Haas <git@haasn.dev>
Despite the suggestive size limits, this metadata ID has nothing to do
with the VDR metadata ID used for the data mappings. Actually, the
specification leaves them wholly unexplained, other than acknowledging
their existence. Must be some secret Dolby sauce. They're not even
involved in DM metadata compression, which is handled using an entirely
separate ID.
That leaves us with a lack of anything sensible to do with these IDs.
Since we unfortunately only expose one `dm_metadata_id` field to the
user, just ensure that they match, which appears to always be the case
in practice. If somebody ever hits this error, I would very much like
to see the triggering file.
When this is 0, the metadata is explicitly inferred to take the default
values stated in the spec, rather than being inferred from the previous
frame's values.
Likewise, when encoding, instead of checking if the value changed since
the last frame, we need to check if it differs from the default.
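A minimal sketch of both directions, using a hypothetical `Meta` struct
and a `present` flag. The struct, its fields and the default values are
placeholders for illustration and are not the defaults from the spec:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical metadata struct; the real one has many more fields. */
typedef struct Meta {
    uint16_t a;
    uint16_t b;
} Meta;

/* Placeholder defaults, NOT the values stated in the spec. */
static const Meta meta_default = { 0, 0 };

/* Decoding: an absent block means "use defaults", not "keep previous". */
static void decode_meta(Meta *out, bool present, const Meta *coded)
{
    *out = present ? *coded : meta_default;
}

/* Encoding: the block must be coded iff it differs from the defaults. */
static bool meta_needs_coding(const Meta *cur)
{
    return cur->a != meta_default.a || cur->b != meta_default.b;
}
```

Note that neither function looks at the previous frame's values at all.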
The code as written was wrong. In the spec, these fields are treated
merely as plain integers in the range 0 to 4095. The only difference
between L2 and L8 is that L2.ms_weight also accepts an additional value
of -1, hence the extra sign bit. While it's likely that these are still
shifted integers in disguise, since all real-world samples seem to use
a value of 2048 here, the offset used in the code was wrong.
In addition, because the l8.ms_weight struct member is unsigned, these
wrong shifting semantics ended up overflowing the field, leading to
undefined behavior when transcoding. Fortunately, the damage was
relatively contained, because it only corrupts the coding of this
field, which is ignored by all implementations I have seen.
Code is taken from dovi_rpudec
Fixes: CID1596604 Uninitialized scalar variable
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This function takes a decoded AVDOVIMetadata struct and turns it back
into a binary RPU. Verified using existing tools, and matches the
bitstream in official reference files.
I decided to just roll the EMDF and NAL encapsulation into this function
because the end user would otherwise need to do it anyway.
We need to set up the configuration struct appropriately based on the
codec type, colorspace metadata, and presence/absence of an EL (though
we currently don't support an EL).
When present, we use the signalled RPU data header to help infer (and
validate) the right values.
Behavior can be controlled by a new DOVIContext.enable flag.