Since we only know whether a NAL unit corresponds to a new field after
parsing the slice header, this requires reorganizing the calls to slice
parsing, per-slice/field/frame init and actual decoding.
In the previous code, the function for slice header decoding also
immediately started a new field/frame as necessary, so any slices
already queued for decoding would no longer be decodable.
After this patch, we first parse the slice header, and if we determine
that a new field needs to be started we decode all the queued slices.
This function's purpose is not very well defined. Currently it does two
(only marginally related) things: selecting the next output frame and
calling ff_thread_finish_setup() for frame threading. The first of those
more properly belongs under field_start(), while the second can be
called directly from decode_nal_units().
Move the NAL unit types into it. This will allow to stop including the
whole decoder-specific h264dec.h in some code that is unrelated to the
decoder and only needs some enum values.
While the value of those variables will be constant for the whole frame,
they are only used in two functions called from slice header decoding.
Moving them to the per-slice context allows us to make the H264Context
passed to slice_header_parse() constant.
Replace the decoder-global nal_unit_type/nal_ref_idc variables with the
per-NAL ones. The decoder-global ones still cannot be removed because
they are used by hwaccels.
Do it right before the MMCOs are applied to the DPB. This will allow
moving the frame_start() call out of the slice header parsing, since
generating the implicit MMCOs needs to be done after frame_start().
They are stored in the slice header, so technically they are per-slice
(though they must be the same in every slice). This will simplify the
following commits.
This will allow postponing the reference list construction (and by
consequence some other functions, like frame_start) until the whole
slice header has been parsed.
In such a case, decode the MBs in parallel without the loop filter, then
execute the filter serially.
The ref2frm array was previously moved to H264SliceContext. That was
incorrect, since it applies to all the slices and should properly be in
H264Context (it did not actually break decoding, since this distinction
only becomes relevant with slice threading and deblocking_filter=1,
which was not implemented before this commit). The ref2frm array is thus
moved back to H264Context.
It is always unconditionally initialized in decode_postinit() and then
immediately used in one place further below. All the other places where
it is accessed are just useless fluff.
Make the SPS/PPS parsing independent of the H264Context, to allow
decoupling the parser from the decoder. The change is modelled after the
one done earlier for HEVC.
Move the dequant buffers to the PPS to avoid complex checks whether they
changed and an expensive copy for frame threads.
According to the spec, the reference list for a slice should be
constructed by first generating an initial (what we now call "default")
reference list and then optionally applying modifications to it.
Our code has an optimization where the initial reference list is
constructed for the first inter slice and then rebuilt for other slices
if needed. This, however, adds complexity to the code, requires an extra
2.5kB array in the codec context and there is no reason to think that it
has any positive effect on performance. Therefore, simplify the code by
generating the reference list from scratch for each slice.
This makes the h.264 decoder threadsafe to initialize.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>