It involves less allocations and therefore has the nice property
that deriving a reference from a reference can't fail.
This allows for considerable simplifications in
ff_h264_(ref|replace)_picture().
Switching to the RefStruct API also allows to make H264Picture
smaller, because some AVBufferRef* pointers could be removed
without replacement.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unnecessary since the removal of non-thread-safe callbacks
in e0786a8eeb. Since then, the
AVCodecContext has only been used as logcontext.
Removing ff_thread_release_buffer() allowed to remove AVCodecContext*
parameters from several other functions (not only unref functions,
but also e.g. ff_h264_ref_picture() which calls ff_h264_unref_picture()
on error).
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Forgotten in 0eb399ac39.
While just at it, also use a forward declaration.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and therefore error checks: Syncing
hwaccel_picture_private across threads can't fail any more.
Also gets rid of an unnecessary pointer in structures and
in the parameter list of ff_hwaccel_frame_priv_alloc().
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Lynne <dev@lynne.ee>
Tested-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and error checks for these allocations;
e.g. syncing buffers across threads can't fail any more
and needn't be checked. It also avoids having to keep
H264ParamSets.pps and H264ParamSets.pps_ref and PPS.sps
and PPS.sps_ref in sync and gets rid of casts and indirections.
(The removal of these checks and the syncing code even more
than offset the additional code for RefStruct.)
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When using multi-threaded decoding, every decoding thread
has its own DBP consisting of H264Pictures and each of these
points to its own AVFrames. They are synced during
update_thread_context via av_frame_ref() and therefore
the threads actually decoding (as well as all the others)
must not modify any field that is copied by av_frame_ref()
after ff_thread_finish_setup().
Yet this is exactly what happens when an error occurs
during decoding and the AVFrame's decode_error_flags are updated.
Given that these errors only become apparent during decoding,
this can't be set before ff_thread_finish_setup() without
defeating the point of frame-threading; in practice,
this meant that the decoder did not set these flags correctly
in case frame-threading was in use. (This means that e.g.
the ffmpeg cli tool fails to output its "corrupt decoded frame"
message in a nondeterministic fashion.)
This commit fixes this by adding a new H264Picture field
that is actually propagated across threads; the field
is an AVBufferRef* whose data is an atomic_int; it is
atomic in order to allow multiple threads to update it
concurrently and not to provide synchronization
between the threads setting the field and the thread
ultimately returning the AVFrame.
This unfortunately has the overhead of one allocation
per H264Picture (both the original one as well as
creating a reference to an existing one), even in case
of no errors. In order to mitigate this, an AVBufferPool
has been used and only if frame-threading is actually
in use. This expense will be removed as soon as
a proper API for refcounted objects (not based upon
AVBuffer) is in place.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Most of the inline functions in h264dec.h are only used
by h264_cavlc.c and h264_cabac.c. Therefore move them
to the common header for these two, namely h264_mvpred.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Modifying the main context by a slice thread is racy;
so constify the pointer to it in H264SliceContext.
The code itself was already compatible with this change.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Since 7be2d2a70c only one context
is used. Moving it to H264Context from H264SliceContext is natural.
One could access the ERContext from H264SliceContext
via H264SliceContext.h264->er; yet H264SliceContext.h264 should
naturally be const-qualified, because slice threads should not
modify the main context. The ERContext is an exception
to this, as ff_er_add_slice() is intended to be called simultaneously
by multiple threads. And for this one needs a pointer whose
pointed-to-type is not const-qualified.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is only used by the H.264 decoder (as well as the dirac decoder,
which already uses a local copy).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The majority of frame-threaded decoders (mainly the intra-only)
need exactly one part of ThreadFrame: The AVFrame. They don't
need the owners nor the progress, yet they had to use it because
ff_thread_(get|release)_buffer() requires it.
This commit changes this and makes these functions work with ordinary
AVFrames; the decoders that need the extra fields for progress
use ff_thread_(get|release)_ext_buffer() which work exactly
as ff_thread_(get|release)_buffer() used to do.
This also avoids some unnecessary allocations of progress AVBuffers,
namely for H.264 and HEVC film grain frames: These frames are not
used for synchronization and therefore don't need a ThreadFrame.
Also move the ThreadFrame structure as well as ff_thread_ref_frame()
to threadframe.h, the header for frame-threaded decoders with
inter-frame dependencies.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
E.g. the inclusion of parser.h comes from a time when
the parser used a H264Context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The only thing that is actually used directly from there is the
PART_NOT_AVAILABLE constant, which can be moved to h264pred.h.
Otherwise it only depends on other indirectly included headers.
Because we need access to ref frames without film grain applied, we have
to add an extra AVFrame to H264Picture to avoid messing with the
original. This requires some amount of overhead to make the reference
moves work out, but it allows us to benefit from frame multithreading
for film grain application "for free".
Unfortunately, this approach requires twice as much RAM to be constantly
allocated for ref frames, due to the need for an extra buffer per
H264Picture. In theory, we could get away with freeing up this memory as
soon as it's no longer needed (since ref frames do not need film grain
buffers any longer), but trying to call ff_thread_release_buffer() from
output_frame() conflicts with possible later accesses to that same frame
and I'm not sure how to synchronize that well.
Tested on all three cases of (no fg), (fg present but exported) and (fg
present and not exported), with and without threading.
Co-authored-by: James Almer <jamrial@gmail.com>
Signed-off-by: Niklas Haas <git@haasn.dev>
Signed-off-by: James Almer <jamrial@gmail.com>
From SMPTE RDD 5-2006, the grain seed is to be computed from the
following definition of `pic_offset`:
> When decoding H.264 | MPEG-4 AVC bitstreams, pic_offset is defined as
> follows:
> - pic_offset = PicOrderCnt(CurrPic) + (PicOrderCnt_offset << 5)
> where:
> - PicOrderCnt(CurrPic) is the picture order count of the current frame,
> which shall be derived from [the video stream].
>
> - PicOrderCnt_offset is set to idr_pic_id on IDR frames. idr_pic_id
> shall be read from the slice header of [the video stream]. On non-IDR I
> frames, PicOrderCnt_offset is set to 0. A frame shall be classified as I
> frame when all its slices are I slices, which may be optionally
> designated by setting primary_pic_type to 0 in the access delimiter NAL
> unit. Otherwise, PicOrderCnt_offset it not changed. PicOrderCnt_offset is
> updated in decoding order.
Co-authored-by: James Almer <jamrial@gmail.com>
Signed-off-by: Niklas Haas <git@haasn.dev>
Signed-off-by: James Almer <jamrial@gmail.com>
Will remove unnecessary allocations when both src and dst picture contain
references to the same buffers.
Signed-off-by: James Almer <jamrial@gmail.com>
Once removed in 4a9bab3db0.
Introduced again in b25cd7540e.
Signed-off-by: Linjie Fu <fulinjie@zju.edu.cn>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Do not use the one in the SEI directly as that is reset at certain
points.
Inspired by patches from Michael Niedermayer <michaelni@gmx.at> and
Anton Mitrofanov <BugMaster@narod.ru>.
CC: libav-stable@libav.org
This merges commit 4fded0480f from libav,
originally written by Anton Khirnov and skipped in
fc63d5ceb3.
libavcodec/h264_slice.c | 20 +++++++++++++-------
libavcodec/h264dec.c | 3 +++
libavcodec/h264dec.h | 5 +++++
3 files changed, 21 insertions(+), 7 deletions(-)
This treats the case of no slices like no frames which it basically is.
The field is added to the context as other nal related fields are also there
and passing the has_slices field per *arguments is ugly and not consistent
Found-by: ubitux
Approved-by: ubitux
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The current condition can trigger in cases where it shouldn't, with
unexpected results.
Make sure that:
- container cropping is really based on the original dimensions from the
caller
- those dimenions are discarded on size change
The code is still quite hacky and eventually should be deprecated and
removed, with the decision about which cropping is used delegated to the
caller.
Calling ff_h264_field_end() when the per-field state is not properly
initialized leads to all kinds of undefined behaviour.
CC: libav-stable@libav.org
Bug-Id: 977 978 992
The parser changes have lost the support for the needed padding, this adds it back
Fixes out of array reads
Fixes: 03ea21d271abc8acf428d42ace51d8b4/asan_heap-oob_3358eef_5692_16f0cc01ab5225e9ce591659e5c20e35.mkv
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Since we only know whether a NAL unit corresponds to a new field after
parsing the slice header, this requires reorganizing the calls to slice
parsing, per-slice/field/frame init and actual decoding.
In the previous code, the function for slice header decoding also
immediately started a new field/frame as necessary, so any slices
already queued for decoding would no longer be decodable.
After this patch, we first parse the slice header, and if we determine
that a new field needs to be started we decode all the queued slices.
This function's purpose is not very well defined. Currently it does two
(only marginally related) things: selecting the next output frame and
calling ff_thread_finish_setup() for frame threading. The first of those
more properly belongs under field_start(), while the second can be
called directly from decode_nal_units().