Reorganize the code such that the frame threading code does not call the
decoders directly, but instead calls back into the generic decoding
code. This avoids duplicating the logic that wraps the decoder
invocation and allows receive_frame()-based decoders to use frame
threading.
Further work by Timo Rothenpieler <timo@rothenpieler.org>.
Given that MPVPictures are already directly shared between threads
in case of frame-threaded decoding, one can simply use it to
pass decoding progress information between threads. This allows
to avoid one level of indirection; it also means avoids allocations
(of the ThreadFrameProgress structure) in case of frame-threading
and indeed makes ff_thread_release_ext_buffer() decoder-only
(actually, H.264-decoder-only).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This commit is the analog of 3f11eac757
for decoding: It sets the AV_FRAME_FLAG_KEY and (for video decoders)
also pict_type to AV_PICTURE_TYPE_I. It furthermore stops setting
audio frames as always being key frames -- it is wrong for e.g.
TrueHD/MLP. The latter also affects TAK and DFPWM.
The change already improves output for several decoders where
it has been forgotten to set e.g. pict_type like speedhq, wnv1
or tiff. The latter is the reason for the change to the exif-image-tiff
FATE test reference file.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Happens in the mov-elst-ends-betn-b-and-i and mov-ibi-elst-starts-b
FATE tests with frame-threading.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Before commit f025b8e110,
every frame-threaded decoder used ThreadFrames, even when
they did not have any inter-frame dependencies at all.
In order to distinguish those decoders that need the AVBuffer
for progress communication from those that do not (to avoid
the allocation for the latter), the former decoders were marked
with the FF_CODEC_CAP_ALLOCATE_PROGRESS internal codec cap.
Yet distinguishing these two can be done in a more natural way:
Don't use ThreadFrames when not needed and split ff_thread_get_buffer()
into a core function that calls the user's get_buffer2 callback
and a wrapper around it that also allocates the progress AVBuffer.
This has been done in 02220b88fc
and since that commit the ALLOCATE_PROGRESS cap was nearly redundant.
The only exception was WebP and VP8. WebP can contain VP8
and uses the VP8 decoder directly (i.e. they share the same
AVCodecContext). Both decoders are frame-threaded and VP8
has inter-frame dependencies (in general, not in valid WebP)
and therefore the ALLOCATE_PROGRESS cap. In order to avoid
allocating progress in case of a frame-threaded WebP decoder
the cap and the check for the cap has been kept in place.
Yet now the VP8 decoder has been switched to use ProgressFrames
and therefore there is just no reason any more for this check
and the cap. This commit therefore removes both.
Also change the value of FF_CODEC_CAP_USES_PROGRESSFRAMES
to leave no gaps.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is useful when the lifetime of the object to be shared
is the whole decoding process as it allows to avoid having
to sync them every time in update_thread_context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Frame-threaded decoders with inter-frame dependencies
use the ThreadFrame API for syncing. It works as follows:
During init each thread allocates an AVFrame for every
ThreadFrame.
Thread A reads the header of its packet and allocates
a buffer for an AVFrame with ff_thread_get_ext_buffer()
(which also allocates a small structure that is shared
with other references to this frame) and sets its fields,
including side data. Then said thread calls ff_thread_finish_setup().
From that moment onward it is not allowed to change any
of the AVFrame fields at all any more, but it may change
fields which are an indirection away, like the content of
AVFrame.data or already existing side data.
After thread A has called ff_thread_finish_setup(),
another thread (the user one) calls the codec's update_thread_context
callback which in turn calls ff_thread_ref_frame() which
calls av_frame_ref() which reads every field of A's
AVFrame; hence the above restriction on modifications
of the AVFrame (as any modification of the AVFrame by A after
ff_thread_finish_setup() would be a data race). Of course,
this av_frame_ref() also incurs allocations and therefore
needs to be checked. ff_thread_ref_frame() also references
the small structure used for communicating progress.
This av_frame_ref() makes it awkward to propagate values that
only become known during decoding to later threads (in case of
frame reordering or other mechanisms of delayed output (like
show-existing-frames) it's not the decoding thread, but a later
thread that returns the AVFrame). E.g. for VP9 when exporting video
encoding parameters as side data the number of blocks only
becomes known during decoding, so one can't allocate the side data
before ff_thread_finish_setup(). It is currently being done afterwards
and this leads to a data race in the vp9-encparams test when using
frame-threading. Returning decode_error_flags is also complicated
by this.
To perform this exchange a buffer shared between the references
is needed (notice that simply giving the later threads a pointer
to the original AVFrame does not work, because said AVFrame will
be reused lateron when thread A decodes the next packet given to it).
One could extend the buffer already used for progress for this
or use a new one (requiring yet another allocation), yet both
of these approaches have the drawback of being unnatural, ugly
and requiring quite a lot of ad-hoc code. E.g. in case of the VP9
side data mentioned above one could not simply use the helper
that allocates and adds the side data to an AVFrame in one go.
The ProgressFrame API meanwhile offers a different solution to all
of this. It is based around the idea that the most natural
shared object for sharing information about an AVFrame between
decoding threads is the AVFrame itself. To actually implement this
the AVFrame needs to be reference counted. This is achieved by
putting a (ownership) pointer into a shared (and opaque) structure
that is managed by the RefStruct API and which also contains
the stuff necessary for progress reporting.
The users get a pointer to this AVFrame with the understanding
that the owner may set all the fields until it has indicated
that it has finished decoding this AVFrame; then the users are
allowed to read everything. Every decoder may of course employ
a different contract than the one outlined above.
Given that there is no underlying av_frame_ref(), creating
references to a ProgressFrame can't fail. Only
ff_thread_progress_get_buffer() can fail, but given that
it will replace calls to ff_thread_get_ext_buffer() it is
at places where errors are already expected and properly
taken care of.
The ProgressFrames are empty (i.e. the AVFrame pointer is NULL
and the AVFrames are not allocated during init at all)
while not being in use; ff_thread_progress_get_buffer() both
sets up the actual ProgressFrame and already calls
ff_thread_get_buffer(). So instead of checking for
ThreadFrame.f->data[0] or ThreadFrame.f->buf[0] being NULL
for "this reference frame is non-existing" one should check for
ProgressFrame.f.
This also implies that one can only set AVFrame properties
after having allocated the buffer. This restriction is not deep:
if it becomes onerous for any codec, ff_thread_progress_get_buffer()
can be broken up. The user would then have to get a buffer
himself.
In order to avoid unnecessary allocations, the shared structure
is pooled, so that both the structure as well as the AVFrame
itself are reused. This means that there won't be lots of
unnecessary allocations in case of non-frame-threaded decoding.
It might even turn out to have fewer than the current code
(the current code allocates AVFrames for every DPB slot, but
these are often excessively large and not completely used;
the new code allocates them on demand). Pooling relies on the
reset function of the RefStruct pool API, it would be impossible
to implement with the AVBufferPool API.
Finally, ProgressFrames have no notion of owner; they are built
on top of the ThreadProgress API which also lacks such a concept.
Instead every ThreadProgress and every ProgressFrame contains
its own mutex and condition variable, making it completely independent
of pthread_frame.c. Just like the ThreadFrame API it is simply
presumed that only the actual owner/producer of a frame reports
progress on said frame.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unnecessary since the removal of non-thread-safe callbacks
in e0786a8eeb. Since then, the
AVCodecContext has only been used as logcontext.
Removing ff_thread_release_buffer() allowed to remove AVCodecContext*
parameters from several other functions (not only unref functions,
but also e.g. ff_h264_ref_picture() which calls ff_h264_unref_picture()
on error).
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and error checks and allows to remove
cleanup code for earlier allocations. Also avoids casts
and indirections.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and frees and error checks for said allocations;
also avoids a few indirections and casts.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Since 432adca5fe no decoder
looks at the slice_count and slice_offset fields at all,
so there is no reason to synchronize them between the worker
and the user thread.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The '<' in ///< says that the comment pertains to the previous item
on the same line as the ///<.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This commit is the AVHWAccel analogue of commit
20f972701806be20a77f808db332d9489343bb78: It moves the private fields
of AVHWAccel to a new struct FFHWAccel extending AVHWAccel
in an internal header (namely hwaccel_internal.h).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
libavcodec/hwconfig.h currently contains HWACCEL_CAP_* flags
as well as the definition of AVCodecHWConfigInternal and some
macros to create them.
The users of these two are nearly disjoint: The flags are used
by files providing AVHWAccels whereas AVCodecHWConfigInternal
is used by files providing codecs (for FFCodec.hw_configs).
This patch therefore moves these flags to a new file hwaccel_internal.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The issue is that with a threadsafe hwaccel and multiple enabled
frame threads, hwaccel->uninit() is never called.
Previously, the function was guaranteed to never have any threads
with hwaccel contexts, so it never bothered to uninit any.
For encoding, this field is entirely redundant with
AVCodecContext.framerate.
For decoding, this field is entirely redundant with
AV_CODEC_PROP_FIELDS.
Frame counters can overflow relatively easily (INT_MAX number of frames is
slightly more than 1 year for 60 fps content), so make sure we use 64 bit
values for them.
Also deprecate the old 32 bit frame_number attribute.
Signed-off-by: Marton Balint <cus@passwd.hu>
Making it point to the input packet results in different behavior during flush,
where its contents will be that of an empty packet instead of containing the
props from the last input packet fed to the decoder.
After this change, decoding with more than one thread will shield the same
results as using a single thread.
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes assertion failures after avcodec_flush_buffers(), where
stashed hwaccel state is present, but prev_thread is NULL.
Found-by: Wang Bin <wbsecg1@gmail.com>
This state is not refcounted, so make sure it always has a well-defined
owner.
Remove the block added in 091341f2ab, as
this commit also solves that issue in a more general way.
Currently, ff_frame_thread_free() uses the last worker thread
to updates the first worker thread via update_context_from_thread()
immediately before freeing all these worker threads. This is
a remnant of the time in which the first worker was special.
(E.g. the first worker shared its AVCodecInternal with the public
AVCodecContext.)
But these times are over (none of the uses of is_copy matter
for ff_frame_thread_free()); nowadays the only thing that
update_context_from_thread() does is referencing a few
buffers/frames and replacing them with other references instead.
These new references will then be freed immediately thereafter
when the first worker thread is freed. Ensuring that the code is
free of double-frees is achieved by using reference-counted structures
(or in case of AVChannelLayouts: by giving each worker its own copy).
Some archaeology:
a) Updating the first worker thread from the last one used
has been done since frame-threading was added in
37b00b47cb.
b) The precursor to ff_mpv_common_end() checked for is_copy
before freeing pictures (i.e. it only freed them for the first
worker thread).
c) Commits c2dfb1e37c and
e33811bd26 modified the
update_thread_context function of the H.264 decoder
so that it could fail before calling ff_mpeg_update_thread_context().
d) This led to a double free/an assert violation with a H.264
sample for which ff_mpeg_update_thread_context() is not reached
for the final update_context_from_thread(). Commit
a6e4796fbf added code to fix this
sample.
e) This issue was fixed (even with the last mentioned commit reverted)
when the H.264 decoder was deMpegEncContextized in commit
b7fe35c9e5 (merging commit
2c54155407).
f) mpegvideo.c stopped using is_copy when it was switched to refcounted
frames in 759001c534.
g) 1f4cf92cfb removed the init_thread_copy
callbacks; now no FFCodec.close callback checks for is_copy at all
any more.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible, because every given FFCodec has to implement
exactly one of these. Doing so decreases sizeof(FFCodec) and
therefore decreases the size of the binary.
Notice that in case of position-independent code the decrease
is in .data.rel.ro, so that this translates to decreased
memory consumption.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, codec.h contains both public and private parts
of AVCodec. This exposes the internals of AVCodec to users
and leads them into the temptation of actually using them
and forces us to forward-declare structures and types that
users can't use at all.
This commit changes this by adding a new structure FFCodec to
codec_internal.h that extends AVCodec, i.e. contains the public
AVCodec as first member; the private fields of AVCodec are moved
to this structure, leaving codec.h clean.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also move FF_CODEC_TAGS_END as well as struct AVCodecDefault.
This reduces the amount of files that have to include internal.h
(which comes with quite a lot of indirect inclusions), as e.g.
most encoders don't need it. It is furthemore in preparation
for moving the private part of AVCodec out of the public codec.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This avoids including version.h in all source files, avoiding
unnecessary rebuilds when the version number is bumped. Only
version_major.h is included by the main header, which defines
availability of e.g. FF_API_* macros, and which is bumped much
less often.
This isn't done for libavutil/version.h, because that header needs
to be included essentially everywhere due to LIBAVUTIL_VERSION_INT
being used wherever an AVClass is constructed.
Signed-off-by: Martin Storsjö <martin@martin.st>
Since the request_channel_layout is used only by a handful of codecs,
move the option to codec private contexts.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
If a frame-threaded decoder with inter-frame dependencies
returns an error when decoding a frame and the returned frame
isn't clean, an error message is emitted claiming that this
is a bug. This seems to be based upon the thinking that
in this case a ThreadFrame has not been properly unreferenced.
Yet this is wrong, as decoders with inter-frame dependencies
don't use the frame for output for synchronization and therefore
don't use ThreadFrames at all for this. So unreferencing
this frame generically is fine and not a bug.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use ff_thread_release_buffer() instead of av_frame_unref(),
as the former handles the case of non-thread-safe callbacks
properly. (This is possible now that ff_thread_release_buffer()
no longer requires a ThreadFrame.)
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The majority of frame-threaded decoders (mainly the intra-only)
need exactly one part of ThreadFrame: The AVFrame. They don't
need the owners nor the progress, yet they had to use it because
ff_thread_(get|release)_buffer() requires it.
This commit changes this and makes these functions work with ordinary
AVFrames; the decoders that need the extra fields for progress
use ff_thread_(get|release)_ext_buffer() which work exactly
as ff_thread_(get|release)_buffer() used to do.
This also avoids some unnecessary allocations of progress AVBuffers,
namely for H.264 and HEVC film grain frames: These frames are not
used for synchronization and therefore don't need a ThreadFrame.
Also move the ThreadFrame structure as well as ff_thread_ref_frame()
to threadframe.h, the header for frame-threaded decoders with
inter-frame dependencies.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>