Film grain support adds a huge amount of overhead to the H264Context
structure for a feature that is rarely used. On low end devices or
pages that have lots of media this bloats memory usage rapidly.
This changes the static film grain metadata allocations to be dynamic
which reduces the H264Context size from 851808 bytes to 53444 bytes.
Bug: https://crbug.com/359358875
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Niklas Haas <git@haasn.dev>
It involves less allocations and therefore has the nice property
that deriving a reference from a reference can't fail.
This allows for considerable simplifications in
ff_h264_(ref|replace)_picture().
Switching to the RefStruct API also allows to make H264Picture
smaller, because some AVBufferRef* pointers could be removed
without replacement.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unnecessary since the removal of non-thread-safe callbacks
in e0786a8eeb. Since then, the
AVCodecContext has only been used as logcontext.
Removing ff_thread_release_buffer() allowed to remove AVCodecContext*
parameters from several other functions (not only unref functions,
but also e.g. ff_h264_ref_picture() which calls ff_h264_unref_picture()
on error).
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and therefore error checks: Syncing
hwaccel_picture_private across threads can't fail any more.
Also gets rid of an unnecessary pointer in structures and
in the parameter list of ff_hwaccel_frame_priv_alloc().
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Lynne <dev@lynne.ee>
Tested-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and error checks for these allocations;
e.g. syncing buffers across threads can't fail any more
and needn't be checked. It also avoids having to keep
H264ParamSets.pps and H264ParamSets.pps_ref and PPS.sps
and PPS.sps_ref in sync and gets rid of casts and indirections.
(The removal of these checks and the syncing code even more
than offset the additional code for RefStruct.)
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When using multi-threaded decoding, every decoding thread
has its own DBP consisting of H264Pictures and each of these
points to its own AVFrames. They are synced during
update_thread_context via av_frame_ref() and therefore
the threads actually decoding (as well as all the others)
must not modify any field that is copied by av_frame_ref()
after ff_thread_finish_setup().
Yet this is exactly what happens when an error occurs
during decoding and the AVFrame's decode_error_flags are updated.
Given that these errors only become apparent during decoding,
this can't be set before ff_thread_finish_setup() without
defeating the point of frame-threading; in practice,
this meant that the decoder did not set these flags correctly
in case frame-threading was in use. (This means that e.g.
the ffmpeg cli tool fails to output its "corrupt decoded frame"
message in a nondeterministic fashion.)
This commit fixes this by adding a new H264Picture field
that is actually propagated across threads; the field
is an AVBufferRef* whose data is an atomic_int; it is
atomic in order to allow multiple threads to update it
concurrently and not to provide synchronization
between the threads setting the field and the thread
ultimately returning the AVFrame.
This unfortunately has the overhead of one allocation
per H264Picture (both the original one as well as
creating a reference to an existing one), even in case
of no errors. In order to mitigate this, an AVBufferPool
has been used and only if frame-threading is actually
in use. This expense will be removed as soon as
a proper API for refcounted objects (not based upon
AVBuffer) is in place.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This commit is the AVHWAccel analogue of commit
20f972701806be20a77f808db332d9489343bb78: It moves the private fields
of AVHWAccel to a new struct FFHWAccel extending AVHWAccel
in an internal header (namely hwaccel_internal.h).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The majority of frame-threaded decoders (mainly the intra-only)
need exactly one part of ThreadFrame: The AVFrame. They don't
need the owners nor the progress, yet they had to use it because
ff_thread_(get|release)_buffer() requires it.
This commit changes this and makes these functions work with ordinary
AVFrames; the decoders that need the extra fields for progress
use ff_thread_(get|release)_ext_buffer() which work exactly
as ff_thread_(get|release)_buffer() used to do.
This also avoids some unnecessary allocations of progress AVBuffers,
namely for H.264 and HEVC film grain frames: These frames are not
used for synchronization and therefore don't need a ThreadFrame.
Also move the ThreadFrame structure as well as ff_thread_ref_frame()
to threadframe.h, the header for frame-threaded decoders with
inter-frame dependencies.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These will be used by the codecs that need allocated progress
and is in preparation for no longer using ThreadFrame by the codecs
that don't.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is in preparation for further commits that will stop
using ThreadFrame for frame-threaded codecs that don't use
ff_thread_(await|report)_progress(); the API for those codecs
having inter-frame depdendencies will live in threadframe.h.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is only needed by h264_cabac.c and h264_cavlc.c.
Also fix up the other headers while at it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If a decoding error happens before frame side data is allocated, this assert may be
triggered. And since applying film grain is not enforced (we just warn it wasn't
applied and move on), we can just do that in such scenarios.
Fixes: Assertion failure
Fixes: clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5528650032742400
Signed-off-by: James Almer <jamrial@gmail.com>
Because we need access to ref frames without film grain applied, we have
to add an extra AVFrame to H264Picture to avoid messing with the
original. This requires some amount of overhead to make the reference
moves work out, but it allows us to benefit from frame multithreading
for film grain application "for free".
Unfortunately, this approach requires twice as much RAM to be constantly
allocated for ref frames, due to the need for an extra buffer per
H264Picture. In theory, we could get away with freeing up this memory as
soon as it's no longer needed (since ref frames do not need film grain
buffers any longer), but trying to call ff_thread_release_buffer() from
output_frame() conflicts with possible later accesses to that same frame
and I'm not sure how to synchronize that well.
Tested on all three cases of (no fg), (fg present but exported) and (fg
present and not exported), with and without threading.
Co-authored-by: James Almer <jamrial@gmail.com>
Signed-off-by: Niklas Haas <git@haasn.dev>
Signed-off-by: James Almer <jamrial@gmail.com>
Will remove unnecessary allocations when both src and dst picture contain
references to the same buffers.
Signed-off-by: James Almer <jamrial@gmail.com>
as well as includes of libavutil/timer.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is how the ref list manager links bitstream IDs to H264Picture/Ref
objects, and is local to the producer thread. There is no need for the
consumer thread to know the bitstream IDs of its references in their
respective producer threads.
In practice, this fixes tsan warnings when running fate-h264:
WARNING: ThreadSanitizer: data race (pid=19295)
Read of size 4 at 0x7dbc0000e614 by main thread (mutexes: write M1914):
#0 ff_h264_ref_picture src/libavcodec/h264_picture.c:112 (ffmpeg+0x0000013b3709)
[..]
Previous write of size 4 at 0x7dbc0000e614 by thread T2 (mutexes: write M1917):
#0 build_def_list src/libavcodec/h264_refs.c:91 (ffmpeg+0x0000013b46cf)
Calling ff_h264_field_end() when the per-field state is not properly
initialized leads to all kinds of undefined behaviour.
CC: libav-stable@libav.org
Bug-Id: 977 978 992
Do it right before the MMCOs are applied to the DPB. This will allow
moving the frame_start() call out of the slice header parsing, since
generating the implicit MMCOs needs to be done after frame_start().
At least the new videotoolbox decoder does not actually set a frame if
end_frame fails. This causes the API to return success and signals that
a picture was decoded, even though AVFrame->data[0] is NULL.
Fix this by propagating end_frame errors.
h264.h and hevc.h are mutually exclusive due to defining some of the same
names. As such, we need to avoid forcing h264.h to be included if we want
hevc decode acceleration to be possible.
However, some of the pre-hwaccel helper functions need h264.h. To avoid
messy collisions, let's move the declaration of all those helpers to
a separate header which we will exclude for the hevc support (which will
be hwaccel-only).
Signed-off-by: Philip Langdale <philipl@overt.org>
Also move EC ref initialization to where the EC code is called.
Fixes out of array read
Fixes: asan_heap-uaf_143f420_142_20110805_112659_ch0.mkv
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>