That variable is shared between frame threads in the same defective way
described in the previous commit. Fix it by adding a RefStruct-managed
arrays of flags that is propagated across frame threads in the standard
manner.
Remove now-unused FFV1Context.fsrc
Frame threading in the FFV1 decoder works in a very unusual way - the
state that needs to be propagated from the previous frame is not decoded
pixels(¹), but each slice's entropy coder state after decoding the slice.
For that purpose, the decoder's update_thread_context() callback stores
a pointer to the previous frame thread's private data. Then, when
decoding each slice, the frame thread uses the standard progress
mechanism to wait for the corresponding slice in the previous frame to
be completed, then copies the entropy coder state from the
previously-stored pointer.
This approach is highly dubious, as update_thread_context() should be
the only point where frame-thread contexts come into direct contact.
There are no guarantees that the stored pointer will be valid at all, or
will contain any particular data after update_thread_context() finishes.
More specifically, this code can break due to the fact that keyframes
reset entropy coder state and thus do not need to wait for the previous
frame. As an example, consider a decoder process with 2 frame threads -
thread 0 with its context 0, and thread 1 with context 1 - decoding a
previous frame P, current frame F, followed by a keyframe K. Then
consider concurrent execution consistent with the following sequence of
events:
* thread 0 starts decoding P
* thread 0 reads P's slice header, then calls
ff_thread_finish_setup() allowing next frame thread to start
* main thread calls update_thread_context() to transfer state from
context 0 to context 1; context 1 stores a pointer to context 0's private
data
* thread 1 starts decoding F
* thread 1 reads F's slice header, then calls
ff_thread_finish_setup() allowing the next frame thread to start
decoding
* thread 0 finishes decoding P
* thread 0 starts decoding K; since K is a keyframe, it does not
wait for F and reallocates the arrays holding entropy coder state
* thread 0 finishes decoding K
* thread 1 reads entropy coder state from its stored pointer to context
0, however it finds state from K rather than from P
This execution is currently prevented by special-casing FFV1 in the
generic frame threading code, however that is supremely ugly. It also
involves unnecessary copies of the state arrays, when in fact they can
only be used by one thread at a time.
This commit addresses these deficiencies by changing the array of
PlaneContext (each of which contains the allocated state arrays)
embedded in FFV1SliceContext into a RefStruct object. This object can
then be propagated across frame threads in standard manner. Since the
code structure guarantees only one thread accesses it at a time, no
copies are necessary. It is also re-created for keyframes, solving the
above issue cleanly.
Special-casing of FFV1 in the generic frame threading code will be
removed in a later commit.
(¹) except in the case of a damaged slice, when previous frame's pixels
are used directly
In all cases except decoding version 1 it's either not used, or contains
a copy of a table from quant_tables, which we can just as well use
directly.
When decoding version 1, we can just as well decode into
quant_tables[0], which would otherwise be unused.
FFV1 decoder and encoder currently use the same struct - FFV1Context -
both as codec private data and per-slice context. For this purpose
FFV1Context contains an array of pointers to per-slice FFV1Context
instances.
This pattern is highly confusing, as it is not clear which fields are
per-slice and which per-codec.
Address this by adding a new struct storing only per-slice data. Start
by moving slice_{x,y,width,height} to it.
Avoids implicit av_frame_ref() and therefore allocations
and error checks. It also avoids explicitly allocating
the AVFrames (done implicitly when getting the buffer).
It also fixes a data race: The AVFrame's sample_aspect_ratio
is currently updated after ff_thread_finish_setup()
and this write is unsynchronized with the read in av_frame_ref().
Removing the implicit av_frame_ref() fixed this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Both the FFV1 decoder and encoder use a template of their own
to generate code multiple times. They also use a common template,
used by both decoder and encoder templates which is currently
instantiated in ffv1.h (and therefore also in ffv1.c, which
doesn't need it at all).
All these templates have the prerequisite that two macros
are defined, namely RENAME() and TYPE. The codec-specific
templates call the functions generated via the common template
via the RENAME() macro and therefore the macros used for
the common template must coincide with the macros used for
the codec-specific templates. But then it is better to not
instantiate the common template in ffv1.h, but in the codec
specific templates.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The majority of frame-threaded decoders (mainly the intra-only)
need exactly one part of ThreadFrame: The AVFrame. They don't
need the owners nor the progress, yet they had to use it because
ff_thread_(get|release)_buffer() requires it.
This commit changes this and makes these functions work with ordinary
AVFrames; the decoders that need the extra fields for progress
use ff_thread_(get|release)_ext_buffer() which work exactly
as ff_thread_(get|release)_buffer() used to do.
This also avoids some unnecessary allocations of progress AVBuffers,
namely for H.264 and HEVC film grain frames: These frames are not
used for synchronization and therefore don't need a ThreadFrame.
Also move the ThreadFrame structure as well as ff_thread_ref_frame()
to threadframe.h, the header for frame-threaded decoders with
inter-frame dependencies.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
as well as includes of libavutil/timer.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
No speed difference, or slightly faster (the difference is too small so it may be noise
that this appears faster)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is required for >= 16bit RGB support
I tried it without templates but its too much duplicated code
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This option is only used by ffv1 and ffvhuff.
It is a very codec-specific option, so deprecate the global variant.
Improve documentation a little.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
FFv1 uses two types of coders, golomb and range with two different
tables. This is exposed this in a rather convoluted way, for example
mentioning to set coder type 1 while initializing the variable 'ac' to 2,
because encoder does not use range coder with default table.
Appropriate internal coder type values have been added and used in any
check rather than using raw numbers.
Initialization of avctx.coder_type in ffv1dec is removed because this
field is encoder only. An unneeded validation check in the encoder
is dropped too.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>