Only set the ScratchpadContext and let the users that need it
(i.e. encoders) set the MotionEstContext stuff themselves.
Also add an explicit pointer to ScratchpadContext to point
to the allocated buffer so that none of the other scratchpad
pointers is singled out as being used for the allocations.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These pointers point to the same buffers, so one can just
use a union for them and avoid synchronising one of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Given that MPVPictures are already directly shared between threads
in case of frame-threaded decoding, one can simply use it to
pass decoding progress information between threads. This allows
to avoid one level of indirection; it also means avoids allocations
(of the ThreadFrameProgress structure) in case of frame-threading
and indeed makes ff_thread_release_ext_buffer() decoder-only
(actually, H.264-decoder-only).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, an initialized MpegEncContext had an array of
MPVPictures (way more than were ever needed) and the MPVPicture*
contained in the MPVWorkPictures as well as the input_picture
and reordered_input_picture arrays (for the encoder) pointed
into this array. Several of the pointers could point to the
same slot and because there was no reference counting involved,
one had to check for aliasing before unreferencing.
Furthermore, given that these pointers were not ownership pointers
the pointers were often simply reset without unreferencing
the slot (happened e.g. for the RV30 and RV40 decoders) or
there were moved without resetting the src pointer (happened
for the encoders where the entries in the input_picture
and reordered_input_picture arrays were not reset).
Instead actually releasing these pictures was performed by looping
over the whole array and checking which one of the entries needed
to be kept. Given that the array had way too many slots (36),
this meant that more than 30 MPVPictures have been unnecessarily
unreferenced in every ff_mpv_frame_start(); something similar
happened for the encoder.
This commit changes this by making the MPVPictures refcounted
via the RefStruct API. The MPVPictures itself are part of a pool
so that this does not entail constant allocations; instead,
the amount of allocations actually goes down, because the
earlier code used such a large array of MPVPictures (36 entries) and
allocated an AVFrame for every one of these on every
ff_mpv_common_init(). In fact, the pool is only freed when closing
the codec, so that reinitializations don't lead to new allocations
(this avoids having to sync the pool in update_thread_context).
Making MPVPictures refcounted also has another key benefit:
It makes it possible to directly share them across threads
(when using frame-threaded decoding), eliminating ugly code
with underlying av_frame_ref()'s; sharing these pictures
can't fail any more.
The pool is allocated in ff_mpv_decode_init() for decoders,
which therefore can fail now. This and the fact that the pool
is not unreferenced in ff_mpv_common_end() also necessitated
to mark several mpegvideo-decoders with the FF_CODEC_CAP_INIT_CLEANUP
flag.
*: This also means that there is no good reason any more for
ff_mpv_common_frame_size_change() to exist.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There are two types of MPVPictures: Three (cur_pic, last_pic, next_pic)
that are directly part of MpegEncContext and an array of MPVPictures
that are separately allocated and are mostly accessed via pointers
(cur|last|next)_pic_ptr; they are also used to store AVFrames in the
encoder (necessary due to B-frames). As the name implies, each of the
former is directly associated with one of the _ptr pointers:
They actually share the same underlying buffers, but the ones
that are part of the context can have their data pointers offset
and their linesize doubled for field pictures.
Up until now, each of these had their own references; in particular,
there was an underlying av_frame_ref() to sync cur_pic and cur_pic_ptr
etc. This is wasteful.
This commit changes this relationship: cur_pic, last_pic and next_pic
now become MPVWorkPictures; this structure does not have an AVFrame
at all any more, but only the cached values of data and linesize.
It also contains a pointer to the corresponding MPVPicture, establishing
a more natural relationsship between the two.
This already means that creating the context-pictures from the pointers
can no longer fail.
What has not been changed is the fact that the MPVPicture* pointers
are not ownership pointers and that the MPVPictures are part of an
array of MPVPictures that is owned by a single AVCodecContext.
Doing so will be done in a latter commit.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_alloc_picture() currently does two things: It checks the
consistency of the linesize (which should not be necessary, but is)
and it allocates certain buffers. (It does not actually allocate
the picture buffers, so its name is misleading.)
This commit splits it into two separate functions. The rationale
for this is that for the encoders, every picture needs its linesizes
checked, but not every picture needs these extra buffers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Make it clear that this is not a failure of get_buffer/the user,
but a deficit of mpegvideo.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This avoids an indirection and is in preparation for removing
the AVFrame from MpegEncContext.(cur|last|next)_pic altogether.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is the only user of said table and doing so is especially
important given that this buffer is zeroed every time.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It involves less allocations and therefore has less
potential errors to be checked. One consequence thereof
is that updating the picture tables can no longer fail.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This avoids constant allocations+frees and will also allow
to simply switch to the RefStruct API, thereby avoiding
the overhead of the AVBuffer API.
It also simplifies the code, because it removes the "needs_realloc"
field: It was added in 435c0b87d2,
before the introduction of the AVBuffer API: given that these buffers
may be used by different threads, they were not freed immediately
and instead were marked as being freed later by setting needs_realloc.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Codecs call ff_find_unused_picture() to get the index of
an unused picture; said picture may have buffers left
from using it previously (these buffers are intentionally
not unreferenced so that it might be possible to reuse them;
they are only reused when they are writable, otherwise
they are replaced by new, zeroed buffers). They should
not make any assumptions about which picture they get.
Yet this is not true for mbskip_table and damaged bitstreams.
When one returns old unused slots randomly, the output
becomes nondeterministic. This can't happen now (see below),
but it will be possible once mpegpicture uses proper pools
for the picture tables.
The following discussion uses the sample created via
ffmpeg -bitexact -i fate-suite/svq3/Vertical400kbit.sorenson3.mov -ps 50 -bf 2 -bitexact -an -qscale 5 -ss 40 -error_rate 4 -threads 1 out.avi
When decoding this with one thread, the slots are as follows:
Cur 0 (type I), last -1, Next -1; cur refcount -1, not reusing buffers
Cur 1 (type P), last -1, Next 0; cur refcount -1, not reusing buffers
Cur 2 (type B), last 0, Next 1; cur refcount -1, not reusing buffers
Cur 2 (type B), last 0, Next 1; cur refcount 2, not reusing buffers
Cur 0 (type P), last 0, Next 1; cur refcount 2, not reusing buffers
Cur 2 (type B), last 1, Next 0; cur refcount 1, reusing buffers
Cur 2 (type B), last 1, Next 0; cur refcount 2, not reusing buffers
Cur 1 (type P), last 1, Next 0; cur refcount 2, not reusing buffers
Cur 2 (type B), last 0, Next 1; cur refcount 1, reusing buffers
Cur 2 (type B), last 0, Next 1; cur refcount 2, not reusing buffers
Cur 0 (type I), last 0, Next 1; cur refcount 2, not reusing buffers
Cur 2 (type B), last 1, Next 0; cur refcount 1, reusing buffers
Cur 2 (type B), last 1, Next 0; cur refcount 2, not reusing buffers
Cur 1 (type P), last 1, Next 0; cur refcount 2, not reusing buffers
After the slots have been filled initially, the buffers are only
reused for the first B-frame in a B-frame chain:
a) When the new picture is an I or a P frame, the slot of the backward
reference is cleared and reused for the new frame (as has been said,
"cleared" does not mean that the auxiliary buffers have been
unreferenced). Given that not only the slot in the picture array,
but also MpegEncContext.last_picture contain references to these
auxiliary buffers, they are not writable and are therefore not reused,
but replaced by new, zero-allocated buffers.
b) When the new picture is the first B-frame in a B-frame chain,
the two reference slots are kept as-is and one gets a slot that
does not share its auxiliary buffers with any of MpegEncContext.
current_picture, last_picture, next_picture. The buffers are
therefore writable and are reused.
c) When the new picture is a B-frame that is not the first frame
in a B-frame chain, ff_mpv_frame_start() reuses the slot occupied
by the preceding B-frame. Said slot shares its auxilary buffers
with MpegEncContext.current_picture, so that they are not considered
writable and are therefore not reused.
When using frame-threading, the slots are made to match the one
from the last thread, so that the above analysis is mostly the same
with one exception: Other threads may also have references to these
buffers, so that initial B-frames of a B-frame chain need no longer
have writable/reusable buffers. In particular, all I and P-frames
always use new, zeroed buffers. Because only the mbskip_tables of
I- and P-frames are ever used, it follows that there is currently
no problem with using stale values for them at all.
Yet as the analysis shows this is very fragile:
1. MpegEncContext.(current|last|next)_picture need not have
references of their own, but they have them and this influences
the writability decision.
2. It would not work if the slots were returned in a truely random
fashion or if there were a proper pool used.
Therefore this commit always resets said buffer. This is in preparation
for actually adding such a pool (where the checksums for said sample
would otherwise be depending on the number of threads used for
decoding).
Future commits will restrict this to only the codecs for which
it is necessary (namely the MPEG-4 decoder).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Codecs call ff_find_unused_picture() to get the index of
an unused picture; said picture may have buffers left
from using it previously (these buffers are intentionally
not unreferenced so that it might be possible to reuse them;
this is mpegvideo's version of a bufferpool). They should
not make any assumptions about which picture they get.
Yet somehow this is not true when decoding OBMC: Returning
random empty pictures (instead of the first one) leads
to nondeterministic results; similarly, explicitly
rezeroing the buffer before handing it over to the codec
changes the outcome of the h263-obmc tests, but it makes it
independent of the returned pictures. Therefore this commit
does so.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The mpegvideo-based codecs currently require the linesize to be
constant (except when the frame dimensions change); one reason
for this is that certain scratch buffers whose size depend on
linesize are only allocated once and are presumed to be correctly
sized if the pointers are != NULL.
This commit changes this by storing the actual linesize these
buffers belong to and reallocating the buffers if it does not
suffice. This is not enough to actually support changing linesizes,
but it is a start. And it is a prerequisite for the next patch.
Also don't emit an error message in case the source ctx's
edge_emu_buffer is unset in ff_mpeg_update_thread_context().
It need not be an error at all; e.g. it is a perfectly normal
state in case a hardware acceleration is used as the scratch
buffers are not allocated in this case (it is easy to run into
this issue with MPEG-4) or if the src context was not initialized
at all (e.g. because the first packet contained garbage).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There are lots of files that don't need it: The number of object
files that actually need it went down from 2011 to 884 here.
Keep it for external users in order to not cause breakages.
Also improve the other headers a bit while just at it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unnecessary since the removal of non-thread-safe callbacks
in e0786a8eeb. Since then, the
AVCodecContext has only been used as logcontext.
Removing ff_thread_release_buffer() allowed to remove AVCodecContext*
parameters from several other functions (not only unref functions,
but also e.g. ff_h264_ref_picture() which calls ff_h264_unref_picture()
on error).
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Since at least commit c954cf1e1b
(adding ff_encode_alloc_frame()), a large part of ff_alloc_picture()
is completely separate for the two callers. Move the caller-specific
parts out to the callers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_alloc_picture() performs two tasks: a) In most instances,
it allocates frame buffers and b) it allocates certain
auxiliary buffers.
The exception to a) is the case when the encoder can reuse
user-supplied frames. And for these frames the auxiliary buffers
are unused, because this frame will never be used as current_picture
(and therefore also not as next_picture or last_picture);
see select_input_picture().
This means that we can simply avoid calling ff_alloc_picture()
with user-supplied frames at all.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and therefore error checks: Syncing
hwaccel_picture_private across threads can't fail any more.
Also gets rid of an unnecessary pointer in structures and
in the parameter list of ff_hwaccel_frame_priv_alloc().
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Lynne <dev@lynne.ee>
Tested-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
All usages of ff_hwaccel_frame_priv_alloc() have the same pattern:
Check for whether a hwaccel is in use; check whether it needs
private frame-specific data; allocate the AVBuffer and set
it.
This commit modifies ff_hwaccel_frame_priv_alloc() to perform
this task on its own.
(It also seems that the H.264 decoder did not perform proper
cleanup in case the buffer could not be allocated. This has been
changed.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Improves the grepability of the code.
(Furthermore, I hope that no compiler will really call memset
for 28 bytes.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
mpegvideo uses an array of Pictures and when it is done with using
them, it only unreferences them incompletely: Some buffers are kept
so that they can be reused lateron if the same slot in the Picture
array is reused, making this a sort of a bufferpool.
(Basically, a Picture is considered used if the AVFrame's buf is set.)
Yet given that other pieces of the decoder may have a reference to
these buffers, they need not be writable and are made writable using
av_buffer_make_writable() when preparing a new Picture. This involves
reading the buffer's data, although the old content of the buffer
need not be retained.
Worse, this read can be racy, because the buffer can be used by another
thread at the same time. This happens for Real Video 3 and 4.
This commit fixes this race by no longer copying the data;
instead the old buffer is replaced by a new, zero-allocated buffer.
(Here are the details of what happens with three or more decoding threads
when decoding rv30.rm from the FATE-suite as happens in the rv30 test:
The first decoding thread uses the first slot of its picture array
to store its current pic; update_thread_context copies this for the
second thread that decodes a P-frame. It uses the second slot in its
Picture array to store its P-frame. This arrangement is then copied
to the third decode thread, which decodes a B-frame. It uses the third
slot in its Picture array for its current frame.
update_thread_context copies this to the next thread. It unreferences
the third slot containing the other B-frame and then it reuses this
slot for its current frame. Because the pic array slots are only
incompletely unreferenced, the buffers of the previous B-frame are
still in there and they are not writable; in fact the previous
thread is concurrently writing to them, causing races when making
the buffer writable.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These fields are only ever set by the encoder for the current picture
and for no other picture. So only one set of these values needs to
exist, so move them to MpegEncContext.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all the buffers that are made writable, three are always allocated
and the other four are allocated iff any one of them is allocated;
so one can replace the seven checks for existence with one.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These tables are only used by encoders and only for the current picture;
ergo they need not be put into the picture at all, but rather into
the encoder's context. They also don't need to be refcounted,
because there is only one owner.
In contrast to this, the earlier code refcounts them which
incurs unnecessary overhead. These references are not unreferenced
in ff_mpeg_unref_picture() (they are kept in order to have something
like a buffer pool), so that several buffers are kept at the same
time, although only one is needed, thereby wasting memory.
The code also propagates references to other pictures not part of
the pictures array (namely the copy of the current/next/last picture
in the MpegEncContext which get references of their own). These
references are not unreferenced in ff_mpeg_unref_picture() (the
buffers are probably kept in order to have something like a pool),
yet if the current picture is a B-frame, it gets unreferenced
at the end of ff_mpv_encode_picture() and its slot in the picture
array will therefore be reused the next time; but the copy of the
current picture also still has its references and therefore
these buffers will be made duplicated in order to make them writable
in the next call to ff_mpv_encode_picture(). This is of course
unnecessary.
Finally, ff_find_unused_picture() is supposed to just return
any unused picture and the code is supposed to work with it;
yet for the vsynth*-mpeg4-adap tests the result depends upon
the content of these buffers; given that this patchset
changes the content of these buffers (the initial content is now
the state of these buffers after encoding the last frame;
before this patch the buffers used came from the last picture
that occupied the same slot in the picture array) their ref-files
needed to be changed. This points to a bug somewhere (if one removes
the initialization, one gets uninitialized reads in
adaptive_quantization in ratecontrol.c).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Several encoders (roqvideo, svq1, snow, and the mpegvideo family)
currently call ff_get_buffer(). However this function is written
assuming it is called by a decoder. Though nothing has been obviously
broken by this until now, that may change in the future.
To avoid potential future issues, introduce a simple encode-specific
wrapper around avcodec_default_get_buffer2() and enforce its use in
encoders.
These will be used by the codecs that need allocated progress
and is in preparation for no longer using ThreadFrame by the codecs
that don't.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also use said function in mpegvideo.c and mpegvideo_enc.c;
and make ff_free_picture_tables() static as it isn't needed anymore
outside of mpegpicture.c.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Do it only when requested with the AV_CODEC_EXPORT_DATA_VIDEO_ENC_PARAMS
flag.
Drop previous code using the long-deprecated AV_FRAME_DATA_QP_TABLE*
API. Temporarily disable fate-filter-pp, fate-filter-pp7,
fate-filter-spp. They will be reenabled once these filters are converted
in following commits.
Before adding uvlinesize check, I was seeing failures decoding
some video with ffmpeg compiled with --enable-gray and using AV_CODEC_FLAG_GRAY.
[mpeg2video @ 0x7fa193818c00] get_buffer() failed (stride changed: linesize=1280/1280 uvlinesize=0/640)
[mpeg2video @ 0x7fa193818c00] get_buffer() failed (stride changed: linesize=1280/1280 uvlinesize=0/640)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
In function ff_mpeg_ref_picture(), it returns 0 on the error path that
the return value of av_buffer_ref() is NULL. 0 indicates success, which
seems to deviate from the fact. Set ret to AVERROR(ENOMEM) to propagate
the error status to the callers.
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
alloc_frame_buffer in ff_alloc_picture asserts that the linesize
of planes 1 and 2 are the same. If the pixfmt has a single uv
plane, like NV12, this won't be true.
So, let's only do this check if there are more than 2 planes.
We never hit this with previous hw formats because they don't set
linesize to meaningful values, but the cuda hw format sets the
values based on the underlying data layout.