By checking immediately whether the first allocation was successfull
one can simplify the cleanup code in case of errors.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
APNG works with a single reference frame and an output frame.
According to the spec, decoding APNG works by decoding
the current IDAT/fdAT chunks (which decodes to a rectangular
subregion of the whole image region), followed by either
overwriting the region of the output frame with the newly
decoded data or by blending the newly decoded data with
the data from the reference frame onto the current subregion
of the output frame. The remainder of the output frame
is just copied from the reference frame.
Then the reference frame might be left untouched
(APNG_DISPOSE_OP_PREVIOUS), it might be replaced by the output
frame (APNG_DISPOSE_OP_NONE) or the rectangular subregion
corresponding to the just decoded frame has to be reset
to black (APNG_DISPOSE_OP_BACKGROUND).
The latter case is not handled correctly by our decoder:
It only performs resetting the rectangle in the reference frame
when decoding the next frame; and since commit
b593abda6c it does not reset
the reference frame permanently, but only temporarily (i.e.
it only affects decoding the frame after the frame with
APNG_DISPOSE_OP_BACKGROUND). This is a problem if the
frame after the APNG_DISPOSE_OP_BACKGROUND frame uses
APNG_DISPOSE_OP_PREVIOUS, because then the frame after
the APNG_DISPOSE_OP_PREVIOUS frame has an incorrect reference
frame. (If it is not followed by an APNG_DISPOSE_OP_PREVIOUS
frame, the decoder only keeps a reference to the output frame,
which is ok.)
This commit fixes this by being much closer to the spec
than the earlier code: Resetting the background is no longer
postponed until the next frame; instead it is applied to
the reference frame.
Fixes ticket #9602.
(For multithreaded decoding it was actually already broken
since commit 5663301560d77486c7f7c03c1aa5f542fab23c24.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Add adaptive_i/b feature to hevc_qsv. Adaptive_i allows changing of
frame type from P and B to I. Adaptive_b allows changing of frame type
frome B to P.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Just return AVERROR_EXTERNAL immediately when encode error.
The other logic should keep the old behavior before commit 7c05b7951.
Suggested-By: Zhao Zhili <zhilizhao@tencent.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
This commit moves the encoder-only allocations of the tables
owned solely by the main encoder context to mpegvideo_enc.c.
This avoids checks in mpegvideo.c for whether we are dealing
with an encoder; it also improves modularity (if encoders are
disabled, this code will no longer be included in the binary).
And it also avoids having to reset these pointers at the beginning
of ff_mpv_common_init() (in case the dst context is uninitialized,
ff_mpeg_update_thread_context() simply copies the src context
into dst which therefore may contain pointers not owned by it,
but this does not happen for encoders at all).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes frame-threaded decoding of samples created by concatting
a video with data partitioning and a video not using it.
(Only the MPEG-4 decoder sets this, so it is synced in
mpeg4_update_thread_context() despite being a MpegEncContext-field.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
IEEE-754 differentiates two different kind of NaNs.
Quiet and Signaling ones. They are differentiated by the MSB of the
mantissa.
For whatever reason, actual hardware conversion of half to single always
sets the signaling bit to 1 if the mantissa is != 0, and to 0 if it's 0.
So our code has to follow suite or fate-testing hardware float16 will be
impossible.
They are already synced generically in update_context_from_thread()
in pthread_frame.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is unused since 3575a495f6
(and the error message is dangerous: av_get_pix_fmt_name(format)
returns NULL iff av_pix_fmt_desc_get(format) returns NULL
and using a NULL string for %s would be UB).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Improves the grepability of the code.
(Furthermore, I hope that no compiler will really call memset
for 28 bytes.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is done later in ff_mpv_frame_start() (and nobody uses
current_picture_ptr between setting it in ff_mpv_frame_start()).
(The reason the vsynth*-h263-obmc ref files change is because
the call to ff_find_unused_picture() now happens after the older
pictures have been unreferenced in ff_mpv_frame_start(),
so that their slots in the picture array can be immediately
reused; the obmc code is somehow buggy and changes its output
depending on the earlier contents of the motion_val buffer.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Provide optimized implementation of pix_abs8 function for arm64.
Performance comparison tests are shown below.
- pix_abs_1_0_c: 101.2
- pix_abs_1_0_neon: 22.5
- sad_1_c: 101.2
- sad_1_neon: 22.5
Benchmarks and tests are run with checkasm tool on AWS Graviton 3.
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation of sse8 function for arm64.
Performance comparison tests are shown below.
- sse_1_c: 130.7
- sse_1_neon: 29.7
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation of pix_abs16_y2 function for arm64.
Performance comparison tests are shown below.
pix_abs_0_2_c: 317.2
pix_abs_0_2_neon: 37.5
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide neon implementation for sse4 function.
Performance comparison tests are shown below.
- sse_2_c: 80.7
- sse_2_neon: 31.0
Benchmarks and tests are run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide neon implementation for sse16 function.
Performance comparison tests are shown below.
- sse_0_c: 268.2
- sse_0_neon: 43.5
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Since d69d12a5b9 these av_assert2()
(or more exactly, the ones in hadamard8_diff8x8_c() and
hadamard8_intra8x8_c()) are hit. So just remove all of these asserts.
(If the test were improved to know which functions expect h == 8
and which support any value, the asserts could be readded
at the appropriate places.)
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The height is hardcoded in some of the me_cmp functions, but not
in all of them. But in the case of all other functions, it's hardcoded
in the same place in SIMD functions as in the C reference functions,
while this one function differs from the behaviour of the C code.
(Before 542765ce3e, there were a
couple other sad8_*_mmx functions with similar hardcoded height.)
Signed-off-by: Martin Storsjö <martin@martin.st>
mpegvideo uses an array of Pictures and when it is done with using
them, it only unreferences them incompletely: Some buffers are kept
so that they can be reused lateron if the same slot in the Picture
array is reused, making this a sort of a bufferpool.
(Basically, a Picture is considered used if the AVFrame's buf is set.)
Yet given that other pieces of the decoder may have a reference to
these buffers, they need not be writable and are made writable using
av_buffer_make_writable() when preparing a new Picture. This involves
reading the buffer's data, although the old content of the buffer
need not be retained.
Worse, this read can be racy, because the buffer can be used by another
thread at the same time. This happens for Real Video 3 and 4.
This commit fixes this race by no longer copying the data;
instead the old buffer is replaced by a new, zero-allocated buffer.
(Here are the details of what happens with three or more decoding threads
when decoding rv30.rm from the FATE-suite as happens in the rv30 test:
The first decoding thread uses the first slot of its picture array
to store its current pic; update_thread_context copies this for the
second thread that decodes a P-frame. It uses the second slot in its
Picture array to store its P-frame. This arrangement is then copied
to the third decode thread, which decodes a B-frame. It uses the third
slot in its Picture array for its current frame.
update_thread_context copies this to the next thread. It unreferences
the third slot containing the other B-frame and then it reuses this
slot for its current frame. Because the pic array slots are only
incompletely unreferenced, the buffers of the previous B-frame are
still in there and they are not writable; in fact the previous
thread is concurrently writing to them, causing races when making
the buffer writable.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
At this point active_thread_type is set iff active_thread_type
is set to FF_THREAD_FRAME iff AVCodecInternal.frame_thread_encoder
is set.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If qsv hwdevice is available, use the mfxLoader handle in qsv hwdevice
to create mfx session. Otherwise create mfx session with a new mfxLoader
handle.
This is in preparation for oneVPL support
The following Cflags has been added to libmfx.pc, so mfx/ prefix is no
longer needed when including mfx headers in FFmpeg.
Cflags: -I${includedir} -I${includedir}/mfx
Some old versions of libmfx have the following Cflags in libmfx.pc
Cflags: -I${includedir}
We may add -I${includedir}/mfx to CFLAGS when running 'configure
--enable-libmfx' for old versions of libmfx, if so, mfx headers without
mfx/ prefix can be included too.
If libmfx comes without pkg-config support, we may do a small change to
the settings of the environment(e.g. set -I/opt/intel/mediasdk/include/mfx
instead of -I/opt/intel/mediasdk/include to CFLAGS), then the build can
find the mfx headers without mfx/ prefix
After applying this change, we won't need to change #include for mfx
headers when mfx headers are installed under a new directory.
This is in preparation for oneVPL support (mfx headers in oneVPL are
installed under vpl directory)
It is the proper place to set it, directly besides mb_width and
mb_stride. The reason for doing it the way it is done now seems
to be that the code does not create more slice contexts than necessary
(i.e. not more than one per row), so that this number needs to be
known before setting the number of slices. But this can always be
arranged by just moving the code that sets the number of slices.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>