Avoids allocations and therefore error checks: Syncing
hwaccel_picture_private across threads can't fail any more.
Also gets rid of an unnecessary pointer in structures and
in the parameter list of ff_hwaccel_frame_priv_alloc().
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Lynne <dev@lynne.ee>
Tested-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids allocations and error checks when syncing the buffers.
Also avoids indirections.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is only an auxiliary value used for parsing the VP7 frame header.
Reviewed-by: Peter Ross <pross@xvid.org>
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Instead replace VP56mv by new and identical structures VP8mv and VP9mv.
Also replace VP56Frame by VP8FrameType in vp8.h and use that
in VP8 code. Also remove VP56_FRAME_GOLDEN2, as this has only
been used by VP8, and use VP8_FRAME_ALTREF as replacement for
its usage in VP8 as this is more in line with VP8 verbiage.
This allows to remove all inclusions of vp56.h from everything
that is not VP5/6. This also removes implicit inclusions
of hpeldsp.h, h264chroma.h, vp3dsp.h and vp56dsp.h from all VP8/9
files.
(This also fixes a build issue: If one compiles with -O0 and disables
everything except the VP8-VAAPI encoder, the file containing
ff_vpx_norm_shift is not compiled, yet this is used implicitly
by vp56_rac_gets_nn() which is defined in vp56.h; it is unused
by the VP8-VAAPI encoder and declared as av_unused, yet with -O0
unused noninline functions are not optimized away, leading to
linking failures. With this patch, said function is not included
in vaapi_encode_vp8.c any more.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This increases type-safety by avoiding conversions from/through void*.
It also avoids the boilerplate "AVFrame *frame = data;" line
for non-subtitle decoders.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The majority of frame-threaded decoders (mainly the intra-only)
need exactly one part of ThreadFrame: The AVFrame. They don't
need the owners nor the progress, yet they had to use it because
ff_thread_(get|release)_buffer() requires it.
This commit changes this and makes these functions work with ordinary
AVFrames; the decoders that need the extra fields for progress
use ff_thread_(get|release)_ext_buffer() which work exactly
as ff_thread_(get|release)_buffer() used to do.
This also avoids some unnecessary allocations of progress AVBuffers,
namely for H.264 and HEVC film grain frames: These frames are not
used for synchronization and therefore don't need a ThreadFrame.
Also move the ThreadFrame structure as well as ff_thread_ref_frame()
to threadframe.h, the header for frame-threaded decoders with
inter-frame dependencies.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_get_format() in the next patch will reject formats which aren't in the
offered list, so the hack in 7cb9296db8 is
no longer valid. Change the hack by adding a new field in the VP8 decoder
context to indicate that it's actually WebP and don't call ff_get_format()
at all in that case.
Also adds some extra fields to the main context structure that may
be needed by a hwaccel decoder.
The current behaviour of the WebP decoder is maintained by adding an
additional field to the VP8 decoder private context to indicate that
it is actually being used as WebP (no hwaccel is supported for that
case).
Fixes tsan warnings like this in fate-vp8-test-vector-007:
WARNING: ThreadSanitizer: data race (pid=65909)
Write of size 4 at 0x7d8c0000e088 by thread T1:
#0
An error occurred
vp8_decode_mb_row_sliced vp8.c:2519 (ffmpeg:x86_64+0x100995ede)
[..]
Previous write of size 4 at 0x7d8c0000e088 by thread T2:
#0
Fixes tsan warnings like this in fate-vp8-test-vector-007:
WARNING: ThreadSanitizer: data race (pid=3590)
Write of size 4 at 0x7d8c0000e07c by thread T2:
#0
An error occurred
decode_mb_row_no_filter src/libavcodec/vp8.c:2330 (ffmpeg+0x000000ffb59e)
[..]
Previous write of size 4 at 0x7d8c0000e07c by thread T1:
#0
Fixes timeout with 686/clusterfuzz-testcase-5853946876788736
this shortcuts (i.e. speeds up) the error and
return-to-user when decoding a truncated frame
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Previous version reviewed by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If one of the dimensions is larger than 8176, s->mb_width or
s->mb_height is larger than 511, leading to an int16_t overflow of
s->mv_max.{x,y}. This then causes av_clip to be called with amin > amax.
Changing the type to int avoids the overflow and has no negative
effect, because s->mv_max is only used in clamp_mv for clipping.
Since mv_max.{x,y} is positive and mv_min.{x,y} negative, av_clip can't
increase the absolute value. The input to av_clip is an int16_t, and
thus the output fits into int16_t as well.
For additional safety, s->mv_{min,max}.{x,y} are clipped to int16_t range
before use.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
This array is written using AV_WN32A, assuming alignment.
This hopefully fixes the failing vp7 fate test on sparc.
Signed-off-by: Martin Storsjö <martin@martin.st>
Further performance improvements and security fixes by
Vittorio Giovara, Luca Barbato and Diego Biurrun.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
This allows supporting files for which the image stride is smaller than
the max. block size + number of subpel mc taps, e.g. a 64x64 VP9 file
or a 16x16 VP8 file with -fflags +emu_edge.
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Reordering the members in this struct reduces the holes required
to maintain alignment. With this order, the only remaining, and
unavoidable, hole is 3 bytes following left_nnz.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This was unnoticed on linux, since stdlib.h apparently includes
files declaring the pthread_mutex_t and pthread_cond_t types.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Testing gives 25-30% gain on HD clips with two threads and
up to 50% gain with eight threads.
Sliced threading uses more memory than single or frame threading.
Frame threading and single threading keep the previous memory
layout.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Associate segmentation_map[] with reference frame, rather than
decoding instance. This fixes cases where the map would be free()'ed
on e.g. a size change in one thread, whereas the other thread was
still accessing it. Also, it fixes cases where threads overwrite data
that is still being referenced by the previous thread, who thinks that
it's part of the frame previously decoded by the next thread.