Further performance improvements and security fixes by
Vittorio Giovara, Luca Barbato and Diego Biurrun.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
They are not measurably faster on x86, they might be somewhat faster on
other platforms due to missing emu edge SIMD, but the gain is not large
enough to justify the added complexity.
Allow supporting files for which the image stride is smaller than
the maximum block size + number of subpel mc taps, e.g. a 64x64 VP9
file or a 16x16 VP8 file with -fflags +emu_edge.
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
This was unnoticed on linux, since stdlib.h apparently includes
files declaring the pthread_mutex_t and pthread_cond_t types.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Testing gives 25-30% gain on HD clips with two threads and
up to 50% gain with eight threads.
Sliced threading uses more memory than single or frame threading.
Frame threading and single threading keep the previous memory
layout.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
This prevents gcc from assuming that contents of it may have changed
between calls to vp56_range_get_prob(), thus preventing countless (and
unnecessary) movs. Decoding of sintel trailer goes from (avg+SG) 9.796
+/- 0.003 to 9.635 +/- 0.010.
This properly synchronizes frame size changes between threads if
subsequent threads abort decoding before frame size is initialized, i.e.
it prevents the thread after that from ping-ponging back to the original
value.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Also break some long lines, remove codec function placeholder comments
and add spaces in sample/pixel format lists.
Signed-off-by: Martin Storsjö <martin@martin.st>
lf_delta.ref[i] and lf_delta.mode[i] were incorrectly reset to 0 if
specific delta value was not updated. Fixed to keep the previous value
if flag indicates that element in question is not updated.
Signed-off-by: Janne Salonen <jsalonen@google.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This change avoids accessing the segment map of the previous frame if
segmentation is not enabled for the current frame. The caller of
decode_mb_mode() only calls ff_thread_await_progress() on the reference
segmentation index array if segmentation is enabled, so Chromium's TSAN
will report a race when accessing this data while segmentation is not
enabled.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Also slightly move around code not allocate a new frame if we won't
decode it. This prevents us from putting undecoded frames in frame
pointers, which (in mt decoding) other threads will use and wait on
as references, causing a deadlock (if we skipped decoding) or a crash
(if we didn't initialized next_framep[] at all).
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
A new field, AVCodecContext.internal is used to hold a new struct
AVCodecInternal, which has private fields that are not codec-specific and are
used by general libavcodec functions.
Moved internal_buffer, internal_buffer_count, and is_copy.
Associate segmentation_map[] with reference frame, rather than
decoding instance. This fixes cases where the map would be free()'ed
on e.g. a size change in one thread, whereas the other thread was
still accessing it. Also, it fixes cases where threads overwrite data
that is still being referenced by the previous thread, who thinks that
it's part of the frame previously decoded by the next thread.
In addition to avoiding undefined behaviour, an unsigned type
makes more sense for packing multiple 8-bit values.
Signed-off-by: Mans Rullgard <mans@mansr.com>