In vp9_decode_frame function, ff_thread_finish_setup is not called
when refreshctx is equal to 0, and the next decoding thread can not
start work until the cunrrent frame has been decoded completely. So
ff_thread_finish_setup needs to be called to enable Multi-thread
decoding in this condition.
Signed-off-by: Di Wu <di1028.wu@samsung.com>
Signed-off-by: Reynaldo H. Verdejo Pinochet <reynaldo@osg.samsung.com>
This is needed for future AVX2 implementations
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The advantage of this is that the is32x32 division branch in
decode_coeffs_b is removed from the inner loop to outside the block
coef decoding loop in decode_coeffs. Also, it allows us to merge the
txsz branches from the block coef decoding loop, the context merge
and the context split.
The directional intra predictors either don't care about order (dc, h,
dc_left, tm), or they prefer inverted order (vr, dr, hd). This allows
more efficient SIMD implementations.
The spec doesn't describe how it should be decoded so this is probably
the safest thing to do. Fixes valgrind errors on fuzzed11.ivf and fixes
valgrind errors on fuzzed10.ivf differently.
Fixes invalid reads and crashes in vp90-2-05-resize.webm and fuzzed6.ivf.
The output is still not identical to what libvpx does (because we don't
actually scale in MC).
Reviewed-by: ubitux
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Prevents some invalid memory accesses after resolution change in
vp90-2-05-resize.webm, and libvpx does this too.
Reviewed-by: ubitux
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
- The memcpy was completely wrong because
s->prob_ctx[s->framectxid].coef is a [4][2][2][6][6][3] array, whereas
s->prob.coef is a [4][2][2][6][6][11] array.
- The additional check was committed to ffmpeg by Ronald S. Bultje.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The reference map is never used in such cases, but we accidently copied
it anyway. This could cause crashes if this map has not yet been
allocated. Fixes trac ticket 3188.
Reviewed-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
It was previously int and would return error if decode_coeffs_b()
returns an error; however, that can never happen, so refactor all
that code to make all dependent functions return void also (all the
way up to decode_coeffs_sb()).
For a random 1080p sample, decoding time went from 9.7sec (1 threads)
to 6.0sec (2 threads) and 5.2sec (4 threads) in 2-pass decoding mode.
I don't have any samples that use the parallelmode feature, but the
gains should be higher.
We need more information from last/cur_frame than from reference
buffers, so we can use a simplified structure for reference buffers,
and then store mvs and segmentation map information in last/cur.
w and h are both read as uint16 + 1 so this can not happen. A similar
change was introduced in 97962b2 / 72ca830, with the
av_log()+AVERROR_INVALIDDATA form, suggesting it could be triggerable
somehow.
Change suggested by Ronald S. Bultje.
Those should not be necessary.
Original change by one of these developers:
Anton Khirnov <anton@khirnov.net>
Diego Biurrun <diego@biurrun.de>
Luca Barbato <lu_zero@gentoo.org>
Martin Storsjö <martin@martin.st>
See 97962b2 / 72ca830
vp8_rac_get_tree() is called with a tree of size 3, so the returned
value can not be outside [0;3]. All of the [0;3] cases are handled in
the switch, so the assert should not be triggerable by any means. A
similar change was introduced in 97962b2 / 72ca830, with the
av_log()+AVERROR_INVALIDDATA form, suggesting it could be triggerable
somehow. This assert might help static analyzer, or simply the reader.
The operands of an addition can be evaluated in any order, since
the addition isn't a sequence point. The only operators that
have a defined evaluation order are &&, ||, ?: and the sequence
operator ','.
This fixes fate-vp9 on ARM RVCT.