Fixes the following tsan warning when running fate-vsynth_lena-ffvhuff:
WARNING: ThreadSanitizer: data race (pid=6484)
Write of size 8 at 0x7d64000154b8 by main thread (mutexes: write M1331):
#0 update_context_from_user src/libavcodec/pthread_frame.c:331 (ffmpeg+0x000000dca887)
[..]
Previous read of size 8 at 0x7d64000154b8 by thread T2 (mutexes: write M1334):
#0 draw_slice src/libavcodec/huffyuvdec.c:857 (ffmpeg+0x000000bcc86f)
When compiled with --disable-pthreads, e.g
http://fate.ffmpeg.org/report.cgi?time=20150917015044&slot=alpha-debian-qemu-gcc-4.7,
a bunch of -Wunused-functions are reported due to missing header guards
around threading related functions.
This patch should silence such warnings.
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Fixes out of array read
Fixes: signal_sigsegv_3287332_2301_cov_2994954934_huffyuv.avi
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Bug-Id: CVE-2013-0868
inspired by a patch from Michael Niedermayer <michaelni@gmx.at>
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Diego Biurrun <diego@biurrun.de>
CC: libav-stable@libav.org
The reader reads in chunks of 11 bits at most, and at most 3 times. The unsafe
reader therefore may read 6 chunks instead of 1 in worst case, ie 8 bytes,
which is within the padding tolerance.
The reader ends up being ~10% faster. Cumulative effect of unsafe reading and
code block swapping on 3 sequences is for 1 thread, decoding time goes from
23.3s to 19.0s.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The old code was reserving the 0xFFFF entry to represent an inexisting
entry/codeword. These entries are now detected through their length
being <= 0. As this entry is often used for the residuals (-1,-1), which
should be among the most frequent, it is particularly important to not
reserve it.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The effect is not really deterministic, as it seems to be a combination
on x86_64 of fewer registers used, different jump offsets and, for all
archs, of likely branches.
Speedup is around 15%.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
An invalid entry already has the property of having a negative number
of bits, so remove the check on the reserved value, and rearrange the
code as a consequence.
346800 decicycles in 422, 262079 runs, 65 skips
168197 decicycles in gray, 262077 runs, 67 skips
Overall time: 7.878s
319076 decicycles in 422, 262096 runs, 48 skips
159875 decicycles in gray, 262057 runs, 87 skips
Overall time: 7.394s
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
It's no longer used inside another specific macro, so rename it.
Also remove duplicated definition and realign code.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
When the joint table does not contain a valid entry, the decoding restarts
from scratch. By implementing the trick of jumping to the 2nd level of the
individual table (and inlining the whole), a speed improvement of 5-10%
is possible.
On a 1000-frames YUV4:2:0 video, before:
362851 decicycles in 422, 262094 runs, 50 skips
182488 decicycles in gray, 262087 runs, 57 skips
Object size: 23584
Overall time: 8.377
After:
346800 decicycles in 422, 262079 runs, 65 skips
168197 decicycles in gray, 262077 runs, 67 skips
Object size: 23188
Overall time: 7.878
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>