Fixes: timeout in 730/clusterfuzz-testcase-5265113739165696 (part 1 of 2)
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Reviewed-by: BBB
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes timeout with 700/clusterfuzz-testcase-5660909504561152
Fixes timeout with 702/clusterfuzz-testcase-4553541576294400
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This way, the special IDCT permutations are no longer needed. This
is similar to how H264 does it, and removes the dsputil dependency
imposed by the scantable code.
Also remove the unused type == 0 cases from the plain C version
of the idct.
Signed-off-by: Martin Storsjö <martin@martin.st>
This way, the special IDCT permutations are no longer needed. Bfin code
is disabled until someone updates it. This is similar to how H264 does
it, and removes the dsputil dependency imposed by the scantable code.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Move some functions from dsputil. The idea is that videodsp contains
functions that are useful for a large and varied set of video decoders.
Currently, it contains emulated_edge_mc() and prefetch().
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The YUV channels of VP6 are encoded in a highly linear fashion which does
not have any slice-like concept to thread. The alpha channel of VP6A is
fairly independent of the YUV and comprises 40% of the work. This patch
uses the THREAD_SLICE capability to split the YUV and A decodes into
separate threads.
Two bugs are fixed by splitting YUV and alpha state:
- qscale_table from VP6A decode was for alpha channel instead of YUV
- alpha channel filtering settings were overwritten by YUV header parse
Signed-off-by: Ben Jackson <ben@ben.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Makes golden_frame more like other frame data and paves the way for threading
the alpha channel decode.
Signed-off-by: Ben Jackson <ben@ben.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Instead, use it on the first member, since by definition, if
any member is aligned, the whole struct must be, in order to
maintain that alignment.
Fixes compilation with some finicky compilers, like a mix of libclang/msvc
Idea for fix from Måns Rullgård.
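A minimal sketch of the idea, assuming C11 alignas as a stand-in for the
project's own alignment macro. ExampleContext and its fields are hypothetical
names for illustration, not the real decoder context.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context struct. The alignment request is placed on the
 * first member instead of on the struct type itself. */
typedef struct ExampleContext {
    alignas(16) int16_t block_coeff[6][64]; /* first member carries the alignment */
    int                 other_state;
} ExampleContext;

/* A struct is at least as strictly aligned as its most aligned member,
 * and the first member always sits at offset 0, so the whole struct
 * ends up 16-byte aligned. */
static_assert(alignof(ExampleContext) >= 16, "struct inherits member alignment");
static_assert(offsetof(ExampleContext, block_coeff) == 0, "first member at offset 0");
```

Both checks are evaluated at compile time, so a compiler that ignored the
member alignment would fail to build this file.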
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Instead, use it on the first member, since by definition, if
any member is aligned, the whole struct must be, in order to
maintain that alignment.
Fixes compilation with some finicky compilers.
Idea for fix from Måns Rullgård.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
This moves all VP3-specific function pointers from dsputil to a
new vp3dsp context. There is no reason to ever use the VP3 IDCT
where an MPEG2 IDCT is expected or vice versa.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Grab from the bitstream in 16-bit chunks instead of 8-bit chunks.
TODO: grab in 32-bit chunks on 64-bit systems.
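As an illustration only: the sketch below is a generic MSB-first bit reader,
not the VP56 arithmetic decoder. It shows what refilling in 16-bit chunks can
look like. ChunkReader and the reader_* functions are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state, not the real range coder struct. */
typedef struct ChunkReader {
    const uint8_t *buf, *end;
    uint32_t       cache;  /* unread bits, left-aligned at bit 31 */
    int            count;  /* number of valid bits in cache */
} ChunkReader;

static void reader_init(ChunkReader *r, const uint8_t *data, size_t size)
{
    r->buf   = data;
    r->end   = data + size;
    r->cache = 0;
    r->count = 0;
}

/* Top up the cache with one big-endian 16-bit chunk instead of two
 * separate byte loads. Falls back to a single byte at the tail. */
static void reader_refill(ChunkReader *r)
{
    if (r->count > 16)
        return;
    if (r->end - r->buf >= 2) {
        uint32_t chunk = (uint32_t)r->buf[0] << 8 | r->buf[1];
        r->cache |= chunk << (16 - r->count);
        r->count += 16;
        r->buf   += 2;
    } else if (r->buf < r->end) {
        r->cache |= (uint32_t)*r->buf++ << (24 - r->count);
        r->count += 8;
    }
}

/* Read n bits (1..16), MSB first. Reads past the end return zero bits. */
static unsigned reader_get_bits(ChunkReader *r, int n)
{
    if (r->count < n)
        reader_refill(r);
    unsigned v = r->cache >> (32 - n);
    r->cache <<= n;
    r->count  -= n;
    if (r->count < 0)
        r->count = 0;
    return v;
}
```

With 16-bit chunks the refill branch is taken about half as often as with
8-bit chunks, at the cost of a wider cache word.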
Originally committed as revision 24783 to svn://svn.ffmpeg.org/ffmpeg/trunk
Create a custom table for VP5/6/8's renorm to avoid depending on H.264's.
Saves one instruction in the arithmetic decoder as well.
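A sketch of what such a table can look like, assuming it maps an 8-bit range
value to the left shift that brings its top set bit back to bit 7. The names
norm_shift, init_norm_shift and renormalize are hypothetical, and the table
is built at runtime here rather than written out as a constant array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table: norm_shift[v] is the number of leading zero bits
 * in the 8-bit value v (8 for v == 0). */
static uint8_t norm_shift[256];

static void init_norm_shift(void)
{
    for (int v = 0; v < 256; v++) {
        int shift = 0;
        /* count how far v must move left before bit 7 is set */
        while (shift < 8 && !((v << shift) & 0x80))
            shift++;
        norm_shift[v] = (uint8_t)shift;
    }
}

/* Renormalize an 8-bit range value with a single table lookup. */
static unsigned renormalize(unsigned high)
{
    assert(high < 256);
    return high << norm_shift[high];
}
```

A table sized exactly for 8-bit inputs needs no offset or masking on the
index, which is one plausible source of the saved instruction.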
Originally committed as revision 24701 to svn://svn.ffmpeg.org/ffmpeg/trunk
Always inline the arithmetic coder, except in header-parsing code, where it is
not inlined at all in order to save code size.
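A rough sketch of the split, assuming GCC-style attributes. The EX_* macros
and ex_get_bit functions are hypothetical, and the toy body only pops the top
bit of a word; it is not the arithmetic coder.

```c
#include <stdint.h>

#if defined(__GNUC__)
#  define EX_ALWAYS_INLINE static inline __attribute__((always_inline))
#  define EX_NOINLINE      __attribute__((noinline))
#else
#  define EX_ALWAYS_INLINE static inline
#  define EX_NOINLINE
#endif

/* Hot path: forced inline so each call site in the decode loop gets
 * its own copy of the body. */
EX_ALWAYS_INLINE int ex_get_bit(uint32_t *state)
{
    int bit = (int)(*state >> 31);
    *state <<= 1;
    return bit;
}

/* Cold path: header parsing shares one out-of-line copy, which keeps
 * the many header call sites small. */
EX_NOINLINE static int ex_get_bit_noinline(uint32_t *state)
{
    return ex_get_bit(state);
}
```

Header parsing runs once per frame, so the call overhead there is negligible
compared with the code size saved.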
Originally committed as revision 24677 to svn://svn.ffmpeg.org/ffmpeg/trunk
This is a much more reliable way to get cmov than trying to trick gcc into
generating it, and worthwhile since it is 2% faster overall.
Patch by Eli Friedman <eli.friedman at gmail>
Originally committed as revision 24471 to svn://svn.ffmpeg.org/ffmpeg/trunk
on the huffman tree, instead of traversing the tree in a while loop.
Based on the similar optimization in libvpx's detokenize.c
10% faster at normal bitrates, and 30% faster for high-bitrate intra-only
Originally committed as revision 24468 to svn://svn.ffmpeg.org/ffmpeg/trunk
No difference at the moment, but this allows a future branchy variant
of vp56_rac_get_prob to be significantly faster.
Originally committed as revision 24467 to svn://svn.ffmpeg.org/ffmpeg/trunk
Saves nothing except a bit of memory/cache now, but will allow future
optimizations.
Originally committed as revision 24411 to svn://svn.ffmpeg.org/ffmpeg/trunk
Necessary because of this GCC bug:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44474
To do this, convert some, but not all (!) of the variables in VP56RangeCoder
into local variables.
If we convert c->high into a local variable, gcc gets the stupids and refuses
to use a conditional move for the unpredictable main branch.
TODO: dispense with this bullshit and write an asm version.
Originally committed as revision 23924 to svn://svn.ffmpeg.org/ffmpeg/trunk
This incantation causes gcc 4.3 to generate cmov on x86, a vastly better option
than a completely unpredictable branch.
Hopefully this carries over to newer versions and other CPUs with conditionals.
~5 cycles saved per call on a Core i7.
Originally committed as revision 23921 to svn://svn.ffmpeg.org/ffmpeg/trunk