When turned on, H264/CAVLC gets ~15% (CVPCMNL1_SVA_C.264) slower for
ultra-high-bitrate files, or ~2.5% (CVFI1_SVA_C.264) for lower-bitrate
files. Other codecs are affected to a lesser extent because they are
less optimized; e.g., VC-1 slows down by less than 1% (all on x86).
The patch generated 3 extra instructions (cmp, cmovae and mov) per
call to get_bits().
The performance penalty on ARM is within the error margin for most
files, up to 4% in extreme cases such as CVPCMNL1_SVA_C.264.
Based on work (for GCI) by Aneesh Dogra <lionaneesh@gmail.com>, and
inspired by patch in Chromium by Chris Evans <cevans@chromium.org>.
The A32 bitstream reader variant is only used on ARMv5 and for
Prores due to the larger bit cache this decoder requires.
In benchmarks on ARMv5 (Marvell Sheeva) with gcc 4.6, the only
statistically significant difference between ALT and A32 is
a 4% advantage for ALT in FLAC decoding. There is thus no (longer)
any reason to keep the A32 reader from this point of view.
This patch adds an option to the ALT reader increasing the bit
cache to 32 bits as required by the Prores decoder. Benchmarking
shows no significant change in speed on Intel i7. Again, the
A32 reader fails to justify its existence.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Some of the macros in get_bits.h include a final semicolon,
some do not. This removes these or adds do {} while(0) around
the macros as appropriate and adds semicolons where needed in
calling code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Using the libmpeg2 reader causes errors in a multitude of places,
including MPEG and H264 codecs. As the advantage of this reader
is questionable, removing it seems the sensible course of action,
especially considering the simplifications this allows elsewhere
with the bit cache size increasing from 17 to 25 bits as minimum.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Other parts of FFmpeg use NE (native endian) rather than ME (machine).
This makes it consistent.
Originally committed as revision 24169 to svn://svn.ffmpeg.org/ffmpeg/trunk
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.
Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
The limit for get/show_bits_long() to use get/show_bits() directly
should be MIN_CACHE_BITS, not 17.
Originally committed as revision 21951 to svn://svn.ffmpeg.org/ffmpeg/trunk
Any tips on how i can convince gcc that it doesnt need a
mov %eax, %eax
in every get_bits() ?
Originally committed as revision 21433 to svn://svn.ffmpeg.org/ffmpeg/trunk