Change BitBuf into uint64_t on 64-bit x86. This means we need to flush the
buffer less often, which is a significant speed win. All other platforms,
including all 32-bit ones, are unchanged. Output bitstream is the same.
All API constraints are kept in place, e.g., you still cannot put_bits()
more than 31 bits at a time. This is so that codecs cannot accidentally
become 64-bit-only or similar.
Benchmarking on transcoding to various formats shows consistently
positive results:
dnxhd 25.60 fps -> 26.26 fps ( +2.6%)
dvvideo 24.88 fps -> 25.17 fps ( +1.2%)
ffv1 14.32 fps -> 14.58 fps ( +1.8%)
huffyuv 58.75 fps -> 63.27 fps ( +7.7%)
jpegls 6.22 fps -> 6.34 fps ( +1.8%)
magicyuv 57.10 fps -> 63.29 fps (+10.8%)
mjpeg 48.65 fps -> 49.01 fps ( +0.7%)
mpeg1video 76.41 fps -> 77.01 fps ( +0.8%)
mpeg2video 75.99 fps -> 77.43 fps ( +1.9%)
mpeg4 80.66 fps -> 81.37 fps ( +0.9%)
prores 12.35 fps -> 12.88 fps ( +4.3%)
prores_ks 16.20 fps -> 16.80 fps ( +3.7%)
rv20 62.80 fps -> 62.99 fps ( +0.3%)
utvideo 68.41 fps -> 76.32 fps (+11.6%)
Note that this includes video decoding and all other encoding work,
such as DCTs. If you isolate the actual bit-writing routines, it is
likely to be much more.
Benchmark details: Transcoding the first 30 seconds of Big Buck Bunny
in 1080p, Haswell 2.1 GHz, GCC 8.3, generally quantizer locked to
5.0. (Exceptions: DNxHD needs fixed bitrate, and JPEG-LS is so slow
that I only took the first 10 seconds, not 30.) All runs were done
ten times and single-threaded, top and bottom two results discarded to
get rid of outliers, arithmetic mean between the remaining six.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This should never be untrue, if it is, thats a bug
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Libav, for some reason, merged this as a public API function. This will
aid in future merges.
A define is left for backwards compat, just in case some person
used it, since it is in a public header.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
This parameter can be used to inform the allocation code about how much
downsizing might occur, and can be used to optimize how to allocate the
packet
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The rationale is that coded_frame was only used to communicate key_frame,
pict_type and quality to the caller, as well as a few other random fields,
in a non predictable, let alone consistent way.
There was agreement that there was no use case for coded_frame, as it is
a full-sized AVFrame container used for just 2-3 int-sized properties,
which shouldn't even belong into the AVCodecContext in the first place.
The appropriate AVPacket flag can be used instead of key_frame, while
quality is exported with the new AVPacketSideData quality factor.
There is no replacement for the other fields as they were unreliable,
mishandled or just not used at all.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Allocating coded_frame is what most encoders do anyway, so it makes
sense to always allocate and free it in a single place. Moreover a lot
of encoders freed the frame with av_freep() instead of the correct API
av_frame_free().
This bring uniformity to encoder behaviour and prevents applications
from erroneusly accessing this field when not allocated. Additionally
this helps isolating encoders that export information with coded_frame,
and heavily simplifies its deprecation.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>