Now memcpy can be avoided for NAL units containing escapes, too.
Particularly improves performance for files with hardcoded black bars.
For such a file, time spent in cbs_h2645_split_fragment went down from
369410 decicycles to 327677 decicycles. (It was 379114 decicycles when
every NAL unit was copied.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com>
This is in preparation for a patch for cbs_h2645. Now the packet's
rbsp_buffer can be owned by an AVBuffer.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com>
Now memcpy is avoided for NAL units that don't contain 0x03 escape
characters.
Improves performance of cbs_h2645_fragment_add_nals from 36940
decicycles to 6364 decicycles based on 8 runs with a 5.1 Mb/s H.264
sample (262144 runs each).
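
The idea can be illustrated with a minimal, self-contained sketch. It is not
the actual cbs_h2645 code: the names nal_has_escapes, nal_unescape and
nal_get_rbsp are hypothetical, and the sketch assumes the caller provides a
scratch buffer at least as large as the input. When no 0x00 0x00 0x03
sequence is found, the input pointer is returned as-is and nothing is copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the payload contains an emulation prevention sequence
 * (0x00 0x00 0x03), 0 otherwise. */
static int nal_has_escapes(const uint8_t *buf, size_t size)
{
    assert(buf != NULL || size == 0);
    for (size_t i = 0; i + 2 < size; i++)
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 3)
            return 1;
    return 0;
}

/* Copy src to dst while dropping every 0x03 that follows two zero bytes.
 * dst must hold at least size bytes. Returns the number of bytes written. */
static size_t nal_unescape(uint8_t *dst, const uint8_t *src, size_t size)
{
    size_t out = 0, zeros = 0;
    for (size_t i = 0; i < size; i++) {
        if (zeros >= 2 && src[i] == 3) {
            zeros = 0;          /* skip the emulation prevention byte */
            continue;
        }
        zeros = src[i] == 0 ? zeros + 1 : 0;
        dst[out++] = src[i];
    }
    return out;
}

/* Hypothetical helper: return a pointer to the unescaped data.
 * Without escapes the input itself is returned (no copy);
 * otherwise the unescaped bytes are written to scratch. */
static const uint8_t *nal_get_rbsp(const uint8_t *src, size_t size,
                                   uint8_t *scratch, size_t *rbsp_size)
{
    if (!nal_has_escapes(src, size)) {
        *rbsp_size = size;
        return src;
    }
    *rbsp_size = nal_unescape(scratch, src, size);
    return scratch;
}
```

In this sketch the common case costs one scan of the payload; the copy is
only paid for NAL units that really contain escapes.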
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Supports both prefix and suffix SEI, decoding all of the common SEI
types and some more obscure ones. Most of this is tested by the
existing tests in fate.
Instead of using a combination of bitreader and -writer for copying data,
one can byte-align the (obsolete and removed) bitreader to improve performance.
With the right alignment one can even use memcpy. The right alignment
normally exists for CABAC and hence for H.265 in general.
For aligned data this reduced the time to copy the slicedata from
776520 decicycles to 33889 with 262144 runs and a 6.5 Mb/s H.264 video.
For unaligned data the number went down from 279196 to 97739 decicycles.
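
A simplified sketch of the aligned fast path follows. It is an illustration
under stated assumptions, not the code from this patch: get_bit, put_bit and
copy_bits are hypothetical names, bits are numbered MSB first, and the
unaligned case is handled one bit at a time rather than by first aligning
the reader as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one bit (MSB first) at bit position pos. */
static int get_bit(const uint8_t *buf, size_t pos)
{
    return (buf[pos >> 3] >> (7 - (pos & 7))) & 1;
}

/* Write one bit (MSB first) at bit position pos. */
static void put_bit(uint8_t *buf, size_t pos, int bit)
{
    uint8_t mask = (uint8_t)(0x80 >> (pos & 7));
    if (bit)
        buf[pos >> 3] |= mask;
    else
        buf[pos >> 3] &= (uint8_t)~mask;
}

/* Copy nbits bits from src (starting at bit src_pos) to dst (starting at
 * bit dst_pos). Returns 1 if the memcpy fast path was used, 0 otherwise. */
static int copy_bits(uint8_t *dst, size_t dst_pos,
                     const uint8_t *src, size_t src_pos, size_t nbits)
{
    size_t i = 0;
    int fast = 0;

    assert(dst != NULL && src != NULL);

    /* Both positions byte-aligned: whole bytes can be copied at once. */
    if (((dst_pos | src_pos) & 7) == 0 && nbits >= 8) {
        size_t nbytes = nbits >> 3;
        memcpy(dst + (dst_pos >> 3), src + (src_pos >> 3), nbytes);
        i = nbytes << 3;
        fast = 1;
    }

    /* Remaining (or all, if unaligned) bits are copied one by one. */
    for (; i < nbits; i++)
        put_bit(dst, dst_pos + i, get_bit(src, src_pos + i));

    return fast;
}
```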
Signed-off-by: Mark Thompson <sw@jkqxz.net>
64c50c0e97 declared support for decomposing
them but omitted to implement it; this adds an implementation.
Also do the same for end-of-stream NAL units, since they are equivalent.
Similar to H264, cbs_h265_{read, write}_nal_unit() can handle HEVC
prefix SEI NAL units. Currently only the mastering display colour volume
SEI message is added; more SEI messages may be added later if needed.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Removes unnecessary data copies, and partially fixes potential issues
with dangling references held in said lists.
Reviewed-by: Mark Thompson <sw@jkqxz.net>
Signed-off-by: James Almer <jamrial@gmail.com>
This saves one malloc + memcpy per packet.
The CodedBitstreamFragment buffer is padded to follow the requirements
of AVPacket.
Reviewed-by: jkqxz
Signed-off-by: James Almer <jamrial@gmail.com>
This makes it easier for users of the CBS API to get alloc/free right -
all subelements use the buffer API so that it's clear how to free them.
It also allows eliding some redundant copies: the packet -> fragment copy
disappears after this change if the input packet is refcounted, and more
codec-specific cases are now possible (but not included in this patch).
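
The packet -> fragment case can be sketched with a toy reference-counted
buffer. Everything here (ToyBuffer, ToyFragment and the toy_* functions) is
hypothetical and only stands in for the real buffer API; it is meant to show
the ownership pattern, not the CBS implementation. A refcounted packet is
shared by taking a new reference, while a non-refcounted one is copied into
a freshly allocated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy refcounted buffer; a stand-in for a real buffer API. */
typedef struct ToyBuffer {
    uint8_t *data;
    size_t   size;
    int      refcount;
} ToyBuffer;

static ToyBuffer *toy_buffer_alloc(size_t size)
{
    ToyBuffer *b = malloc(sizeof(*b));
    if (!b)
        return NULL;
    b->data = calloc(1, size ? size : 1);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->size     = size;
    b->refcount = 1;
    return b;
}

static ToyBuffer *toy_buffer_ref(ToyBuffer *b)
{
    assert(b != NULL && b->refcount > 0);
    b->refcount++;
    return b;
}

/* Drop one reference and clear the caller's pointer;
 * the data is freed when the last reference goes away. */
static void toy_buffer_unref(ToyBuffer **pb)
{
    ToyBuffer *b = *pb;
    *pb = NULL;
    if (b && --b->refcount == 0) {
        free(b->data);
        free(b);
    }
}

typedef struct ToyFragment {
    ToyBuffer     *ref;   /* owns (a reference to) the data */
    const uint8_t *data;
    size_t         size;
} ToyFragment;

/* Share the packet data if it is refcounted, copy it otherwise.
 * Returns 0 on success, -1 on allocation failure. */
static int toy_fragment_from_packet(ToyFragment *frag, ToyBuffer *pkt_ref,
                                    const uint8_t *pkt_data, size_t pkt_size)
{
    if (pkt_ref) {
        frag->ref = toy_buffer_ref(pkt_ref);    /* no copy */
    } else {
        frag->ref = toy_buffer_alloc(pkt_size);
        if (!frag->ref)
            return -1;
        if (pkt_size)
            memcpy(frag->ref->data, pkt_data, pkt_size);
        pkt_data = frag->ref->data;
    }
    frag->data = pkt_data;
    frag->size = pkt_size;
    return 0;
}
```

Either way the fragment is released with a single unref, which is the
property that makes alloc/free easy to get right.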