Up until now, when an element was skipped, it was relied upon
ffio_limit to make sure that there is enough data available to skip.
ffio_limit itself relies upon the availability of the file's size. As
this needn't be available, the check has been refined: First one byte
less than intended is skipped, then another byte is read, followed by a
check of the error flags.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This commit fixes a number of bugs:
1. There was no check that no read error/EOF occured during
ebml_read_uint, ebml_read_sint and ebml_read_float.
2. ebml_read_ascii and ebml_read_binary did sometimes not forward
error codes; instead they simply returned AVERROR(EIO).
3. In particular, AVERROR_EOF hasn't been used and no dedicated error
message for it existed. This has been changed.
In order to reduce code duplication, the new error code NEEDS_CHECKING
has been introduced which makes ebml_parse check the AVIOContext's
status for errors.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
ebml_read_num had a number of flaws:
1. The check for read errors/EOF was totally wrong. E.g. an EBML number
beginning with the invalid 0x00 would be considered a read error,
although it is just invalid data.
2. The check for read errors/EOF was done just once, after reading the
first byte of the EBML number. But errors/EOF can happen inbetween, of
course, and this wasn't checked.
3. There was no way to distinguish when EOF should be an error (because
the data has to be there) for which an error message should be emitted
and when it is not necessarily an error (namely during parsing of EBML
IDs). Such a possibility has been added and used.
All this was fixed; furthermore, the error messages for invalid EBML
numbers were improved and useless initializations were removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Up until now, webm_dash_manifest_cues used the return values of
ebml_read_num and ebml_read_length without checking for errors,
i.e. return values < 0. This has been changed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
It is only necessary to zero the initial allocated memory used to store
the size of laced frames if the block used Xiph lacing. Otherwise no
unintialized data was ever used, so use av_malloc instead of av_mallocz.
Also use the correct type for the allocations.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Up until now, the SimpleBlock was treated specially: It basically had
its own EBML category and it was also included in the BlockGroup EBML
syntax (although a SimpleBlock must not exist in a BlockGroup according
to the Matroska specifications). The latter fact also meant that
a MatroskaBlock's buffer was always unreferenced twice.
This has been changed: The type of a SimpleBlock is now an EBML_BIN.
The only way in which SimpleBlocks are still different is that they
share their associated structure with another unit (namely BlockGroup).
This is also used to unref the block: It is always unreferenced via the
BlockGroup syntax.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Before this commit, the Matroska muxer would read a block when required
to do so, parse the block, create and return the necessary AVPackets and
yet keep the blocks (in a dynamically allocated list), although they
aren't used at all any more. This has been changed. There is no list any
more and the block is immediately discarded after parsing.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Every new element of an EbmlList is zeroed initially in
ebml_parse_elem, so that in particular a SimpleBlock's duration is
initialized to zero. Therefore it is unnecessary to initialize this
field again (for SimpleBlocks) in matroska_parse_cluster_incremental.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
By default, the data_offset member of the AVFormatInternal of the
AVFormatContext associated with the MatroskaDemuxContext has not been
initialized explicitly by any Matroska-specific function, so that it was
initialized by default to the offset at the end of matroska_read_header,
i.e. usually to the offset of the length field of the first encountered
cluster. This meant that in case that the Matroska-specific seek-code
fails because there are no index entries for the target track a seek to
data_offset would be performed and ordinary parsing would start from
there which is nonsense: The length field would be treated as EBML ID and
(if the length field is not longer than four bytes (EBML numbers that
long are rejected as invalid EBML IDs)) whatever comes next would be
treated as its EBML size although it simply isn't.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The earlier code relied on the length of clusters always being coded on
eight bytes as was the behaviour of libavformat's Matroska muxer until
recently. But given that our own Matroska muxer now (and mkvmerge from
time immemorial) creates files that don't conform to this assumption,
it is high time to get rid of this assumption.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
When the new incremental parser was introduced, the old parser was
kept, because the new parser was unable to handle the way SSA packets
are put into Matroska. But since 2014 (since c7d8dbad) this is no
longer needed, so that the old parser can be completely removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This commit replaces copying attached pictures by using references to
the already existing buffers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Matroska EBML IDs can be only four bytes long maximally, so it is
natural to use uint32_t for them. By doing this and rearranging the
elements of the MatroskaLevel1Element structure, one can reduce the size
of said structure.
Notice that this field is not read via the generic reading process for
EBML_UINT, so one is not forced to use an uint64_t for it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This error message is outdated since d31fb1a9.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The earlier code had three flaws:
1. The case of an unknown-sized element inside a finite-sized element
(which is against the specifications) was not caught.
2. The error message wasn't helpful: It compared the length of the child
with the offset of the end of the parent and claimed that the first
exceeds the latter, although that is not necessarily true.
3. Unknown-sized elements that are not parsed can't be skipped. Given
that according to the Matroska specifications only the segment and the
clusters can be of unknown-size, this is handled by not allowing any
other units to have infinite size whereas the earlier code would seek
back by 1 byte upon encountering an infinite-size element that ought
to be skipped.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Unifying the way the EBML unknown length is signaled, rather than using two
incompatible values. UINT64_MAX cannot be read as a valid EBML length with the
current code.
Co-authored-by: Steve Lhomme <robux4@ycbcr.xyz>
Co-authored-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This was found through the Hacker One program on VLC but is not a security issue in libavformat
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Instead add the character to the snprintf above as suggested by Mark.
Silences a warning:
libavformat/matroskadec.c: In function 'webm_dash_manifest_cues':
libavformat/matroskadec.c:3947:13: warning: 'strncat' specified bound 1 equals source length [-Wstringop-overflow=]
strncat(buf, ",", 1);
^~~~~~~~~~~~~~~~~~~~
This will get ISOBMFF and Matroska up to date with the revised AV1 Codec
Configuration Box spec.
For now keep propagating raw OBUs as extradata until all libavcodec modules
are adapted to handle AV1CodecConfigurationRecord formatted extradata.
Tested-by: Thomas Daede <bztdlinux@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Newly allocated data buffers (wavpack, prores, compressed buffers)
are padded to meet the requirements of AVPacket.
About 10x speed up in matroska_parse_frame().
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
Simplifies code in matroska_parse_frame(). This is in preparation for
the following patch.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
Data in EbmlBin objects is never changed after being read from the
input file (save for two specific cases with encoded CodePrivate), so
using AVBufferRef we can prevent unnecessary copy of data by instead
creating new references to said constant data.
Signed-off-by: James Almer <jamrial@gmail.com>
Defined in a recent revision of https://www.webmproject.org/docs/container/
This prevents storing the contents of CodecPrivate into extradata for
a codec that doesn't need nor expect any. It will among other things
prevent matroska specific binary data from being dumped onto other
formats during remuxing.
Signed-off-by: James Almer <jamrial@gmail.com>
There's at least one known file with a TrueHD stream that hasn't
been correctly muxed, and requires full frame parsing and repack.
Signed-off-by: James Almer <jamrial@gmail.com>
track->video.projection.type is set to 0 (a Matroska specific "No spherical
metadata present" value, with no related AVSphericalMapping) by default on
files without the element.
This removes bogus warnings on every single matroska file without Spherical
metadata.
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes Coverity CID: 1405453
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
The WebM DASH spec states:
The Initialization Segment shall not contain Clusters or Cues.
The Segment Index corresponds to the Cues.
Previously, it included the cues if they were at the front.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Output was apparently not tested for correctness. Passing overlapping
memory to snprintf causes undefined behavior, and usually resulted in
only the very last timestamp being written to metadata, and not a list
at all.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Add an option to webm_dash_manifest demuxer to specify a value for
"bandwidth" field in the DASH manifest. The value is then used by
the muxer. Fixes an existing FIXME in the code.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: James Zern <jzern@google.com>
These values are defined to be 32bit in the specification,
so it makes more sense to store them as fixed width.
Based on a patch by Micahel Niedermayer <michael@niedermayer.cc>.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
These values are defined to be 32bit in the specification,
so it makes more sense to store them as fixed width.
Based on a patch by Micahel Niedermayer <michael@niedermayer.cc>.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>