Improves readability and avoids a redundant index variable
that was mistakenly called "tracksize".
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The current checks just check whether the boxes fit into the remaining
size of the packet instead of whether they actually fit into the box
size. This has been changed; part of this change is to pass the size of
the box (minus the box header) as parameter instead of a pointer to
the AVPacket by which the box parsing function is supposed to
recalculate whether enough data is available.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The base size of a box refers to the size the box has in a file,
not in memory; so size_t is not their natural type. Therefore use
a plain unsigned which is smaller on 64bit systems and still big
enough to represent any conceivable base size.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There are three types of style entries which are redundant:
a) Entries with length zero. They are already discarded.
b) Entries that are equivalent to the default style:
They can be safely discarded.
c) Entries that are equivalent to the immediately preceding style
if the start of the current style coincides with the end of the
preceding style. In this case the styles can be merged.
This commit implements discarding/merging in cases b) and c).
This fixes ticket #9548. In said ticket each packet contained
exactly one style entry that covered the complete packet with
the exception of the last character (probably created by a tool
that didn't know that the style's end is exclusive). Said style
coincided with the default style, leading to a superfluous reset,
which is now gone.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Both TextSampleEntry and TextSample can contain StyleRecords;
yet both the code as well as the structures for them were duplicated.
This commit changes this.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Giving elements of a structure called StyleBox names like
"style_start" or "style_end" is redundant, especially given
that the relevant variables are also called style.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_ass_subtitle_header_full() just uses av_asprintf() and is therefore
thread-safe itself.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Given that the AVCodec.next pointer has now been removed, most of the
AVCodecs are not modified at all any more and can therefore be made
const (as this patch does); the only exceptions are the very few codecs
for external libraries that have a init_static_data callback.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
style_active doesn't do anything any more: It is already assured
that style_active is one when one reaches the end of a style.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The checks for whether a style should be opened/closed at the current
character position are as follows: A variable entry contained the index
of the currently active or potentially next active style. If the current
character position coincided with the start of style[entry], the style
was activated; this was followed by a check whether the current
character position coincided with the end of style[entry]; if so, the
style was deactivated and entry incremented. Afterwards the char was
processed.
The order of the checks leads to problems in case the endChar of style A
coincides with the startChar of the next style (say B): Style B was never
opened. When we are at said common position, the currently active style
is A and so the start pos check does not succeed; but the end pos check
does and it closes the currently active style A and increments entry.
At the next iteration of the loop, the current character position is
bigger than the start position of style B (which is style[entry]) and
therefore the style is not activated.
The solution is of course to first check for whether a style needs to be
closed (and increment entry if it does) before checking whether the next
style needs to be opened.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
They would either lead to unnecessary ASS tags being emitted (namely
tags that are reset immediately thereafter) or would lead to problems
when parsing: e.g. if a zero-length style immediately follows another
style, the current code will end the preceding style and set the
zero-length style as the next potentially active style, but it is only
tested for activation when the next character is parsed at which point
the current offset is already greater than both the starting as well
as the end offset of the empty style. It will therefore neither be
opened nor closed and all subsequent styles will be ignored.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Up until now, the 3GPP Timed Text decoder used av_dynarray_add()
for a list of style entries. Said entries are individually allocated
and owned by the pointers in the dynamic array and are therefore
unsuitable for av_dynarray_add() which simply frees the array,
but not the entries on error. In this case the intended new entry
also leaks because it has been forgotten to free it.
This commit fixes this. It is now allocated in one go and not
reallocated multiple times (and it won't be overallocated any more).
After all, the final number of elements (pending errors) is already
known in advance.
Furthermore, the style entries are now the entries of the new array,
i.e. they are no longer allocated separately. This also removes one
level of indirection.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
There is no need to walk through the list of fonts twice.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Every font entry occupies at least three bytes, so checking early
whether there is that much data available is a low-effort way to exclude
invalid extradata. Doing so leads to an overall simplification.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Up until now, the 3GPP Timed Text decoder used av_dynarray_add()
for a list of font entries, a structure which contains an allocated
string. The font entries are owned by the pointers in the dynamic array
and are therefore unsuitable for av_dynarray_add() which simply frees
the array, but not the font entries and of course not the strings. The
latter all leak if reallocating the dynamic array fails.
This commit fixes this. It stops reallocating the array altogether:
After all, the final number of elements (pending errors) is already
known in advance.
Furthermore, the font entries are now the entries of the new array,
i.e. the font entries are no longer allocated separately. This also
removes one level of indirection.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
If allocating fonts fails when reading the header, all fonts are freed,
yet the counter of fonts is not reset and no error is returned; when
subtitles are decoded lateron, the inexistent list of fonts is searched
for the matching font for this particular entry which of course leads to
a segfault.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Fixes: left shift of 243 by 24 places cannot be represented in type 'int'
Fixes: 22716/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MOVTEXT_fuzzer-5704263425851392
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Font sizes are relative to the subtitle frame dimensions. If the
expected frame dimensions are not known, the font sizes will most
likely be incorrect.
Signed-off-by: Philip Langdale <philipl@overt.org>
Style flags were only being turned on. If the default was on and style
record turned off, style flag remained on.
Signed-off-by: Philip Langdale <philipl@overt.org>
It's not necessary to walk the style record list twice per subtitle
character. style records are in order and do not overlap.
Signed-off-by: Philip Langdale <philipl@overt.org>
Limits based on 3GPP TS 26.245 V14.0.0
Fixes: Timeout
Fixes: 6377/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MOVTEXT_fuzzer-5175929115508736
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Subtitles which contained styled UTF-8 subtitles (i.e. not just 7 bit
ASCII characters) were not handled correctly. The spec mandates that
styling start/end ranges are in "characters". It's not quite clear what
a "character" is supposed to be, but maybe they mean unicode codepoints.
FFmpeg's decoder treated the style ranges as byte idexes, which could
lead to UTF-8 sequences being broken, and the common code dropping the
whole subtitle line.
Change this and count the codepoint instead. This also means that even
if this is somehow wrong, the decoder won't break UTF-8 sequences
anymore. The sample which led me to investigate this now appears to work
correctly.