For non-PCM audio, a Smacker frame contains the size of the decoded
audio in the first four bytes of the audio packet data; for PCM data,
said information would be redundant and according to [1] this field does
not exist. Therefore this commit sets the duration and timestamps
properly for PCM audio.
[1]: https://wiki.multimedia.cx/index.php/Smacker#Audio_Track_Chunk
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Add .read_seek function to the smacker demuxer for the special case of
seeking to ts=0. This is useful because smacker – like bink, with a
similar implementation – was mostly used to encode clips in video
games, where random seeks are rare but looping media are common.
Signed-off-by: Timotej Lazar <timotej.lazar@araneo.si>
Instead use ffio_read_size to read data into a buffer. Also check that
the desired size was actually successfully read and combine the check
with the check for reading the extradata.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Allocating two arrays with the same number of elements together
simplifies freeing them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
A Smacker file can contain up to seven audio tracks. Up until now,
the pts for the i. audio packet contained in a Smacker frame was
simply the end timestamp of the last i. audio packet contained in
an earlier Smacker frame.
The problem with this is that a Smacker stream need not contain data in
every Smacker frame and so the current i. audio packet present may come
from a different underlying stream than the last i. audio packet
contained in an earlier frame.
The sample hypnotix.smk* exhibits this. It has three audio tracks and
the first of the three has a longer first packet, so that the audio for
the first track is contained in only 235 packets contained in the first
235 Smacker frames; the end timestamp of this track is 166696 (about 7.56s
at a timebase of 1/22050); the other two audio tracks both have 253 packets
contained in the first 253 Smacker frames. Up until now, the 236th
packet of the second track being the first audio packet in the 236th
Smacker frame would get the end timestamp of the last first audio packet
from the last Smacker frame containing a first audio packet and said
last audio packet is the first audio packet from the 235th Smacker frame
from the first audio track, so that the timestamp is 166696. In contrast,
the 236th packet from the third track (whose packets contain the same number
of samples as the packets from the second track) has a timestamp of
156116 (because its timestamp is derived from the end timestamp of the
235th packet of the second audio track). In the end, the second track
ended up being 177360/22050 s = 8.044s long; in contrast, the third
track was 166780/22050 s = 7.56s long which also coincided with the
video.
This commit fixes this by not using timestamps from other tracks for
a packet's pts.
*: https://samples.ffmpeg.org/game-formats/smacker/wetlands/hypnotix.smk
Reviewed-by: Timotej Lazar <timotej.lazar@araneo.si>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The layout of a Smacker frame is as follows: For some frames, the
beginning of the frame contained a palette for the video stream; then
there are potentially several audio frames, followed by the data for the
video stream.
The Smacker demuxer used to read the palette, then cache every audio frame
into a buffer (that gets reallocated to the desired size every time a
frame is read into this buffer), then read and return the video frame
(together with the palette). The cached audio frames are then returned
by copying the data into freshly allocated buffers; if there are none
left, the next frame is read.
This commit changes this: At the beginning of a frame, the palette is
read and cached as now. But audio frames are no longer cached at all;
they are returned immediately. This gets rid of copying and also allows
to remove the code for the buffer-to-AVStream correspondence.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The first four bytes of smacker audio are supposed to contain the number
of samples, so treat audio frames smaller than that as invalid.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
When reading a new frame, the Smacker demuxer seeks to the next frame
position where it excepts the next frame; then it (potentially) reads
the palette, the audio packets associated with the frame and finally the
actual video frame. It is only at the end that the frame counter as well
as the position where the next frame is expected get updated.
This has a downside: If e.g. invalid data is encountered when reading
the palette, the demuxer returns immediately (with an error) and if the
caller calls av_read_frame again, the demuxer seeks to the position where
it already was, reads the very same palette data again and therefore will
return an error again. If the caller calls av_read_frame repeatedly
(say, until a packet is received or until EOF), this meight become an
infinite loop.
This could also happen if e.g. the size of one of the audio frames was
invalid or if the frame size was gigantic.
This commit changes this by skipping a frame if it turns out to be
invalid or an error happens otherwise. This ensures that EOF will be
returned eventually in the above scenario.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The Smacker demuxer buffers audio packets before it outputs them, but it
increments the counter of buffered packets prematurely: If allocating
the audio buffer fails, an error (most likely AVERROR(ENOMEM)) is returned.
If the caller decides to call av_read_frame() again, the next call will
take the codepath for returning already buffered audio packets and it
will fail (because the buffer that ought to be allocated isn't) without
decrementing the number of supposedly buffered audio packets (it doesn't
matter whether there would be enough memory available in subsequent calls).
Depending on the caller's behaviour this is potentially an infinite loop.
This commit fixes this by only incrementing the number of buffered audio
packets after having successfully read them and unconditionally reducing
said number when outputting one of them. It also changes the semantics
of the curstream variable: It is now the number of currently buffered
audio packets whereas it used to be the index of the last audio stream
to be read. (Index refers to the index in the array of buffers, not to
the stream index.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This is mainly about improving legibility of the code and getting rid of
overlong lines by using variables for st->codecpar instead of accessing
the codecparameters via st->codecpar->.
Also, some code has been moved to better fitting places.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This commit removes data that is only used during smacker_read_header()
from the demuxer's context and replaces the data that is used by local
variables. The other data is completely dropped.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The Smacker demuxer currently parses several fields that indicate
how many audio streams a file contains. This data is parsed and stored
into arrays in the demuxer's context and although the data is used only
to initialize the audio streams, it is kept for the whole lifetime of
the demuxer.
This has been changed: The data is used directly to create
the audio streams and no longer kept at all.
This also simplifies error handling in case adding a new stream fails:
Several arrays which until now have been allocated between parsing the
data determining how many audio streams to create and actually creating
them would need to be freed in this case. Now the streams are created
first, so freeing is no longer an issue.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The Smacker demuxer reads four consecutive 32bit values from the file
header into its demux context (as four uint32_t), converting it to
native endianness in the process and then writing these four values
later (after extradata has been allocated) to extradata as four 32bit
values (converting to little endian in the process).
This commit changes this: The stream and the extradata are allocated
earlier, so that the data destined for extradata can be read directly
into extradata.
Furthermore, given that these values are not needed for demuxing itself
they are now no longer kept as part of the demuxing context.
Finally, a check regarding the number of frames has been moved up,
too, in order to exit early before unnecessarily allocating the
stream and the extradata.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
It is not uncommon to find code where the caller thinks to know better
what the return value should be than the callee. E.g. something like
"if (av_new_packet(pkt, size) < 0) return AVERROR(ENOMEM);". This commit
changes several instances of this to instead forward the actual error.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Currently, AVStream contains an embedded AVCodecContext instance, which
is used by demuxers to export stream parameters to the caller and by
muxers to receive stream parameters from the caller. It is also used
internally as the codec context that is passed to parsers.
In addition, it is also widely used by the callers as the decoding (when
demuxer) or encoding (when muxing) context, though this has been
officially discouraged since Libav 11.
There are multiple important problems with this approach:
- the fields in AVCodecContext are in general one of
* stream parameters
* codec options
* codec state
However, it's not clear which ones are which. It is consequently
unclear which fields are a demuxer allowed to set or a muxer allowed to
read. This leads to erratic behaviour depending on whether decoding or
encoding is being performed or not (and whether it uses the AVStream
embedded codec context).
- various synchronization issues arising from the fact that the same
context is used by several different APIs (muxers/demuxers,
parsers, bitstream filters and encoders/decoders) simultaneously, with
there being no clear rules for who can modify what and the different
processes being typically delayed with respect to each other.
- avformat_find_stream_info() making it necessary to support opening
and closing a single codec context multiple times, thus
complicating the semantics of freeing various allocated objects in the
codec context.
Those problems are resolved by replacing the AVStream embedded codec
context with a newly added AVCodecParameters instance, which stores only
the stream parameters exported by the demuxers or read by the muxers.
When av_reallocp fails, the associated variables that keep track of
the number of elements in the array (and in some cases, the
separate number of allocated elements) need to be reset.
Not all of these might technically be needed, but it's better to
reset them if in doubt, to make sure variables don't end up
conflicting.
Signed-off-by: Martin Storsjö <martin@martin.st>