The HEIF specification permits specifying the looping behavior of
animated sequences by using the EditList (elst) box. The track
duration will be set to the total duration of all the loops (or
infinite) and the duration of a single loop will be set in the edit
list box.
The default behavior is to loop infinitely.
Compliance verification:
* This was added in libavif recently [1] and the files produced by
ffmpeg after this change have EditList boxes similar to the ones
produced by libavif (and avifdec is able to parse the loop count as
intended).
* ComplianceWarden is ok with the produced files.
* Chrome is able to play back the produced files.
[1] 4d2776a3af
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The packets given to muxers need not be writable,
so it is best to access them via const uint8_t*.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
AVIF specification allows for alpha channel as an auxiliary item (in
case of still images) or as an auxiliary track (in case of animated
images). Add support for both of these. The AVIF muxer will take
exactly two streams (when alpha is present) as input (first one being
the YUV planes and the second one being the alpha plane).
The input has to come from two different images (one of it color and
the other one being alpha), or it can come from a single file
source with the alpha channel extracted using the "alphaextract"
filter.
Example using alphaextract:
ffmpeg -i rgba.png -filter_complex "[0:v]alphaextract[a]" -map 0 -map "[a]" -still-picture 1 avif_with_alpha.avif
Example using two sources (first source can be in any pixel format and
the second source has to be in monochrome grey pixel format):
ffmpeg -i color.avif -i grey.avif -map 0 -map 1 -c copy avif_with_alpha.avif
The generated files pass the compliance checks in Compliance Warden:
https://github.com/gpac/ComplianceWarden
libavif (the reference avif library) is able to decode the files
generated using this patch.
They also play back properly (with transparent background) in:
1) Chrome
2) Firefox (only still AVIF, no animation support)
Reviewed-by: James Zern <jzern@google.com>
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Add an AVIF muxer by re-using the existing the mov/mp4 muxer.
AVIF Specification: https://aomediacodec.github.io/av1-avif
Sample usage for still image:
ffmpeg -i image.png -c:v libaom-av1 -still-picture 1 image.avif
Sample usage for animated AVIF image:
ffmpeg -i video.mp4 animated.avif
We can re-use any of the AV1 encoding options that will make
sense for image encoding (like bitrate, tiles, encoding speed,
etc).
The files generated by this muxer has been verified to be valid
AVIF files by the following:
1) Displays on Chrome (both still and animated images).
2) Displays on Firefox (only still images, firefox does not support
animated AVIF yet).
3) Verified to be valid by Compliance Warden:
https://github.com/gpac/ComplianceWarden
Fixes the encoder/muxer part of Trac Ticket #7621
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
supports forcing or disabling the writing of the btrt atom.
the default behavior is to write the atom only for mp4 mode.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
On output streams where a multichannel stream needs to be stored as one track
per channel, each track will have a channel layout describing the position of
the channel they contain. For the track with front center, the mov muxer was
using the mov layout "mono" instead of the label for the front center position.
Since our channel layout API considers front center == mono, we need to do some
heuristics. To achieve this, we make sure all audio tracks contain streams with
a single channel, and only one of them is front center. In that case, we write
the front center label instead of signaling mono layout.
Fixes the last part of ticket #2865
Signed-off-by: James Almer <jamrial@gmail.com>
Up until now, we had a PacketList structure which is actually
a PacketListEntry; a proper PacketList did not exist
and all the related functions just passed pointers to pointers
to the head and tail elements around. All these pointers were
actually consecutive elements of their containing structs,
i.e. the users already treated them as if they were a struct.
So add a proper PacketList struct and rename the current PacketList
to PacketListEntry; also make the functions use this structure
instead of the pair of pointers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Includes basic support for both the ISMV ('dfxp') and MP4 ('stpp')
methods. This initial version also foregoes fragmentation support
in case the built-in sample squashing is to be utilized, as this
eases the initial review.
Additionally, add basic tests for both muxing modes in MP4.
Signed-off-by: Jan Ekström <jan.ekstrom@24i.com>
"frag_start" is redundant, and every occurance can be replaced with cluster[0].dts - start_dts
The proof of no behaviour changes: (All line number below is based on commit bff7d662d7)
"frag_start" is read at 4 place (with all possible call stacks):
mov_write_packet
...
mov_flush_fragment
mov_write_moof_tag
mov_write_moof_tag_internal
mov_write_traf_tag
mov_write_tfxd_tag (#1)
mov_write_tfdt_tag (#2)
mov_add_tfra_entries (#3)
mov_write_sidx_tags
mov_write_sidx_tag (#4)
mov_write_trailer
mov_auto_flush_fragment
mov_flush_fragment
... (#1#2#3#4)
mov_write_sidx_tags
mov_write_sidx_tag (#4)
shift_data
compute_sidx_size
get_sidx_size
mov_write_sidx_tags
mov_write_sidx_tag (#4)
All read happens in "mov_write_trailer" and "mov_write_moof_tag". So we need to prove no behaviour change in these two
functions.
Condition 1: for every track that have "trk->entry == 0", trk->frag_start == trk->track_duration.
Condition 2: for every track that have "trk->entry > 0", trk->frag_start == trk->cluster[0].dts - trk->start_dts.
Definition 1: "Before flush" means just before the invocation of "mov_flush_fragment", except for the auto-flush case in
"mov_write_single_packet", which means before L5934.
Lemma 1: If Condition 1 & 2 is true before flush, Condition 1 & 2 is still true after "mov_flush_fragment" returns.
Proof:
No update to the tracks that have "trk->entry == 0" before flushing, so we only consider tracks that have "trk->entry > 0":
Case 1: !moov_written and moov will be written in this iteration
trk->entry = 0 L5366
trk->frag_start == trk->cluster[0].dts - trk->start_dts Lemma condition
trk->frag_start += trk->start_dts + trk->track_duration - trk->cluster[0].dts; L5363
So trk->entry == 0 && trk->frag_start == trk->track_duration
Case 2: !moov_written and moov will NOT be written in this iteration
nothing changed
Case 3: moov_written
trk->entry = 0 L5445
trk->frag_start == trk->cluster[0].dts - trk->start_dts Lemma condition
trk->frag_start += trk->start_dts + trk->track_duration - trk->cluster[0].dts; L5444
So trk->entry == 0 && trk->frag_start == trk->track_duration
Note that trk->track_duration may be updated for the tracks that have "trk->entry > 0" (mov_write_moov_tag will
update track_duration of "tmcd" track, but it must have 1 entry). But in all case, trk->frag_start is also updated
to consider the new value.
Lemma 2: If Condition 1 & 2 is true before "ff_mov_write_packet" invocation, Condition 1 & 2 is still true after it returns.
Proof:
Only the track corresponding to the pkt is updated, and no update to relevant variables if trk->entry > 0 before invocation.
So we only need to prove "trk->frag_start == trk->cluster[0].dts - trk->start_dts" after trk->entry increase from 0 to 1.
Case 1: trk->start_dts == AV_NOPTS_VALUE
Case 1.1: trk->frag_discont && use_editlist
trk->cluster[0].dts = pkt->dts at L5741
trk->frag_start = pkt->pts at L5785
trk->start_dts = pkt->dts - pkt->pts at L5786
So trk->frag_start == trk->cluster[0].dts - trk->start_dts
Case 1.2: trk->frag_discont && !use_editlist
trk->cluster[0].dts = pkt->dts at L5741
trk->frag_start = pkt->dts at L5790
trk->start_dts = 0 at L5791
So trk->frag_start == trk->cluster[0].dts - trk->start_dts
Case 1.3: !trk->frag_discont
trk->cluster[0].dts = pkt->dts at L5741
trk->frag_start = 0 init
trk->start_dts = pkt->dts at L5779
So trk->frag_start == trk->cluster[0].dts - trk->start_dts
Case 2: trk->start_dts != AV_NOPTS_VALUE
Case 2.1: trk->frag_discont
trk->cluster[0].dts = pkt->dts at L5741
trk->frag_start = pkt->dts - trk->start_dts at L5763
So trk->frag_start == trk->cluster[0].dts - trk->start_dts
Case 2.2: !trk->frag_discont
trk->cluster[0].dts = trk->start_dts + trk->track_duration at L5749
trk->track_duration == trk->frag_start Lemma condition
So trk->frag_start == trk->cluster[0].dts - trk->start_dts
Lemma 3: Condition 1 & 2 is true in all case before and after "ff_mov_write_packet" invocation, before flush and after
"mov_flush_fragment" returns.
Proof: All updates to relevant variable happen either in "ff_mov_write_packet", or during flush. And Condition 1 & 2
is true initially. So with lemma 1 & 2, we can prove this use induction.
Noticed that all read of "frag_start" only happen in "trk->entry > 0" branch. Now we need to prove Condition 2 is true
before each read.
Because no update to variables relevant to Condition 2 between "before flush" and "mov_write_moof_tag" invocation, we
can conclude Condition 2 is true before every invocation of "mov_write_moof_tag". No behaviour change in
"mov_write_moof_tag" is proved.
In "mov_write_trailer", No update to relevant variables after the last flush and before the invocation of
"mov_write_sidx_tag". So no behaviour change to "mov_write_trailer" is proved.
Q.E.D.
Signed-off-by: Hu Weiwen <sehuww@mail.scut.edu.cn>
Signed-off-by: Martin Storsjö <martin@martin.st>
There are cases where using 1000 as the MP4 timescale is not
accurate enough, for example when one needs sample-accurate audio
handling.
This adds a new AVOption to the MOV/MP4 muxer to override the
movie timescale, but it still defaults to 1000 to maintain current
default behavior.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
The clli atom isn't in ISO/IEC 14496-12:2015 so the flag is marked as
experimental and the clli atom is not written by default.
The clli atom is already parsed by FFmpeg in mov.c.
Signed-off-by: Michael Bradshaw <mjbshaw@google.com>
If 'write_colr' movflag is set, then movflag 'prefer_icc' can
be used to first look for an AV_PKT_DATA_ICC_PROFILE entry to
encode.
If ICC profile doesn't exist, default behaviour enabled by
'write_colr' occurs.
Signed-off-by: vectronic <hello.vectronic@gmail.com>
If not available, set flags to 24 (bits 4 and 5), to signal the wallclock value
is read at the time of writing the atom.
Signed-off-by: James Almer <jamrial@gmail.com>
The producer reference time box supplies relative wall-clock times
at which movie fragments, or files containing movie fragments
(such as segments) were produced.
The box is mainly useful in live streaming use cases. A media player
can parse the box and utilize the time fields to measure and improve
the latency during real time playout.
Fixes https://trac.ffmpeg.org/ticket/2798
This makes movenc handle AV_DISPOSITION_ATTACHED_PIC and write
the associated pictures in iTunes cover atom. This corresponds
to how 'mov' demuxer parses and exposes the cover images when
reading.
Most of the existing track handling loops properly ignore
these 'virtual streams' as MOVTrack->entry is never incremented
for them. However, additional tests are added as needed to ignore
them.
Tested to produce valid output with:
ffmpeg -i movie.mp4 -i thumb.jpg -disposition✌️1 attached_pic \
-map 0 -map 1 -c copy movie-with-cover.mp4
The cover image is also copied correctly with:
ffmpeg -i movie-with-cover.mp4 -map 0 -c copy out.mp4
AtomicParseley says that the attached_pic stream is properly
not visible in the main tracks of the file.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
This reduces the need for an edit list; streams that start with
e.g. dts=-1, pts=0 can be encoded as dts=0, pts=0 (which is valid
in mov/mp4) by shifting the dts values of all packets forward.
This avoids the need for edit lists for such streams (while they
still are needed for audio streams with encoder delay).
This eases conformance with the DASH-IF interoperability guidelines.
Signed-off-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This reduces the need for an edit list; streams that start with
e.g. dts=-1, pts=0 can be encoded as dts=0, pts=0 (which is valid
in mov/mp4) by shifting the dts values of all packets forward.
This avoids the need for edit lists for such streams (while they
still are needed for audio streams with encoder delay).
This eases conformance with the DASH-IF interoperability guidelines.
Signed-off-by: Martin Storsjö <martin@martin.st>
Sometimes it's useful to be able to define the exact track numbers in
the generated track, instead of always beginning at track id 1. Using
the option use_stream_ids_as_track_ids now copies the use stream ids
to track ids. Dynamically generated tracks (ie. tmcd) have their track
numbers defined as continuing from the highest numbered stream id.
Signed-off-by: Erkki Seppälä <erkki.seppala.ext@nokia.com>
Signed-off-by: OZOPlayer <OZOPL@nokia.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
When writing a fragmented file, we by default write an index pointing
to all the fragments at the end of the file. This causes constantly
increasing memory usage during the muxing. For live streams, the
index might not be useful at all.
A similar fragment index is written (but at the start of the file) if
the global_sidx flag is set. If ism_lookahead is set, we need to keep
data about the last ism_lookahead+1 fragments.
If no fragment index is to be written, we don't need to store information
about all fragments, avoiding increasing the memory consumption
linearly with the muxing runtime.
This fixes out of memory situations with long live mp4 streams.
Signed-off-by: Martin Storsjö <martin@martin.st>
Add -movflags use_metadata_tags to the mov muxer. This will cause
the muxer to write all metadata to the file in the keys and mtda
atoms.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
As long as caller only writes packets using av_interleaved_write_frame
with no manual flushing, this should allow us to always have accurate
durations at the end of fragments, since there should be at least
one queued packet in each stream (except for the stream where the
current packet is being written, but if the muxer itself does the
cutting of fragments, it also has info about the next packet for that
stream).
Signed-off-by: Martin Storsjö <martin@martin.st>
Currently, AVStream contains an embedded AVCodecContext instance, which
is used by demuxers to export stream parameters to the caller and by
muxers to receive stream parameters from the caller. It is also used
internally as the codec context that is passed to parsers.
In addition, it is also widely used by the callers as the decoding (when
demuxer) or encoding (when muxing) context, though this has been
officially discouraged since Libav 11.
There are multiple important problems with this approach:
- the fields in AVCodecContext are in general one of
* stream parameters
* codec options
* codec state
However, it's not clear which ones are which. It is consequently
unclear which fields are a demuxer allowed to set or a muxer allowed to
read. This leads to erratic behaviour depending on whether decoding or
encoding is being performed or not (and whether it uses the AVStream
embedded codec context).
- various synchronization issues arising from the fact that the same
context is used by several different APIs (muxers/demuxers,
parsers, bitstream filters and encoders/decoders) simultaneously, with
there being no clear rules for who can modify what and the different
processes being typically delayed with respect to each other.
- avformat_find_stream_info() making it necessary to support opening
and closing a single codec context multiple times, thus
complicating the semantics of freeing various allocated objects in the
codec context.
Those problems are resolved by replacing the AVStream embedded codec
context with a newly added AVCodecParameters instance, which stores only
the stream parameters exported by the demuxers or read by the muxers.
support writing encrypted mp4 using aes-ctr, conforming to ISO/IEC
23001-7.
3 new parameters were added:
- encryption_scheme - allowed values are none (default) and cenc-aes-ctr
- encryption_key - 128 bit encryption key (hex)
- encryption_kid - 128 bit encryption key identifier (hex)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The double meaning of the faststart flag (moving a moov atom
to the start of files, making them streamable, for non-fragmented
files, vs inserting a global sidx index at the start of files
for fragmented files) is confusing - see 40ed1cbf1 for
explanation of its origins.
Since the second meaning of the flag hasn't been part of any
libav release yet, just rename it to get rid of the confusion
without any extra deprecation (which wouldn't get rid of the
potential confusion, of users adding -movflags faststart
even for fragmented files, where it isn't needed for making
them "streamable").
This gets back the old behaviour, where -movflags faststart
doesn't have any effect for fragmented files.
Signed-off-by: Martin Storsjö <martin@martin.st>
The same field is also used for writing the sidx index header,
for fragmented files, when the faststart flag is used.
Signed-off-by: Martin Storsjö <martin@martin.st>
For strict CFR, they should be pretty much equal, but if the stream
is VFR, there can be a sometimes significant difference.
Calculate the pts duration separately, used in sidx atoms and for
tfrf/tfxd boxes in smooth streaming ismv files.
Also make sure to reduce the duration of sidx entries according to
edit lists.
Signed-off-by: Martin Storsjö <martin@martin.st>
Even if this is a guess, it is way better than writing a zero duration
of the last sample in a fragment (because if the duration is zero,
the first sample of the next fragment will have the same timestamp
as the last sample in the previous one).
Since we normally don't require libavformat muxer users to set
the duration field in AVPacket, we probably can't strictly require
it here either, so don't log this as a strict warning, only as info.
Signed-off-by: Martin Storsjö <martin@martin.st>
This is incompatible with the omit_tfhd_offset flag (writing
position independent fragments with interleaving requires the
default_base_moof flag).
This makes the moof atoms slightly bigger, but can be better for
playback (improving locality of sample data in the mdat).
Signed-off-by: Martin Storsjö <martin@martin.st>
This way, the caller doesn't need to coordinate setting the option
after the moov atom has been written. The downside is that it is
no longer possible to use the option for checking whether the moov
atom already has been written, but a caller is able to keep track
of that by other means anyway.
Signed-off-by: Martin Storsjö <martin@martin.st>
The previous use of the mov->fragments field, for determining whether
written packets were part of the first fragment or not, didn't
work as intended when using the empty_moov flag.
Signed-off-by: Martin Storsjö <martin@martin.st>
As this is depricated it should not be on by default, it is only
supported for MOV containers, depends on avpriv_get_gamma_from_trc()
Enable by:
-movflags +write_gama
This will use the color_trc to supply a gamma value, if desired an
explicit value may be supplied using the -mov_gamma option supplying
a suitable floating point value, values <=1e-6 will not be written.
Signed-off-by: Kevin Wheatley <kevin.j.wheatley@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Use the more generic approach with the delay_moov flag, instead of
having a update mechanism specific to this one single atom.
Signed-off-by: Martin Storsjö <martin@martin.st>
This delays writing the moov until the first fragment is written,
or can be flushed by the caller explicitly when wanted. If the first
sample in all streams is available at this point, we can write
a proper editlist at this point, allowing streams to start at
something else than dts=0. For AC3 and DNXHD, a packet is
needed in order to write the moov header properly.
This isn't added to the normal behaviour for empty_moov, since
the behaviour that ftyp+moov is written during avformat_write_header
would be changed. Callers that split the output stream into header+segments
(either by flushing manually, with the custom_frag flag set, or by
just differentiating between data written during avformat_write_header
and the rest) will need to be adjusted to take this option into use.
For handling streams that start at something else than dts=0, an
alternative would be to use different kinds of heuristics for
guessing the start dts (using AVCodecContext delay or has_b_frames
together with the frame rate), but this is not reliable and doesn't
necessarily work well with stream copy, and wouldn't work for getting
the right initialization data for AC3 or DNXHD either.
Signed-off-by: Martin Storsjö <martin@martin.st>
The pts and the corresponding duration is written in sidx
atoms, thus make sure these match up correctly.
Signed-off-by: Martin Storsjö <martin@martin.st>
This allows creating a later mp4 fragment without sequentially
writing the earlier ones before (when called from a segmenter).
Normally when writing a fragmented mp4 file sequentially, the
first timestamps of a fragment are adjusted to match the
end of the previous fragment, to make sure the timestamp is the
same, even if it is calculated as the sum of previous fragment
durations. (And for the first packet in a file, the offset of
the first packet is written using an edit list.)
When writing an individual mp4 fragment discontinuously like this
(with potentially writing the earlier fragments separately later),
there's a risk of getting a gap in the timeline if the duration
field of the last packet in the previous fragment doesn't match up
with the start time of the next fragment.
Using this requires setting -avoid_negative_ts make_non_negative
(or -avoid_negative_ts 0).
Signed-off-by: Martin Storsjö <martin@martin.st>
This is mapped to the faststart flag (which in this case
perhaps should be called "shift and write index at the
start of the file"), which for fragmented files will
write a sidx index at the start.
When segmenting DASH into files, there's usually one sidx
at the start of each segment (although it's not clear to me
whether that actually is necessary). When storing all of it
in one file, the MPD doesn't necessarily need to describe
the individual segments, but the offsets of the fragments can be
fetched from one large sidx atom at the start of the file. This
allows creating files for the DASH ISO BMFF on-demand profile.
Signed-off-by: Martin Storsjö <martin@martin.st>