Mediates between old-style (de)muxers and new-style callers. Will be
removed once all the (de)muxers are converted to the new API.
Signed-off-by: James Almer <jamrial@gmail.com>
They are incompatible with the new channel layout scheme and no decoder
uses them.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The muxer seems to have had one seemingly accidental use of
LIBAVCODEC_IDENT, while LIBAVFORMAT_IDENT probably is the
relevant one (which is used multiple times in the same file).
Signed-off-by: Martin Storsjö <martin@martin.st>
This commit adds support for storing DFPWM audio in a WAV container.
It uses the WAVEFORMATEXTENSIBLE structure, following these conventions:
https://gist.github.com/MCJack123/90c24b64c8e626c7f130b57e9800962c
The implementation is very simple: it just adds the GUID to the list of
WAV GUIDs, and modifies the WAV muxer to always use WAVEFORMATEXTENSIBLE
format with that GUID.
This creates a standard container format for DFPWM besides raw data.
It will allow users to transfer DFPWM audio in a standard container
format, with the sample rate and channel count contained in the file
as opposed to being an external parameter as in the raw format.
This format is already supported in my AUKit library, which is the CC
analog to libav (albeit much smaller). Support in other applications is TBD.
Signed-off-by: Jack Bruienne <jackbruienne@gmail.com>
This patch builds on my previous DFPWM codec patch, adding a raw
audio format to be able to read/write the raw files that are most commonly
used (as no other container format supports it yet).
The muxers are mostly copied from the PCM demuxer and the raw muxers, as
DFPWM is typically stored as raw data.
Please see the previous patch for more information on DFPWM.
Signed-off-by: Jack Bruienne <jackbruienne@gmail.com>
Fixes: negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself
Fixes: Ticket8486
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
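The sanitizer message above points at a general C pitfall: negating INT_MIN overflows a signed int. As a minimal sketch (not the change made by this commit; abs_as_unsigned is a hypothetical helper), the magnitude can be computed in unsigned arithmetic, which wraps instead of overflowing:

```c
#include <limits.h>

/* Hypothetical helper: returns |v| as an unsigned value.
 * The negation is done on the unsigned conversion of v, so it is
 * well defined even for v == INT_MIN. */
static unsigned int abs_as_unsigned(int v)
{
    return v < 0 ? 0U - (unsigned int)v : (unsigned int)v;
}
```

The result only fits in an unsigned type; converting it back to int would reintroduce the problem for INT_MIN.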
This was tested with media recorded from an iPhone XR and an iPhone 13.
Here is what a typical stream looks like in coding order:
┌────────┬─────┬─────┬──────────┐
│ sample │ PTS │ DTS │ keyframe │
├────────┼─────┼─────┼──────────┤
┊ ┊ ┊ ┊ ┊
│ 53 │ 560 │ 510 │ No │
│ 54 │ 540 │ 520 │ No │
│ 55 │ 530 │ 530 │ No │
│ 56 │ 550 │ 540 │ No │
│ 57 │ 600 │ 550 │ Yes │
│ * 58 │ 580 │ 560 │ No │
│ * 59 │ 570 │ 570 │ No │
│ * 60 │ 590 │ 580 │ No │
│ 61 │ 640 │ 590 │ No │
│ 62 │ 620 │ 600 │ No │
┊ ┊ ┊ ┊ ┊
In composition/display order:
┌────────┬─────┬─────┬──────────┐
│ sample │ PTS │ DTS │ keyframe │
├────────┼─────┼─────┼──────────┤
┊ ┊ ┊ ┊ ┊
│ 55 │ 530 │ 530 │ No │
│ 54 │ 540 │ 520 │ No │
│ 56 │ 550 │ 540 │ No │
│ 53 │ 560 │ 510 │ No │
│ * 59 │ 570 │ 570 │ No │
│ * 58 │ 580 │ 560 │ No │
│ * 60 │ 590 │ 580 │ No │
│ 57 │ 600 │ 550 │ Yes │
│ 63 │ 610 │ 610 │ No │
│ 62 │ 620 │ 600 │ No │
┊ ┊ ┊ ┊ ┊
Samples/frames 58, 59 and 60 are B-frames which actually depend on the
key frame (57). Here the key frame is not an IDR but a "CRA" (Clean
Random Access).
Initially, I thought I could rely on the sdtp box (independent and
disposable samples), but unfortunately:
sdtp[54] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
sdtp[55] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
sdtp[56] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
sdtp[57] is_leading:0 sample_depends_on:2 sample_is_depended_on:0 sample_has_redundancy:0
sdtp[58] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
sdtp[59] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
sdtp[60] is_leading:0 sample_depends_on:1 sample_is_depended_on:2 sample_has_redundancy:0
sdtp[61] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
sdtp[62] is_leading:0 sample_depends_on:1 sample_is_depended_on:0 sample_has_redundancy:0
The information that might have been useful here would have been
is_leading, but all the samples are set to 0 so this was unusable.
Instead, we need to rely on sgpd/sbgp tables. In my case the video track
contained 3 sgpd tables with the following grouping types: tscl, sync
and tsas. In the sync table we have the following 2 entries (only):
sgpd.sync[1]: sync nal_unit_type:0x14
sgpd.sync[2]: sync nal_unit_type:0x15
(The count starts at 1 because 0 carries the undefined semantic; we'll
see that later in the reference table.)
The NAL unit types presented here correspond to:
libavcodec/hevc.h: HEVC_NAL_IDR_N_LP = 20,
libavcodec/hevc.h: HEVC_NAL_CRA_NUT = 21,
In parallel, the sbgp sync table contains the following:
┌────┬───────┬─────┐
│ id │ count │ gdi │
├────┼───────┼─────┤
│ 0 │ 1 │ 1 │
│ 1 │ 56 │ 0 │
│ 2 │ 1 │ 2 │
│ 3 │ 59 │ 0 │
│ 4 │ 1 │ 2 │
│ 5 │ 59 │ 0 │
│ 6 │ 1 │ 2 │
│ 7 │ 59 │ 0 │
│ 8 │ 1 │ 2 │
│ 9 │ 59 │ 0 │
│ 10 │ 1 │ 2 │
│ 11 │ 11 │ 0 │
└────┴───────┴─────┘
The gdi column (group description index) directly refers to the index in
the sgpd.sync table. This means the first frame is an IDR, then we have
batches of undefined frames interleaved with CRA frames. No IDR ever
appears again (tried on a 30+ second sample).
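The run-length reading of the sbgp table can be sketched as follows. This is only an illustration, not the code from the diff: struct sbgp_run, collect_cra_samples() and the 0-based sample numbering are assumptions; the table contents are the ones listed above.

```c
#include <stddef.h>

#define HEVC_NAL_CRA_NUT 21 /* 0x15, as in libavcodec/hevc.h */

/* One sbgp run: `count` consecutive samples share the same gdi. */
struct sbgp_run {
    unsigned count;
    unsigned gdi; /* group description index, 0 = undefined */
};

/* The sbgp sync table shown above. */
static const struct sbgp_run sync_runs[] = {
    { 1, 1 }, { 56, 0 }, { 1, 2 }, { 59, 0 }, { 1, 2 }, { 59, 0 },
    { 1, 2 }, { 59, 0 }, { 1, 2 }, { 59, 0 }, { 1, 2 }, { 11, 0 },
};

/* sgpd.sync entries are 1-based; slot 0 stands for "undefined". */
static const int sgpd_sync_nal_type[] = { 0, 0x14, 0x15 };
#define NB_SGPD (sizeof(sgpd_sync_nal_type) / sizeof(sgpd_sync_nal_type[0]))

/* Hypothetical helper: stores the 0-based index of every CRA sample in
 * `out` (at most `max` entries) and returns the total number of CRA
 * samples described by the runs. */
static size_t collect_cra_samples(const struct sbgp_run *runs, size_t nb_runs,
                                  unsigned *out, size_t max)
{
    size_t found = 0;
    unsigned sample = 0;

    for (size_t i = 0; i < nb_runs; i++) {
        unsigned gdi = runs[i].gdi;

        if (gdi < NB_SGPD && sgpd_sync_nal_type[gdi] == HEVC_NAL_CRA_NUT) {
            for (unsigned j = 0; j < runs[i].count; j++) {
                if (found < max)
                    out[found] = sample + j;
                found++;
            }
        }
        sample += runs[i].count;
    }
    return found;
}
```

Under the 0-based numbering assumed here, the first CRA lands on sample 57, which is consistent with the key frame in the tables above.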
With that information, we can build a heuristic using the presentation
order.
A few things needed to be introduced in this commit:
1. min_sample_duration is extracted from the stts: we need the minimal
step between samples in order to PTS-step backward to a valid point
2. In order to avoid a loop over the ctts table systematically during a
seek, we build an expanded list of sample offsets which will be used
to translate from DTS to PTS
3. An open_key_samples index to keep track of all the non-IDR key
frames; for now it only supports HEVC CRA frames. We should probably
add BLA frames as well, but I don't have any sample so I preferred to
leave that for later
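Item 2 can be illustrated with a minimal sketch. It is not the code from the diff: struct ctts_run, expand_ctts() and sample_pts() are hypothetical names, and the only assumption about ctts is that each entry gives a sample count and a PTS - DTS offset.

```c
#include <stdint.h>
#include <stdlib.h>

/* One ctts run: `count` consecutive samples share the same offset. */
struct ctts_run {
    unsigned count;
    int      offset; /* PTS - DTS */
};

/* Hypothetical helper: expands the runs into one offset per sample.
 * Returns a malloc()ed array of *nb_samples entries, or NULL on
 * allocation failure. The caller frees it. */
static int *expand_ctts(const struct ctts_run *runs, size_t nb_runs,
                        size_t *nb_samples)
{
    size_t total = 0, k = 0;
    int *offsets;

    for (size_t i = 0; i < nb_runs; i++)
        total += runs[i].count;
    offsets = malloc(total ? total * sizeof(*offsets) : 1);
    if (!offsets)
        return NULL;
    for (size_t i = 0; i < nb_runs; i++)
        for (unsigned j = 0; j < runs[i].count; j++)
            offsets[k++] = runs[i].offset;
    *nb_samples = total;
    return offsets;
}

/* With the expanded list, DTS to PTS is a single array lookup. */
static int64_t sample_pts(int64_t dts, const int *offsets, size_t sample)
{
    return dts + offsets[sample];
}
```

The expansion trades memory (one int per sample) for a constant-time lookup during seeks.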
It is entirely possible I missed something obvious in my approach, but I
couldn't come up with a better solution. Also, as mentioned in the diff,
we could optimize is_open_key_sample(), but the linear scaling overhead
should be fine for now since it only happens in seek events.
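For illustration only, a linear lookup of the kind described could look like the first function below. The second one shows one possible optimization, a binary search, which assumes the index is sorted in ascending order. Both function names and signatures are hypothetical and do not mirror the diff.

```c
#include <stddef.h>

/* Hypothetical linear lookup: returns 1 if `sample` is listed in the
 * open key sample index, 0 otherwise. Cost grows with `nb`. */
static int is_open_key_sample_linear(const int *open_key_samples,
                                     size_t nb, int sample)
{
    for (size_t i = 0; i < nb; i++)
        if (open_key_samples[i] == sample)
            return 1;
    return 0;
}

/* Possible optimization, assuming the index is sorted ascending:
 * binary search for the first entry that is >= sample. */
static int is_open_key_sample_sorted(const int *open_key_samples,
                                     size_t nb, int sample)
{
    size_t lo = 0, hi = nb;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (open_key_samples[mid] < sample)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < nb && open_key_samples[lo] == sample;
}
```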
Fixing this issue prevents sending broken packets to the decoder. With
the FFmpeg hevc decoder, the frames are skipped; with VideoToolbox, the
frames glitch.
sgpd means Sample Group Description Box.
For now, only the sync grouping type is parsed, but the function can
easily be adjusted to support other flavours.
The sbgp (Sample to Group Box) sync_group table built in the previous
commit contains references to this table through the
group_description_index field.
If _FieldBased, _Matrix, _ColorRange, or _ChromaLocation haven't
been set, that absence would be interpreted as 0, leading to those
being set to case 0 instead of default. There is no case 0 for
_Primaries and _Transfer, so those were correctly falling back
to the default case.
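The pitfall can be shown with a small sketch that keeps "absent" and "explicitly 0" apart. Everything here is a hypothetical stand-in rather than the AviSynth API or the demuxer code: struct prop, prop_get(), enum range, and the assumed mapping of _ColorRange 0 to full range and 1 to limited range.

```c
#include <string.h>

/* Hypothetical stand-in for a frame property map. */
struct prop {
    const char *key;
    int         value;
};

/* Returns 1 and stores the value if `key` is set, 0 if it is absent. */
static int prop_get(const struct prop *props, int nb, const char *key,
                    int *value)
{
    for (int i = 0; i < nb; i++) {
        if (!strcmp(props[i].key, key)) {
            *value = props[i].value;
            return 1;
        }
    }
    return 0;
}

enum range { RANGE_UNSPECIFIED, RANGE_FULL, RANGE_LIMITED };

/* An absent property is handled before the switch, so it can never be
 * mistaken for an explicit 0. */
static enum range color_range(const struct prop *props, int nb)
{
    int v;

    if (!prop_get(props, nb, "_ColorRange", &v))
        return RANGE_UNSPECIFIED; /* absent: same result as default */
    switch (v) {
    case 0:  return RANGE_FULL;    /* assumed mapping */
    case 1:  return RANGE_LIMITED; /* assumed mapping */
    default: return RANGE_UNSPECIFIED;
    }
}
```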
Signed-off-by: Stephen Hutchinson <qyot27@gmail.com>
It appears this is not allowed: "Each Segment Index box documents how a (sub)segment is divided into one or more subsegments
(which may themselves be further subdivided using Segment Index boxes)."
Fixes: Null pointer dereference
Fixes: Ticket9517
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
It's needed for avformat_get_mov_video_tags() and avformat_get_mov_audio_tags(),
both public symbols defined in avformat.h
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes: -nan is outside the range of representable values of type 'long'
Fixes: 44614/clusterfuzz-testcase-minimized-ffmpeg_dem_WEBM_DASH_MANIFEST_fuzzer-6216204841254912
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
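The sanitizer report above concerns converting a NaN double to an integer type, which is undefined in C. As a hedged sketch of a general guard (not the fix applied by this commit; double_to_int64 is a hypothetical helper), the value can be range-checked before the cast:

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: converts `d` to int64_t only when the result is
 * representable. Returns 0 on success, -1 for NaN, infinities and
 * out-of-range values; *out is left untouched on failure. */
static int double_to_int64(double d, int64_t *out)
{
    /* 2^63 is exactly representable as a double while INT64_MAX is
     * not, so the upper bound is checked with >= against 2^63. */
    if (isnan(d) || d < -0x1p63 || d >= 0x1p63)
        return -1;
    *out = (int64_t)d; /* truncates toward zero */
    return 0;
}
```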