This patch replaces the transform used in AAC with lavu/tx and removes
the limitation on only being able to decode 960-sample files
with the float decoder.
This commit also removes a whole bunch of unnecessary and slow
lifting steps the decoder did to compensate for the poor accuracy
of the old integer transformation code.
Overall float decoder speedup on Zen 3 for 64kbps: 32%
dts != pts is actually a spec violation for AV1, given it has no
reordering in the classical sense.
We don't really need the whole timestamp queue in this case and can just
pass through the timestamp as is for both dts and pts.
The encoder seems to be trading blows with hevc_nvenc.
In terms of quality at low bitrate cbr settings, it seems to
outperform it even. It produces fewer artifacts and the ones it
does produce are less jarring to my perception.
At higher bitrates I had a hard time finding differences between
the two encoders in terms of subjective visual quality.
Using the 'slow' preset, av1_nvenc outperformed hevc_nvenc in terms
of encoding speed by 75% to 100% while performing above tests.
Needless to say, it always massively outperformed h264_nvenc in terms
of quality for a given bitrate, while also being slightly faster.
Negligible speed difference for avx2 on Zen 2 (Ryzen 5700X) and
Broadwell (Xeon E5-2620 v4):
1690±4.3 decicycles vs. 1693±78.4
1439±31.1 decicycles vs 1429±16.7
Moderate speedup with avx512 on Skylake-X (Xeon D-2123IT):
1.22x faster (793±0.8 vs. 649±5.5 decicycles) compared with avx2
Better speedup with avx512icl on Ice Lake (Xeon Silver 4316):
1.77x faster (784±1.8 vs. 442±11.6 decicycles) compared with avx2
Co-authors:
Henrik Gramner <henrik@gramner.com>
Kieran Kunhya <kierank@obe.tv>
In av1_spec.pdf page 38/669, there is a sentence below:
if ( frame_type == KEY_FRAME && show_frame ) {
for ( i = 0; i < NUM_REF_FRAMES; i++) {
RefValid[ i ] = 0
......
}
......
}
This shows that the condition of invalidating current
DPB frames should be the coming frame_type is KEY_FRAME plus
show_frame is equal to 1. Otherwise, some of the frames
in sequence after KEY_FRAME still refer to the reference frames
before KEY_FRAME, and if these before KEY_FRAME reference
frames were invalidated, these frames could not find their
reference frames, and it could cause image corruption.
Mesa fix is in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19386
Reviewed-by: Fei Wang <fei.w.wang@intel.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
The order in which the channels are coded in the bitstream do not always follow
the native, bitmask-based order of channels both signaled by the WAV container
and forced by this same decoder. This is the case with layouts containing an
LFE channel, as it's always coded last.
Fixes ticket #9964.
Signed-off-by: James Almer <jamrial@gmail.com>
Generalize the checks for channels in all positions, and properly support
the three height groups (normal, top, bottom) instead of manually setting
the relevant channels for the latter two after the normal height tags were
parsed.
Signed-off-by: James Almer <jamrial@gmail.com>
If PCE defines channels not covered by those in the standard configurations
then don't try to come up with some made up layout and just return them in the
coded order.
Fixes al08_44.mp4 from the conformance suite, now reporting and decoding all 48
channels instead of 10.
Signed-off-by: James Almer <jamrial@gmail.com>
Set the correct amount of tags in tags_per_config[].
Also, there are no channels that correspond to a side element in this
configuration, so reflect this in the list of known/supported channel layouts.
Signed-off-by: James Almer <jamrial@gmail.com>
The FFmpeg encoder does not increment the temporal reference field, so use
that knowledge plus the extradata field to detect buggy versions of the encoder.
Fixes ticket #128.
The SVQ1 interframe mean VLC symbols -128 and 128 are incorrectly swapped
in our SVQ1 implementation, resulting in visible artifacts for some videos.
This patch unswaps the order of these two symbols.
The most noticable example of the artiacts caused by this error can be observed in
https://trac.ffmpeg.org/attachment/ticket/128/svq1_set.7z '352_288_k_50.mov'.
The artifacts are not observed when using the reference decoder
(QuickTime 7.7.9 x86 binary).
As a result of this patch, the reference data for the fate-svq1 test
($SAMPLES/svq1/marymary-shackles.mov) must be modified. For this file, our
decoder output is now bitwise identical to the reference decoder. I have
tested patch with various other samples and they are all now bitwise identical.
ff_mpeg1_dc_scale_table is the default value for
[yc]_dc_scale_table (as set by ff_mpv_common_defaults()).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This e.g. allows compilers to bake the offset implied
by using ff_mpeg12_dc_scale_table[3] (as the SpeedHQ encoder
does) into the general offset; for certain arches this is
also necessary in order to avoid building suboptimal code.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These tables are only accessed in ff_set_qscale()
which only accesses values 1..31 as well as in
encode_picture() in mpegvideo_enc.c, accessing
the value with index 8. So make these tables smaller.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For encoders, mpeg_quant is an option of the MPEG-4 encoder
and therefore constant. This implies that one can set
the dct_unquantize_(intra|inter) function pointers during init.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This basically reverts cc0380222a.
At the time of said commit, cleanup on init failure was very
buggy. For codecs with the INIT_CLEANUP cap, the close function
could be called on error even before the private data has been
allocated; and when using frame threading the same could also
happen even without said flag. Some mpegvideo decoders were
affected by the latter.
Yet both of these issues have been fixed long ago, the latter
in commit e9b6617579. Therefore
the workaround in ff_mpv_common_end() can be removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It avoids checks and allows to make ff_wmv2_decode_mb() static;
furthermore, it allows to avoid a config_components.h inclusion.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Only encoders need two sets of int16_t [12][64]
(one to save the current best state and one for the current
working state); decoders need only one. This saves 1.5KiB
per slice context for a decoder.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is only used by metasound.c, so this allows to make
the data static.
To do this, remove metasound_data.h, rename metasound_data.c
into metasound_data.h (and add inclusion guards, remove ff_
prefixes etc.).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Namely into a header metasound_twinvq_data.h included
in twinvq.c (the common file of MetaSound and TwinVQ).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_rl_init() initializes RLTable.(max_level|max_run|index_run);
max_run is unused by the SpeedHQ encoder (as well as MPEG-1/2).
Furthermore, it initializes these things twice (for two passes),
but the second half of this is never used.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_rl_init() initializes RLTable.(max_level|max_run|index_run);
max_run is unused by the MPEG-1/2 encoders (as well as SpeedHQ).
Furthermore, it initializes these things twice (for two passes),
but the second half of this is never used.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>