These encoders have AVCodec.ch_layouts set, so ff_encode_preinit()
has already checked that the used channel layout is equivalent
to one of these native layouts. Therefore one can simply
compare the channel masks (with the added complication
that one has to use av_channel_layout_subset() to get it,
because the channel layout is not guaranteed to have
AV_CHANNEL_ORDER_NATIVE).
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The encoder actually creates files with side channels, not back
channels. See thd_layout in mlp_parse.h.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The encoders using this have AVCodec.ch_layouts set, so that
this is checked generically.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This encoder has AVCodec.ch_layouts set, so that this is checked
generically.
Reviewed-by: Tomas Härdin <tjoppen@acc.umu.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
None of the decoders here have the AV_CODEC_CAP_CHANNEL_CONF set,
so that it is already checked generically that the number of channels
is positive.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The pcm_bluray encoder has AVCodec.ch_layouts set, so that
ff_encode_preinit() checks that the channel layout in use
is equivalent to one of the layouts from AVCodec.ch_layouts.
Yet equivalent is not the same as identical; in particular,
custom layouts equivalent to native layouts are possible
(and necessary if one wants to use the name/opaque fields
with an ordinary channel layout), so one must not simply
use AVChannelLayout.u.mask. Use av_channel_layout_subset()
instead.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This decoder does not have the AV_CODEC_CAP_CHANNEL_CONF set,
so that number of channels has to be set by the user before
avcodec_open2().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Now that it is ensured that the old and new channel count/layout
values coincide if the old ones are set, the consistency of the
AVChannelLayout (which is checked before we reach this point)
implies the consistency of the old values, making these checks
here dead code. So remove them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This ensures that if AVCodecContext.channels or
AVCodecContext.channel_layout are set, AVCodecContext.ch_layout
has the equivalent values after this block.
(In case these values are set inconsistently, the consistency check
for ch_layout below will error out.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In particular, check the provided channel layout for encoders
without AVCodec.ch_layouts set. This fixes an infinite loop
in the WavPack encoder (and maybe other issues in other encoders
as well) in case the channel count is zero.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The wrapper for the legacy channel layout API already sets
AVCodecContext.channels based upon AVCodecContext.channel_layout
if the latter is set while the former is unset.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_get_ref_perms_string() has been removed in
7e350379f8.
Reviewed-by: Nicolas George <george@nsup.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Provide optimized implementation for pix_median_abs8 function.
Performance comparison tests are shown below.
- median_sad_1_c: 277.0
- median_sad_1_neon: 82.0
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation for vsad8_intra function.
Performance comparison tests are shown below.
- vsad_5_c: 94.7
- vsad_5_neon: 20.7
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Provide optimized implementation for pix_median_abs16 function.
Performance comparison tests are shown below.
- median_sad_0_c: 720.5
- median_sad_0_neon: 127.2
Benchmarks and tests run with checkasm tool on AWS Graviton 3.
Signed-off-by: Hubert Mazur <hum@semihalf.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
When determining whether a packet should be decrypted,
should use the stsd_id of the fragment where the current packet is located.
Reviewed-by: Zhao Zhili <zhilizhao@tencent.com>
Signed-off-by: Wang Yaqiang <wangyaqiang03@kuaishou.com>
Old one was written with the assumption only even inputs would be given.
This very messy replacement supports even and odd inputs, and supports
AVX2 for extra speed. The buffers given are usually quite big (4k samples),
so the speedup is worth it.
The new SSE version is still faster than the old inline asm version by 33%.
Also checkasm is provided to make sure this monstrosity works.
This fixes some FATE tests.
Clang's static analyzer complains that leaving the variable
uninitialized could lead to a code path where the uninitialized value is
written to at the end of this function.
This patch simply zero-initializes that variable to avoid that.
Signed-off-by: Will Cassella <cassew@google.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The aim of this test is to show the interleavement
of the file generated in the first pass; so make the
interleavement queue in the framecrc muxer in the second
pass as small as possible so that the framecrc muxer does not
fix wrong interleavement of the input file behind our backs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
enc_dec is designed for raw input and output and computes
the PSNR between these two. The input of the shortest-sub
test is the idx file of a vobsub sub+idx combination
and the output is the output of framecrc of said vobsub
subtitle muxed into Matroska together with a synthesized
video. Calculating the PSNR between these two files makes
no sense, therefore switch to a transcode test, where
the ref file file contains the output of framecrc directly,
making the interleavement better visible in the ref file
at the cost of a larger ref file (>400 lines).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The MXF demuxer does not currently set AVStream::avg_frame_rate and ::r_frame_rate
when J2K essence is wrapped according to SMPTE ST 422.
Reviewed-by: Tomas Härdin <tjoppen@acc.umu.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also covers writing mastering display metadata.
Reviewed-by: Tomas Härdin <tjoppen@acc.umu.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Basically reverts af15c17daa.
Flipping a picture by modifying the pointers is so common
that even users of direct rendering should take it into account.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, libswscale/output.c used a macro to write
an output pixel which involved a call to av_pix_fmt_desc_get()
to find out whether the input pixel format is BE or LE
despite this being known at compile-time (there are templates
per pixfmt). Even worse, these calls are made in a loop,
so that e.g. there are eight calls to av_pix_fmt_desc_get()
for every pixel processed in yuv2rgba64_X_c_template()
for 64bit RGB formats.
This commit modifies these macros to ensure that isBE()
is evaluated at compile-time. This saved 41184B of .text
for me (GCC 11.2, -O3). Of course, it also improved performance.
E.g. ffmpeg_g -f lavfi -i testsrc2,format=yuva420p -pix_fmt rgba64le \
-threads 1 -t 1:00 -f null - (which uses yuv2rgba64le_X_c,
which is an invocation of yuv2rgba64_X_c_template() mentioned above),
performance improved from 95589 to 41387 decicycles for one call
to yuv2packedX; for the be variant the numbers went down from
76087 to 43024 decicycles.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, libswscale/input.c used a macro to read
an input pixel which involved a call to av_pix_fmt_desc_get()
to find out whether the input pixel format is BE or LE
despite this being known at compile-time (there are templates
per pixfmt). Even worse, these calls are made in a loop,
so that e.g. there are six calls to av_pix_fmt_desc_get()
for every pair of UV pixel processed in
rgb64ToUV_half_c_template().
This commit modifies these macros to ensure that isBE()
is evaluated at compile-time. This saved 9743B of .text
for me (GCC 11.2, -O3). For a simple RGB64LE->YUV420P
transformation like
ffmpeg -f lavfi -i haldclutsrc,format=rgba64le -pix_fmt yuv420p \
-threads 1 -t 1:00 -f null -
the amount of decicycles spent in rgb64LEToUV_half_c
(which is created via the template mentioned above)
decreases from 19751 to 5341; for RGBA64BE the number
went down from 11945 to 5393. For shared builds (where
the call to av_pix_fmt_desc_get() is indirect) the old numbers
are 15230 for RGBA64BE and 27502 for RGBA64LE, whereas
the numbers with this patch are indistinguishable from
the numbers from a static build.
Also make the macros that are touched conform to the
usual convention of using uppercase names while just at it.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, using NULL as key in av_dict_get() on a non-empty
AVDictionary would crash; using NULL as key in av_dict_set()
would also crash for a non-empty AVDictionary unless AV_DICT_MULTIKEY
was set; in case the dictionary was initially empty or AV_DICT_MULTIKEY
was set, it was even possible for av_dict_set() to succeed when
adding a NULL key, namely when one uses a value != NULL and
the AV_DICT_DONT_STRDUP_VAL flag. Using av_dict_get() on such
an AVDictionary will usually lead to crashes, though.
Fix this by actually checking for key in both functions; error out
if they are NULL.
While just at it, also stop relying on av_strdup(NULL) to return NULL
in av_dict_set().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The compiler cannot infer that the two float vectors do not alias,
causing unnecessary extra loads and serialisation. This patch caches
the two input values in local variables so that compiler can optimise
individual loop iterations.
While this probably never overflows, we are better safe than sorry.
The callback prototype should probably also use ptrdiff_t or size_t,
but I diggress (this would affect the DSP callback prototype).