ff_square_tab is always used with an offset; if this table
is marked as hidden, the compiler can infer that it and
therefore also ff_square_tab + 256 have a fixed offset
from the code. This allows to avoid performing "+ 256"
at runtime by baking it into the offset from the code to
the table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
- ff_pix_abs16_neon
- ff_pix_abs16_xy2_neon
In direct micro benchmarks of these ff functions verses their C implementations,
these functions performed as follows on AWS Graviton 3.
ff_pix_abs16_neon:
pix_abs_0_0_c: 141.1
pix_abs_0_0_neon: 19.6
ff_pix_abs16_xy2_neon:
pix_abs_0_3_c: 269.1
pix_abs_0_3_neon: 39.3
Tested with:
./tests/checkasm/checkasm --test=motion --bench --disable-linux-perf
Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
The usage of a static variable presents a potential for data races and
means that this function can't be used in init functions of codecs with
FF_CODEC_CAP_INIT_THREADSAFE (unless of course one presumes that
everything is alright in which case the error is not triggered; but then
the whole function is pointless...). This makes the Snow decoder
init-threadsafe as it already claims.
Notice that this function has been removed in 2014 by Libav in commit
9103185bd1, because only some codepaths
are checked this way and because it only affects legacy compilers. The
latter is of course even more true today.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This patch adds MSA (MIPS-SIMD-Arch) optimizations for me_cmp functions in new file me_cmp_msa.c
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Blackfin is a painful platform to work with, no test machines are available
and the range of multimedia applications is dubious. Thus it only represents
a maintenance burden.
The function is supposed to confirm that the compiler provided enough
alignment, but in practice it is only run in certain code paths and
insufficient alignment problems are restricted to legacy compilers.