Swinney, Jonathan
c471cc7474
lavc/aarch64: motion estimation functions in neon
...
- ff_pix_abs16_neon
- ff_pix_abs16_xy2_neon
In direct micro benchmarks of these ff functions versus their C implementations,
these functions performed as follows on AWS Graviton 3.
ff_pix_abs16_neon:
pix_abs_0_0_c: 141.1
pix_abs_0_0_neon: 19.6
ff_pix_abs16_xy2_neon:
pix_abs_0_3_c: 269.1
pix_abs_0_3_neon: 39.3
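As a rough reading of the figures above, the speedup of each NEON function over its C counterpart can be worked out as below. This is only a sketch: it assumes the checkasm numbers are timings where lower is better, and the `results` layout and `speedup` helper are illustrative names, not part of FFmpeg or checkasm.

```python
# Figures copied from the benchmark output above (assumed: lower is better).
results = {
    "pix_abs_0_0": {"c": 141.1, "neon": 19.6},
    "pix_abs_0_3": {"c": 269.1, "neon": 39.3},
}


def speedup(c_time, simd_time):
    """Return how many times faster the SIMD version is than the C version."""
    return c_time / simd_time


for name, timings in results.items():
    ratio = speedup(timings["c"], timings["neon"])
    print(f"{name}: {ratio:.2f}x faster than C")
```

Both functions come out at roughly seven times faster than the C reference on the reported hardware.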
Tested with:
./tests/checkasm/checkasm --test=motion --bench --disable-linux-perf
Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
3 years ago
Ben Avison
bd3615a81a
checkasm: Add idctdsp add/put-pixels-clamped tests
...
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
3 years ago
Ben Avison
20cb43ea8b
checkasm: Add vc1dsp in-loop deblocking filter tests
...
Note that the benchmarking results for these functions are highly dependent
upon the input data. Therefore, each function is benchmarked twice,
corresponding to the best and worst case complexity of the reference C
implementation. The performance of a real stream decode will fall somewhere
between these two extremes.
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
3 years ago
Mark Reid
9e445a5be2
swscale/x86/output.asm: add x86-optimized planar gbr yuv2anyX functions
...
changes since v2:
* fixed label
changes since v1:
* remove vex instruction on sse4 path
* some load/pack macros use fewer instructions
* fixed some typos
yuv2gbrp_full_X_4_512_c: 12757.6
yuv2gbrp_full_X_4_512_sse2: 8946.6
yuv2gbrp_full_X_4_512_sse4: 5138.6
yuv2gbrp_full_X_4_512_avx2: 3889.6
yuv2gbrap_full_X_4_512_c: 15368.6
yuv2gbrap_full_X_4_512_sse2: 11916.1
yuv2gbrap_full_X_4_512_sse4: 6294.6
yuv2gbrap_full_X_4_512_avx2: 3477.1
yuv2gbrp9be_full_X_4_512_c: 14381.6
yuv2gbrp9be_full_X_4_512_sse2: 9139.1
yuv2gbrp9be_full_X_4_512_sse4: 5150.1
yuv2gbrp9be_full_X_4_512_avx2: 2834.6
yuv2gbrp9le_full_X_4_512_c: 12990.1
yuv2gbrp9le_full_X_4_512_sse2: 9118.1
yuv2gbrp9le_full_X_4_512_sse4: 5132.1
yuv2gbrp9le_full_X_4_512_avx2: 2833.1
yuv2gbrp10be_full_X_4_512_c: 14401.6
yuv2gbrp10be_full_X_4_512_sse2: 9133.1
yuv2gbrp10be_full_X_4_512_sse4: 5126.1
yuv2gbrp10be_full_X_4_512_avx2: 2837.6
yuv2gbrp10le_full_X_4_512_c: 12718.1
yuv2gbrp10le_full_X_4_512_sse2: 9106.1
yuv2gbrp10le_full_X_4_512_sse4: 5120.1
yuv2gbrp10le_full_X_4_512_avx2: 2826.1
yuv2gbrap10be_full_X_4_512_c: 18535.6
yuv2gbrap10be_full_X_4_512_sse2: 33617.6
yuv2gbrap10be_full_X_4_512_sse4: 6264.1
yuv2gbrap10be_full_X_4_512_avx2: 3422.1
yuv2gbrap10le_full_X_4_512_c: 16724.1
yuv2gbrap10le_full_X_4_512_sse2: 11787.1
yuv2gbrap10le_full_X_4_512_sse4: 6282.1
yuv2gbrap10le_full_X_4_512_avx2: 3441.6
yuv2gbrp12be_full_X_4_512_c: 13723.6
yuv2gbrp12be_full_X_4_512_sse2: 9128.1
yuv2gbrp12be_full_X_4_512_sse4: 7997.6
yuv2gbrp12be_full_X_4_512_avx2: 2844.1
yuv2gbrp12le_full_X_4_512_c: 12257.1
yuv2gbrp12le_full_X_4_512_sse2: 9107.6
yuv2gbrp12le_full_X_4_512_sse4: 5142.6
yuv2gbrp12le_full_X_4_512_avx2: 2837.6
yuv2gbrap12be_full_X_4_512_c: 18511.1
yuv2gbrap12be_full_X_4_512_sse2: 12156.6
yuv2gbrap12be_full_X_4_512_sse4: 6251.1
yuv2gbrap12be_full_X_4_512_avx2: 3444.6
yuv2gbrap12le_full_X_4_512_c: 16687.1
yuv2gbrap12le_full_X_4_512_sse2: 11785.1
yuv2gbrap12le_full_X_4_512_sse4: 6243.6
yuv2gbrap12le_full_X_4_512_avx2: 3446.1
yuv2gbrp14be_full_X_4_512_c: 13690.6
yuv2gbrp14be_full_X_4_512_sse2: 9120.6
yuv2gbrp14be_full_X_4_512_sse4: 5138.1
yuv2gbrp14be_full_X_4_512_avx2: 2843.1
yuv2gbrp14le_full_X_4_512_c: 14995.6
yuv2gbrp14le_full_X_4_512_sse2: 9119.1
yuv2gbrp14le_full_X_4_512_sse4: 5126.1
yuv2gbrp14le_full_X_4_512_avx2: 2843.1
yuv2gbrp16be_full_X_4_512_c: 12367.1
yuv2gbrp16be_full_X_4_512_sse2: 8233.6
yuv2gbrp16be_full_X_4_512_sse4: 4820.1
yuv2gbrp16be_full_X_4_512_avx2: 2666.6
yuv2gbrp16le_full_X_4_512_c: 10904.1
yuv2gbrp16le_full_X_4_512_sse2: 8214.1
yuv2gbrp16le_full_X_4_512_sse4: 4824.1
yuv2gbrp16le_full_X_4_512_avx2: 2629.1
yuv2gbrap16be_full_X_4_512_c: 26569.6
yuv2gbrap16be_full_X_4_512_sse2: 10884.1
yuv2gbrap16be_full_X_4_512_sse4: 5488.1
yuv2gbrap16be_full_X_4_512_avx2: 3272.1
yuv2gbrap16le_full_X_4_512_c: 14010.1
yuv2gbrap16le_full_X_4_512_sse2: 10562.1
yuv2gbrap16le_full_X_4_512_sse4: 5463.6
yuv2gbrap16le_full_X_4_512_avx2: 3255.1
yuv2gbrpf32be_full_X_4_512_c: 14524.1
yuv2gbrpf32be_full_X_4_512_sse2: 8552.6
yuv2gbrpf32be_full_X_4_512_sse4: 4636.1
yuv2gbrpf32be_full_X_4_512_avx2: 2474.6
yuv2gbrpf32le_full_X_4_512_c: 13060.6
yuv2gbrpf32le_full_X_4_512_sse2: 9682.6
yuv2gbrpf32le_full_X_4_512_sse4: 4298.1
yuv2gbrpf32le_full_X_4_512_avx2: 2453.1
yuv2gbrapf32be_full_X_4_512_c: 18629.6
yuv2gbrapf32be_full_X_4_512_sse2: 11363.1
yuv2gbrapf32be_full_X_4_512_sse4: 15201.6
yuv2gbrapf32be_full_X_4_512_avx2: 3727.1
yuv2gbrapf32le_full_X_4_512_c: 16677.6
yuv2gbrapf32le_full_X_4_512_sse2: 10221.6
yuv2gbrapf32le_full_X_4_512_sse4: 5693.6
yuv2gbrapf32le_full_X_4_512_avx2: 3656.6
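With this many rows, a few look out of line with their neighbours. One way to scan such output is to group lines by function and flag any SIMD variant that is slower than the C reference. The sketch below assumes the `name_variant: value` line shape shown above and that lower values are better; `parse_bench` and `slower_than_c` are hypothetical helper names, not checkasm features, and the sample is four lines copied from the table.

```python
import re

# Matches lines such as "yuv2gbrp_full_X_4_512_avx2: 3889.6".
LINE = re.compile(
    r"^(?P<func>.+)_(?P<variant>c|sse2|sse4|avx2):\s*(?P<time>[\d.]+)$"
)


def parse_bench(text):
    """Group benchmark lines into {function: {variant: time}}."""
    table = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            table.setdefault(m["func"], {})[m["variant"]] = float(m["time"])
    return table


def slower_than_c(table):
    """List (function, variant) pairs whose time exceeds the C reference."""
    flagged = []
    for func, timings in table.items():
        c_time = timings.get("c")
        if c_time is None:
            continue  # no C baseline to compare against
        for variant, t in timings.items():
            if variant != "c" and t > c_time:
                flagged.append((func, variant))
    return flagged


sample = """\
yuv2gbrap10be_full_X_4_512_c: 18535.6
yuv2gbrap10be_full_X_4_512_sse2: 33617.6
yuv2gbrap10be_full_X_4_512_sse4: 6264.1
yuv2gbrap10be_full_X_4_512_avx2: 3422.1
"""

print(slower_than_c(parse_bench(sample)))
```

A flagged row does not by itself mean the code is slow; a single noisy measurement would produce the same result, so rerunning the benchmark is the natural next step.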
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
3 years ago
James Almer
272c293c02
fate/checkasm: add missing tests to FATE
...
Signed-off-by: James Almer <jamrial@gmail.com>
3 years ago
J. Dekker
b492cacffd
checkasm: collapse hevc pel tests
...
Also add to `make fate-checkasm' target.
Signed-off-by: J. Dekker <jdek@itanimul.li>
3 years ago
Lynne
1978b143eb
checkasm: add av_tx FFT SIMD testing code
...
This sadly required making changes to the code itself,
due to the same context needing to be reused for both versions.
The lookup table had to be duplicated for both versions.
4 years ago
Josh de Kock
5913cd4e6c
checkasm: add hscale test
...
This tests the hscale 8bpp to 14/18bpp functions with different filter
sizes.
Signed-off-by: Josh de Kock <josh@itanimul.li>
5 years ago
Ting Fu
9691e2a426
checkasm/vf_eq: add test for vf_eq
...
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
5 years ago
Lynne
4ce1e13b54
checkasm: add opusdsp tests
5 years ago
Ruiling Song
8f4963ad25
checkasm/vf_gblur: add test for horiz_slice simd
...
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
6 years ago
James Almer
f60ddb7310
fate/checkasm: add missing v210dec test
...
Signed-off-by: James Almer <jamrial@gmail.com>
6 years ago
Carl Eugen Hoyos
96fc0cbfde
tests: Add EXESUF to program calls.
...
Fixes fate in Windows Subsystem for Linux.
6 years ago
James Almer
ba89dc27b5
checkasm: add an af_afir test
...
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
6 years ago
Martin Vignali
a9a7ed4f27
checkasm/swscale : add test for rgb shuffle_bytes func
7 years ago
Yingming Fan
80798e3857
checkasm/hevc_sao : add hevc_sao for checkasm
...
Signed-off-by: James Almer <jamrial@gmail.com>
7 years ago
Martin Vignali
78b982d3b9
checkasm : add test for losslessvideoencdsp for diff bytes and sub_left_pred
7 years ago
James Almer
da03242778
Revert "checkasm/vf_interlace : add test for lowpass_line 8 and 16"
...
This reverts commit adff97be5e.
It currently fails on Windows targets.
Signed-off-by: James Almer <jamrial@gmail.com>
7 years ago
Martin Vignali
adff97be5e
checkasm/vf_interlace : add test for lowpass_line 8 and 16
7 years ago
Martin Vignali
cefb7e0060
checkasm/vf_hflip : add test for vf_hflip byte and short simd
7 years ago
James Almer
bfd7f07b65
fate/checkasm: add missing target for vf_threshold test
...
Signed-off-by: James Almer <jamrial@gmail.com>
7 years ago
James Almer
7323c896b2
checkasm: add an exrdsp test
...
Signed-off-by: James Almer <jamrial@gmail.com>
7 years ago
Martin Storsjö
39e16ee228
Revert "fate: Skip the checkasm test if CONFIG_STATIC is disabled"
...
When we use dllexport properly for shared libraries on windows,
there's no longer any issue with linking the object files for
e.g. libavcodec statically into checkasm. (It's still not possible
to link the built object files for e.g. libavformat statically to
libavcodec though, since libavformat expects to load av_export_*
symbols from a DLL.)
This reverts commit 4e62b57ee0.
Signed-off-by: Martin Storsjö <martin@martin.st>
7 years ago
James Almer
823cc7e25f
checkasm: add a g722dsp test
...
Signed-off-by: James Almer <jamrial@gmail.com>
8 years ago
James Almer
9878935927
fate: add fate-checkasm-sbrdsp target
...
Signed-off-by: James Almer <jamrial@gmail.com>
8 years ago
Clément Bœsch
edd041e64c
checkasm: add AAC PS tests
...
This includes various fixes and improvements from James Almer.
Signed-off-by: James Almer <jamrial@gmail.com>
8 years ago
James Almer
5b10f484e2
checkasm: add float_dsp tests
...
Ported from libavutil/tests/float_dsp.c
Signed-off-by: James Almer <jamrial@gmail.com>
8 years ago
James Almer
7b3cb953f7
checkasm: add fixed_dsp tests
...
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
8 years ago
Martin Storsjö
4e62b57ee0
fate: Skip the checkasm test if CONFIG_STATIC is disabled
...
When building DLLs with MSVC, CONFIG_STATIC is disabled (see d66c52c2b3
for a more verbose explanation) since the built object files can't be
linked statically (which checkasm does).
This worked up until recently, only by luck.
Signed-off-by: Martin Storsjö <martin@martin.st>
8 years ago
Diego Biurrun
4537647c04
fate: checkasm: Split monolithic test into individual components
8 years ago
Janne Grunau
c9f8cfb6d9
fate: add checkasm target
10 years ago