FFmpeg

Commit Graph

Author	SHA1	Message	Date
Michael Niedermayer	38f966b222	tests/checkasm/float_dsp: Increase allowed difference for float_dsp.vector_dmul Tested for 10000 iterations on x86-32 Fixes: Ticket6848 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	7 years ago
James Almer	bea8eeaa2c	checkasm/utvideodsp: zero initialize the entire buffer Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
James Almer	9a05c873cf	checkasm/utvideodsp: fix mixed declarations and code Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
Martin Vignali	4a6aa6d1b2	checkasm : add test for huffyuvdsp add_int16	7 years ago
Martin Vignali	6a7eb65e1b	checkasm : add utvideodsp test	7 years ago
James Almer	501435e5e6	checkasm/jpeg2000dsp: add test for ict_float Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
James Almer	20a93ea8d4	checkasm/jpeg2000dsp: refactor rct_int test Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
James Almer	4cfb46f94f	checkasm/llviddsp: fix warnings about mixed declaration and code Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
Martin Vignali	fbe9148779	checkasm/llviddsp : add test for other dsp func add_median_pred add_left_pred : add two func one with acc 0, and one with random acc add_left_pred16	7 years ago
Martin Vignali	cbbec68847	libavcodec/blockdsp : add AVX version Also modify the required alignment, to 32 instead of 16 for several codecs Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
Martin Vignali	ac5908b13f	libavcodec/exr : add x86 SIMD for predictor Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
Martin Storsjö	516c479172	checkasm: Test more h264 idct variants Signed-off-by: Martin Storsjö <martin@martin.st>	7 years ago
James Almer	7323c896b2	checkasm: add an exrdsp test Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
Clément Bœsch	e0d56f097f	checkasm: use perf API on Linux ARM* On ARM platforms, accessing the PMU registers requires special user access permissions. Since there is no other way to get accurate timers, the current implementation of timers in FFmpeg relies on these registers. Unfortunately, enabling user access to these registers on Linux is not trivial, and generally involves compiling a random and unreliable github kernel module, or somehow patching your kernel. Such a module is very unlikely to reach upstream anytime soon. Quoting Robin Murphin from ARM: > Say you do give userspace direct access to the PMU; now run two or more > programs at once that believe they can use the counters for their own > "minimal-overhead" profiling. Have fun interpreting those results... > > And that's not even getting into the implications of scheduling across > different CPUs, CPUidle, etc. where the PMU state is completely beyond > userspace's control. In general, the plan to provide userspace with > something which might happen to just about work in a few corner cases, > but is meaningless, misleading or downright broken in all others, is to > never do so. As a result, the alternative is to use the Performance Monitoring Linux API which makes use of these registers internally (assuming the PMU of your ARM board is supported in the kernel, which is definitely not a given...). While the Linux API is obviously cross platform, it does have a significant overhead which needs to be taken into account. As a result, that mode is only weakly enabled on ARM platforms exclusively. Note on the non flexibility of the implementation: the timers (native FFmpeg vs Linux API) are selected at compilation time to prevent the need of function calls, which would result in a negative impact on the cycle counters.	7 years ago
Martin Storsjö	e12f1cd616	Revert "checkasm: Test more h264 idct variants" This reverts commit `547db1eaec`. This commit wasn't supposed to be pushed (yet) since it hasn't been reviewed. Signed-off-by: Martin Storsjö <martin@martin.st>	7 years ago
Martin Storsjö	547db1eaec	checkasm: Test more h264 idct variants	7 years ago
James Almer	e51073fe00	checkasm/vf_blend: rename addition128 and difference128 to grainmerge and grainextract This was missing from `f8d0689d3f`. Fixes checkasm.	7 years ago
James Almer	6f205a42d7	checkasm: add hybrid_analysis_ileave and hybrid_synthesis_deint tests to aacpsdsp Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
James Almer	823cc7e25f	checkasm: add a g722dsp test Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
James Almer	3d3243577c	checkasm: use declare_func_float() in sbrdsp sum_square test The function returns a float. This fixes the test in x86_32 targets. Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
Matthieu Bouron	7864e07f4a	checkasm: add sbrdsp tests	7 years ago
James Almer	0eb783eb06	checkasm: randomize the full input buffer in test_hybrid_analysis Missed in the last commit.	7 years ago
James Almer	fb7b477a91	checkasm: fix size of input buffer in test_hybrid_analysis	7 years ago
Clément Bœsch	b12a36170b	lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis	7 years ago
Clément Bœsch	edd041e64c	checkasm: add AAC PS tests This includes various fixes and improvements from James Almer. Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
James Almer	fa50d9360b	x86/vf_blend: add sse and ssse3 extremity functions Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	7 years ago
James Almer	a579dbb4f7	checkasm: add missing checks to float_dsp's butterflies_float test	8 years ago
Matthieu Bouron	067e42b851	checkasm/aarch64: fix tests returning a float Avoids overriding the v0 register (which contains the result of the tested function) in checkasm_call_checked.	8 years ago
Diego Biurrun	fd502f4f5f	build: Generalize yasm/nasm-related variable names None of them are specific to the YASM assembler. (Cherry-picked from libav commit `39e208f4d4`) Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
James Almer	5b10f484e2	checkasm: add float_dsp tests Ported from libavutil/tests/float_dsp.c Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
James Almer	37388b119c	checkasm: add a checkasm_checked_call function that doesn't issue emms Meant for DSP functions returning a float or double, as they'd fail if emms is called after every run on x86_32. Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
James Almer	93dc1c1221	checkasm: add _fixed suffix to fixed_dsp tests Should prevent future conflicts with the similarly named floatdsp tests	8 years ago
Martin Storsjö	d05c9cde0e	checkasm: aarch64: Specify alignment for the register_init const array Loads from this don't strictly require alignment, but specify it just for consistency with the arm version. Signed-off-by: Martin Storsjö <martin@martin.st>	8 years ago
Martin Storsjö	e00db9f78b	checkasm: hevc: Add a hevc_ prefix to the add_residual functions This makes it easier to group them with the rest when running e.g. --bench=hevc. Signed-off-by: Martin Storsjö <martin@martin.st>	8 years ago
James Almer	7b3cb953f7	checkasm: add fixed_dsp tests Tested-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
Clément Bœsch	1c9f4b5078	lavc/vp9: split into vp9{block,data,mvs} This is following Libav layout to ease merges.	8 years ago
James Almer	09ce5519f3	fate/checkasm: fix use of uninitialized memory on hevc_add_res tests	8 years ago
James Almer	36eae45510	fate/checkasm: use LOCAL_ALINGED_32 on hevc_add_res tests	8 years ago
Diego Biurrun	dcc39ee10e	lavc: Remove deprecated XvMC support hacks Deprecated in 11/2013.	8 years ago
James Almer	30cadfe071	avcodec/lossless_videodsp: use ptrdiff_t for length parameters Signed-off-by: James Almer <jamrial@gmail.com>	8 years ago
Diego Biurrun	39e208f4d4	build: Generalize yasm/nasm-related variable names None of them are specific to the YASM assembler.	8 years ago
Diego Biurrun	7cb1d9e2db	build: Fine-grained link-time dependency settings Previously, all link-time dependencies were added for all libraries, resulting in bogus link-time dependencies since not all dependencies are shared across libraries. Also, in some cases like libavutil, not all dependencies were taken into account, resulting in some cases of underlinking. To address all this mess a machinery is added for tracking which dependency belongs to which library component and then leveraged to determine correct dependencies for all individual libraries.	8 years ago
Martin Storsjö	388f6e6715	arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32 This work is sponsored by, and copyright, Google. Previously all subpartitions except the eob=1 (DC) case ran with the same runtime: Cortex A7 A8 A9 A53 vp9_inv_dct_dct_16x16_sub16_add_neon: 3188.1 2435.4 2499.0 1969.0 vp9_inv_dct_dct_32x32_sub32_add_neon: 18531.7 16582.3 14207.6 12000.3 By skipping individual 4x16 or 4x32 pixel slices in the first pass, we reduce the runtime of these functions like this: vp9_inv_dct_dct_16x16_sub1_add_neon: 274.6 189.5 211.7 235.8 vp9_inv_dct_dct_16x16_sub2_add_neon: 2064.0 1534.8 1719.4 1248.7 vp9_inv_dct_dct_16x16_sub4_add_neon: 2135.0 1477.2 1736.3 1249.5 vp9_inv_dct_dct_16x16_sub8_add_neon: 2446.7 1828.7 1993.6 1494.7 vp9_inv_dct_dct_16x16_sub12_add_neon: 2832.4 2118.3 2266.5 1735.1 vp9_inv_dct_dct_16x16_sub16_add_neon: 3211.7 2475.3 2523.5 1983.1 vp9_inv_dct_dct_32x32_sub1_add_neon: 756.2 456.7 862.0 553.9 vp9_inv_dct_dct_32x32_sub2_add_neon: 10682.2 8190.4 8539.2 6762.5 vp9_inv_dct_dct_32x32_sub4_add_neon: 10813.5 8014.9 8518.3 6762.8 vp9_inv_dct_dct_32x32_sub8_add_neon: 11859.6 9313.0 9347.4 7514.5 vp9_inv_dct_dct_32x32_sub12_add_neon: 12946.6 10752.4 10192.2 8280.2 vp9_inv_dct_dct_32x32_sub16_add_neon: 14074.6 11946.5 11001.4 9008.6 vp9_inv_dct_dct_32x32_sub20_add_neon: 15269.9 13662.7 11816.1 9762.6 vp9_inv_dct_dct_32x32_sub24_add_neon: 16327.9 14940.1 12626.7 10516.0 vp9_inv_dct_dct_32x32_sub28_add_neon: 17462.7 15776.1 13446.2 11264.7 vp9_inv_dct_dct_32x32_sub32_add_neon: 18575.5 17157.0 14249.3 12015.1 I.e. in general a very minor overhead for the full subpartition case due to the additional loads and cmps, but a significant speedup for the cases when we only need to process a small part of the actual input data. In common VP9 content in a few inspected clips, 70-90% of the non-dc-only 16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left 8x8 or 16x16 subpartitions respectively. This is cherrypicked from libav commit `9c8bc74c2b`. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	8 years ago
Ronald S. Bultje	1c8fbd7b90	checkasm/vp9: benchmark all sub-IDCTs (but not WHT or ADST).	8 years ago
Diego Biurrun	3794062ab1	Remove Plan 9 support Supporting the system was a nice joke for the 9 release, but it has run its course. Nowadays Plan 9 receives no testing and has no practical usefulness.	8 years ago
Martin Storsjö	9c8bc74c2b	arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32 This work is sponsored by, and copyright, Google. Previously all subpartitions except the eob=1 (DC) case ran with the same runtime: Cortex A7 A8 A9 A53 vp9_inv_dct_dct_16x16_sub16_add_neon: 3188.1 2435.4 2499.0 1969.0 vp9_inv_dct_dct_32x32_sub32_add_neon: 18531.7 16582.3 14207.6 12000.3 By skipping individual 4x16 or 4x32 pixel slices in the first pass, we reduce the runtime of these functions like this: vp9_inv_dct_dct_16x16_sub1_add_neon: 274.6 189.5 211.7 235.8 vp9_inv_dct_dct_16x16_sub2_add_neon: 2064.0 1534.8 1719.4 1248.7 vp9_inv_dct_dct_16x16_sub4_add_neon: 2135.0 1477.2 1736.3 1249.5 vp9_inv_dct_dct_16x16_sub8_add_neon: 2446.7 1828.7 1993.6 1494.7 vp9_inv_dct_dct_16x16_sub12_add_neon: 2832.4 2118.3 2266.5 1735.1 vp9_inv_dct_dct_16x16_sub16_add_neon: 3211.7 2475.3 2523.5 1983.1 vp9_inv_dct_dct_32x32_sub1_add_neon: 756.2 456.7 862.0 553.9 vp9_inv_dct_dct_32x32_sub2_add_neon: 10682.2 8190.4 8539.2 6762.5 vp9_inv_dct_dct_32x32_sub4_add_neon: 10813.5 8014.9 8518.3 6762.8 vp9_inv_dct_dct_32x32_sub8_add_neon: 11859.6 9313.0 9347.4 7514.5 vp9_inv_dct_dct_32x32_sub12_add_neon: 12946.6 10752.4 10192.2 8280.2 vp9_inv_dct_dct_32x32_sub16_add_neon: 14074.6 11946.5 11001.4 9008.6 vp9_inv_dct_dct_32x32_sub20_add_neon: 15269.9 13662.7 11816.1 9762.6 vp9_inv_dct_dct_32x32_sub24_add_neon: 16327.9 14940.1 12626.7 10516.0 vp9_inv_dct_dct_32x32_sub28_add_neon: 17462.7 15776.1 13446.2 11264.7 vp9_inv_dct_dct_32x32_sub32_add_neon: 18575.5 17157.0 14249.3 12015.1 I.e. in general a very minor overhead for the full subpartition case due to the additional loads and cmps, but a significant speedup for the cases when we only need to process a small part of the actual input data. In common VP9 content in a few inspected clips, 70-90% of the non-dc-only 16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left 8x8 or 16x16 subpartitions respectively. Signed-off-by: Martin Storsjö <martin@martin.st>	8 years ago
Ronald S. Bultje	06fec74cac	checkasm: vp9dsp: benchmark all sub-IDCTs (but not WHT or ADST). Signed-off-by: Martin Storsjö <martin@martin.st>	8 years ago
Martin Storsjö	effc1430b2	Revert "checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately" This reverts commit `81d7f0bbca`. Instead of just benchmarking dc separately, test all relevant subparts (in the next commit). Signed-off-by: Martin Storsjö <martin@martin.st>	8 years ago
Martin Storsjö	81d7f0bbca	checkasm: vp9dsp: Benchmark the dc-only version of idct_idct separately The dc-only mode is already checked to work correctly above, but this allows benchmarking this mode for performance tuning, and allows making sure that it actually is correctly hooked up. Signed-off-by: Martin Storsjö <martin@martin.st>	8 years ago
Ronald S. Bultje	0b37cd09a6	checkasm: add vp9dsp.itxfm_add tests. This includes fixes by Henrik Gramner. The forward transforms are derived from the reference encoder. Signed-off-by: Martin Storsjö <martin@martin.st>	8 years ago

237 Commits (8d51d10eb895bda02ab0f8b3af082b5c9a781690)