The current pack_output function pointer is a property of the decoder,
rather than a constant method provided by the DSP code. Indeed, except
for an unused initialisation, the field is never used in DSP code.
This is more spec-compliant because it does not rely
on dead-code elimination by the compiler. Especially
MSVC has problems with this, as can be seen in
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/296373.html
or
https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/297022.html
This commit does not eliminate every instance where we rely
on dead code elimination: It only tackles branching to
the initialization of arch-specific dsp code, not e.g. all
uses of CONFIG_ and HAVE_ checks. But maybe it is already
enough to compile FFmpeg with MSVC with whole-programm-optimizations
enabled (if one does not disable too many components).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: left shift of negative value -2
Fixes: 20305/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TRUEHD_fuzzer-5677196618498048
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Jai Luthra <me@jailuthra.in>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The outputted bits should match what is used in the lossless check
Fixes: runtime error: signed integer overflow: -538697856 * 256 cannot be represented in type 'int'
Fixes: 4326/clusterfuzz-testcase-minimized-5689449645080576
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: runtime error: left shift of negative value -7862264
Fixes: 4074/clusterfuzz-testcase-minimized-4516104123711488
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Profiling results for overall audio decode and the mlp_filter_channel(_arm)
function in particular are as follows:
Before After
Mean StdDev Mean StdDev Confidence Change
6:2 total 380.4 22.0 370.8 17.0 87.4% +2.6% (insignificant)
6:2 function 60.7 7.2 36.6 8.1 100.0% +65.8%
8:2 total 357.0 17.5 343.2 19.0 97.8% +4.0% (insignificant)
8:2 function 60.3 8.8 37.3 3.8 100.0% +61.8%
6:6 total 717.2 23.2 658.4 15.7 100.0% +8.9%
6:6 function 140.4 12.9 81.5 9.2 100.0% +72.4%
8:8 total 981.9 16.2 896.2 24.5 100.0% +9.6%
8:8 function 193.4 15.0 103.3 11.5 100.0% +87.2%
Experiments with adding preload instructions to this function yielded no
useful benefit, so these have not been included.
The assembly version has also been tested with a fuzz tester to ensure that
any combinations of inputs not exercised by my available test streams still
generate mathematically identical results to the C version.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Profiling results for overall audio decode and the mlp_filter_channel(_arm)
function in particular are as follows:
Before After
Mean StdDev Mean StdDev Confidence Change
6:2 total 380.4 22.0 370.8 17.0 87.4% +2.6% (insignificant)
6:2 function 60.7 7.2 36.6 8.1 100.0% +65.8%
8:2 total 357.0 17.5 343.2 19.0 97.8% +4.0% (insignificant)
8:2 function 60.3 8.8 37.3 3.8 100.0% +61.8%
6:6 total 717.2 23.2 658.4 15.7 100.0% +8.9%
6:6 function 140.4 12.9 81.5 9.2 100.0% +72.4%
8:8 total 981.9 16.2 896.2 24.5 100.0% +9.6%
8:8 function 193.4 15.0 103.3 11.5 100.0% +87.2%
Experiments with adding preload instructions to this function yielded no
useful benefit, so these have not been included.
The assembly version has also been tested with a fuzz tester to ensure that
any combinations of inputs not exercised by my available test streams still
generate mathematically identical results to the C version.
Signed-off-by: Martin Storsjö <martin@martin.st>
MLP DSP functions x86-optimized.
12.59% overall speedup in x86_32
9.98% overall speedup in x86_64
compared to gcc 4.3.3
It breaks on x86_64 with PIC because the MANGLE macro appends (%%rip), which
is not wanted for labels.
Originally committed as revision 18904 to svn://svn.ffmpeg.org/ffmpeg/trunk
12.59% overall speedup in x86_32
9.98% overall speedup in x86_64
compared to gcc 4.3.3
Originally committed as revision 18903 to svn://svn.ffmpeg.org/ffmpeg/trunk