This also increases the range of input values supported as well as
decreasing the operation dependencies in the main loop, improving
speed on modern CPUs.
Fixes part of: 2045/clusterfuzz-testcase-minimized-6751255865065472
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>