Because in contrast to the decoder, the parser does not setup low_delay.
The code in parse_nal_units would always end up setting has_b_frames
to "1", except when stream is explicitly marked as low delay.
Since the parser itself would create 'extradata', simply reopening
the parser would cause this.
This happens for instance in estimate_timings_from_pts(), which causes the
parser to be reopened on the same stream.
This fixes Libav #22 and FFmpeg (trac) #360
CC: libav-stable@libav.org
Based on a patch by Reimar Döffinger <Reimar.Doeffinger@gmx.de>
(commit 31ac0ac29b)
Comments and description adapted by Reinhard Tartler.
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
It is not allowed to change mid-stream like it does currently. Instead we need
to buffer the first 8 frames before returning them as a single packet, then
only return single frame packets after that.
Prevents crash when trying to copy from a non-existing plane in e.g.
a RGB32 reference image to a YUV420P target image
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Instead of clipping extrasize based on EXTRABYTES, clip based on the
amount of buffer actually left. Without this fix, there are warbles
and other distortions in the test case below.
http://kevincennis.com/mix/assets/sounds/1901_voxfx.mp3
This prevents crashes when trying to read beyond the end of the buffer
while decoding frame data.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Unrolling the main loop to process, instead of 4 elements:
- 8: minor gain of 2 cycles (not worth the extra object size)
- 2: loss of 8 cycles.
Assigning STEP to a register is a loss. Output address (Y) is almost always
unaligned.
Timings:
- C (32/64 bits): 117/109 cycles
- SSE: 57 cycles
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The 32bits targets have been compiled with -mfpmath=sse for proper reference.
sbr_sum_square C /32bits: 82c (unrolled)/102c
C /64bits: 69c (unrolled)/82c
SSE/32bits: 42c
SSE/64bits: 31c
Use of SSE4.1 dpps to perform the final sum is slower.
Not unrolling to perform 8 operations in a loop yields 10 more cycles.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Since we are clipping before we shift the values to
16 or 32 bits, we should not shift the min/max clip
values to compensate.
Fixes 8 and 24 bit lossy decoding.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>