Currently it is always 4, but this change will allow it to be adjusted when
bandwidth-related features are added such as channel coupling, enhanced
channel coupling, and spectral extension.
(cherry picked from commit 53e35fd340)
This moves setting the thread count to a minimum of 1 to
frame_thread_init(), allowing a value of zero to propagate
through to the codec if frame threading is not used. This
makes auto-threads work in libx264.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit ff1efc524c)
For intra codecs, ff_thread_finish_setup() is called before decoding starts
automatically. However, get_buffer can only be used before it's called, so
adding this requirement broke frame threading for them. Fixed by moving the
call until after get_buffer is finished.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
(cherry picked from commit ad9791e12b)
The assembler emits literal pools too far from the load instructions,
so we must do it explicitly at a suitable location.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 8b454c352f)
decode_init sets bands[0] == 2, so this loop always sets the band table
index (k) to zero.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
(cherry picked from commit a304def1dc)
This avoids the core substream extensions scan when the EXT_AUDIO_ID
field indicates no extensions or only unsupported extensions. The scan
is done only if the value of EXT_AUDIO_ID is unknown or indicates a
present XCh extension which we can decode.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 7e06e0ede3)
There is no need to expand to 16-bits. Just use memcpy() to copy the raw data.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
(cherry picked from commit 1108f8998c)
This also adds output buffer size checks for AUDIO and SILENCE block types.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
(cherry picked from commit 1574eff3d2)
The size should depend on the output sample size, not the internal bit depth.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
(cherry picked from commit a58bcb40b1)
This allows the values to be used without changing C code and is closer to how
the other DEBUG flags work.
If this causes a problem for any user of this flag, please tell me and
ill split the flag in 2.
The 4-tap filters should only access one row/column before the
reference block.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit e0e46cae37)
GCC 4.3 and later are more particular about signedness matching
in vector operations. The operations under if(rangered) were
missing assignments and thus had no effect.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 381efba0ec)
Merging these functions allows merging some loops, which makes the
results (particularly after SIMD optimizations) much faster.
(cherry picked from commit f8bed30d8b)
Advantage is that it allows us to combine several loops into a single
one, and these can eventually be merged into the IDCT itself. Also, it
allows us to remove vc1_put_block(), and makes CODEC_FLAG_GRAY faster.
(cherry picked from commit bbfd2e7ab4)
Advanced profile never uses "range reduction", so vc1_put_block() quite
literally just calls put_pixels_clamped() from vc1_decode_i_blocks_adv().
By inlining the function, we can prevent calling IDCT8x8 if
CODEC_FLAG_GRAY is set, and we don't have to scale the coeffs in the
[0,256] range, but can instead use put_signed_pixels_clamped().
(cherry picked from commit 70aa916e46)