This is essential for fast SIMD accesses.
The same should be done with the predict output.
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
19% faster
smaller files
this may also fix possible integer overflows due to previous 32bit useage
Tested with libutvideo and our utvideo decoder, this patch does not change
decoder output in the test
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
In order to send or receive a stream FCPublish, FCSubscribe and _checkbw
are completely optional and often not implemented. releaseStream over a
non-existen stream might report an error instead of being silent.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
GCC 4.3 and later do the right thing with the plain C code. Earlier
versions in 32-bit mode generate one extra instruction, needlessly
zeroing what would be the high half of the shifted value. At least
two gcc configurations miscompile the inline asm in some situations.
In 64-bit mode, all gcc versions generate imul r64, r64 followed by
shr. On Intel i7 and later, this imul is faster 32-bit mul. On
older Intel and all AMD, it is slightly slower. On Atom it is much
slower.
Considering where the FASTDIV macro is used, any overall negative
performance impact of this change should be negligible. If anyone
cares, they should file a bug against gcc and get the instruction
selection fixed.
Signed-off-by: Mans Rullgard <mans@mansr.com>
* qatar/master:
build: x86: Only compile mpegvideo optimizations when necessary
configure: Drop fastdiv option
build: Make the E-AC-3 encoder select the AC-3 encoder
fate: flac: Only run tests requiring samples when samples are available
Conflicts:
configure
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fix warnings:
muxing.c: In function ‘write_video_frame’:
muxing.c:326:23: warning: passing argument 2 of ‘sws_scale’ from incompatible pointer type [enabled by default]
There is no point in having the user disable any fastdiv macros.
Besides the condition implementation was broken and only disabled
the C implementation, but no platform specific assembly versions.
Pass pointer to sample buffer instead of channel number to various
functions called from decode_subframe(). Also simplify a few
expressions within this function.
The failures on various architectures and compilers on the RGB(A)
tests seem to have been because of one-off YCbCr->RGB conversion
results. This should make the conversion results match on most if
not all code paths.
Signed-off-by: Anton Khirnov <anton@khirnov.net>