About 30% faster on a 32-bit Atom, 120% faster on a 64-bit Phenom2.
This is interesting because supporting P16 is easier in e.g.
OpenGL (any 2-component 8-bit format can be misused for it),
whereas supporting p9/p10 without conversion needs a texture
format with at least 14 bits of actual precision.
The shiftonly == 0 case is not optimized since the code is more
complex and the speed gain is less obvious.
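For illustration only, a minimal sketch of the two widening variants,
assuming that "shiftonly" selects a plain left shift and that the other
case replicates the top bits into the low bits. The function names are
hypothetical and this is not the optimized swscale code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen 9..15-bit samples (stored in 16-bit words)
 * to 16 bits by shifting only. The low bits of the result stay zero. */
void widen_shift_only(const uint16_t *src, uint16_t *dst,
                      size_t n, int src_depth)
{
    assert(src_depth >= 9 && src_depth <= 15);
    const int shift = 16 - src_depth;
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)(src[i] << shift);
}

/* Assumed shape of the non-shift-only case: also copy the top bits of
 * the sample into the freed low bits, so the maximum input value maps
 * to 0xFFFF instead of e.g. 0xFFC0. */
void widen_replicate(const uint16_t *src, uint16_t *dst,
                     size_t n, int src_depth)
{
    assert(src_depth >= 9 && src_depth <= 15);
    const int up   = 16 - src_depth;      /* left shift to the top    */
    const int down = 2 * src_depth - 16;  /* right shift for low bits */
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)((src[i] << up) | (src[i] >> down));
}
```

The second variant needs two shifts and an OR per sample, which is one
plausible reason why it is the more complex case to optimize.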
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Fixes problems where rgbToRgbWrapper() is called even though it doesn't
support this particular conversion (e.g. converting from RGB444 to
anything). Thirdly, fixes issues where rgbToRgbWrapper() is called for
non-native endianness conversions (e.g. RGB555BE on a LE system).
Fourthly, fixes crashes when converting from e.g. monowhite to
monowhite, which calls planarCopyWrapper() and reads/writes out of
bounds because n_bytes != n_pixels.
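
To illustrate the n_bytes != n_pixels point: for a 1-bit-per-pixel
format such as monowhite, a line of N pixels occupies only
ceil(N / 8) bytes, so treating the pixel count as a byte count
overshoots the buffer. A minimal sketch with a hypothetical helper
(not a swscale function):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes needed to store one line of
 * `width` pixels at `bits_per_pixel` bits each, rounded up to whole
 * bytes. For monowhite/monoblack, bits_per_pixel is 1. */
size_t line_bytes(int width, int bits_per_pixel)
{
    assert(width >= 0 && bits_per_pixel > 0);
    return ((size_t)width * (size_t)bits_per_pixel + 7) / 8;
}
```

With this, a 17-pixel monowhite line needs 3 bytes, not 17.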
Although gcc guarantees 16-byte stack alignment, threads under WinXP
don't appear to be guaranteed to start with an aligned stack. So fix
the alignment.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
libswscale/swscale_unscaled.c:915:9: warning: new qualifiers in middle of multi-level non-const cast are unsafe
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
libswscale/swscale_unscaled.c:805:5: warning: passing argument 1 of ‘check_image_pointers’ from incompatible pointer type
libswscale/swscale_unscaled.c:774:12: note: expected ‘uint8_t **’ but argument is of type ‘const uint8_t * const*’
libswscale/swscale_unscaled.c:809:5: warning: passing argument 1 of ‘check_image_pointers’ discards qualifiers from pointer target type
libswscale/swscale_unscaled.c:774:12: note: expected ‘uint8_t **’ but argument is of type ‘uint8_t * const*’
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
When converting an RGB format to an RGB format with the same bits per
sample, the unscaled path performs the conversion on the whole buffer at
once. For non-multiple-of-16 BGR24 to RGB24 conversion this means that the
padding at the end of each line is converted too. Since the padding may be
of arbitrary length (e.g. 8 bytes), operating on the whole buffer produces
obviously wrong results.
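
As a rough sketch of the per-line approach (an assumption about the
shape of the fix, not the actual patch), converting each line
separately and stepping by the stride leaves the padding untouched.
The function name and signature below are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap B and R for `width` pixels on each of
 * `height` lines. Only width * 3 bytes per line are touched; the
 * padding between width * 3 and the stride is neither read nor written. */
void bgr24_to_rgb24_lines(const uint8_t *src, ptrdiff_t src_stride,
                          uint8_t *dst, ptrdiff_t dst_stride,
                          int width, int height)
{
    assert(width >= 0 && height >= 0);
    for (int y = 0; y < height; y++) {
        const uint8_t *s = src + y * src_stride;
        uint8_t *d = dst + y * dst_stride;
        for (int x = 0; x < width; x++) {
            d[3 * x + 0] = s[3 * x + 2];  /* R <- third source byte */
            d[3 * x + 1] = s[3 * x + 1];  /* G stays in the middle  */
            d[3 * x + 2] = s[3 * x + 0];  /* B <- first source byte */
        }
    }
}
```

Converting the whole buffer in one call is only safe when the stride
equals width * 3, i.e. when there is no padding at all.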
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>