Altivec can only load naturally aligned vectors. To handle possibly
unaligned data a second vector is loaded from an offset of the original
location and the data is recovered through a vector permutation.
Overreads are minimal if the offset for second load points to the last
element of data. This is 7 for loading eight 8-bit pixels and overreads
are reduced from 16 bytes to 8 bytes if the pixels are 64-bit aligned.
For unaligned pixels the overread is reduced from 23 bytes to 15 bytes
in the worst case.
To load unaligned vector data in the usual way, explicit vec_ld()
should be used rather than dereferencing a pointer to a vector type.
When the VSX extension is enabled, gcc may compile vector pointer
dereferences using the VSX lxvw4x instruction instead of the lvx
instruction typically used with Altivec/VMX. As the behaviour of
these instructions with unaligned addresses differs, it is important
that only lvx is used here.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Storing a single element from a vector where all elements have the same
value does not require an aligned destination. Which element is stored
depends on the alignment of the destination address, but since they all
have the same value, the result is the same regardless of the alignment.
Originally committed as revision 19696 to svn://svn.ffmpeg.org/ffmpeg/trunk
The original problem was that FSF and Apple gcc used a different syntax
for vector declarations, i.e. {} vs. (). Nowadays Apple gcc versions support
the standard {} syntax and versions that support {} are available on all
relevant Mac OS X versions. Thus the greater compatibility is no longer
worth cluttering the code with macros.
Originally committed as revision 14366 to svn://svn.ffmpeg.org/ffmpeg/trunk
This includes indentation changes, comment reformatting, consistent brace
placement and some prettyprinting.
Originally committed as revision 14318 to svn://svn.ffmpeg.org/ffmpeg/trunk