vector_fmul and vector_fmac_scalar are guaranteed that they can process in
batch of 16 elements, but their SSE versions only does 8 at a time.
Therefore, unroll them a bit.
299 to 261c for 256 elements in vector_fmac_scalar on Arrandale/Win64.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Support the cases where the first and last operand of
the XOP instruction are the same.
Also add vpmacsdql emulation.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This makes the generated assembly more internally consistent,
avoiding declaring two labels for the same function (for cases
where EXTERN_ASM is empty) and not declaring a separate unprefixed
label in other cases.
This also makes sure the .func and .type delcarations have the same
prefix. They have previously not been used on the platforms
that have prefixed symbols on arm (iOS), but gas-preprocessor
has recently started using the .func declarations for adding
.thumb_func declarations for such functions.
Signed-off-by: Martin Storsjö <martin@martin.st>
Without this a developer would have to add a include every time he
wants to benchmark some code, this is a moderate inconvenience.
This reverts the specific hunk from fb0c9d41d6
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
NEON and VFP are currently mandatory for all ARMv8 profiles. Both are
handled as extensions as far as cpuflags are concerned. This is
consistent with handling x86_64 which always has SSE2, but still
handles it as an extension.
The function macro always sets .align 2 before declaring the
function label (since 5c5e1ea3) and always sets the section to
.text (since 278caa6a).
The .align 5 before certain functions, added in fc252eba, were added
before .text and .align were added to the function macro and thus
became useless/unused when the function macro got them.
This restores the original intention, to align the loop entry
points.
Signed-off-by: Martin Storsjö <martin@martin.st>
The new code is faster and reuses the previous state in case of
multiple calls.
The previous code could easily end up in near-infinite loops,
if the difference between two clock() calls never was larger than
1.
This makes fate-parseutils finish in finite time when run in wine,
if CryptGenRandom isn't available (which e.g. isn't available if
targeting Windows RT/metro).
Patch originally by Michael Niedermayer but with some modifications
by Martin Storsjö.
Signed-off-by: Martin Storsjö <martin@martin.st>
The constant may change in libavutil but the library may be compiled
against an older version, thus rejecting a value which is otherwise
supported by the new libavutil.
INT_MAX is used here to denote the max allowed value for a sample format.
The opt-test code is changed to provide a valid reference example.
The constant may change in libavutil but the library may be compiled
against an older version, thus rejecting a value which is otherwise
supported by the new libavutil.
INT_MAX is used here to denote the max allowed value for a pixel format.
The opt-test code is changed to provide a valid reference example.
Previously when setting a pixel/sample format as a string range checks
were not performed. This is consistent with the
av_opt_set_pixel/sample_fmt() interface.
This reverts commit 792845e436, reversing
changes made to 1d6666a6b8.
Bumping libavutil requires all libraries that use libavutil to have their
major version bumped (yes breakage has been confirmed this is not a hypotheses)
One case of breakage is due to new types being added to AVOptions and
applications that linked to old libavutil and libswresample
then trying to use old libavutil (its soname changed so the old isnt updated)
and new swresample (its soame didnt change so it is updated)
the new swresample contains AVOption types that the old libavutil doesnt
know of thus the application attempting to access these avoptions
fails
AVOptions are used by all libs so the issue can potentially happen with
any other lib, libswresample was just the first that showed the problem
ive not checked if the other libs are affected currently by the same issue
or not
Also in addition to AVOptions, AVFrames are also defined in
libavutil, bumping it without all libs that use AVFrames could lead to
serious inconsistencies when 2 libs/app end up using 2 different libavutils
The alternative of bumping all is still possible after this revert, if it
turns out to be the preferred solution
Commit 41578f70cf changed the LLS API, which was
called from libavcodec. Thus using an old libavcodec with a new libavutil will
break.
All scheduled API changes are deferred to the next bump.