Automatically use VEX-encoding in AVX/AVX2/XOP/FMA3/FMA4
functions for all instructions that exists in a VEX-encoded
version.
This change makes it easier to extend existing code to use AVX2.
Also add support for AVX emulation of a few instructions that
were missing before.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Prevents a crash if the misaligned exception mask bit is
cleared for some reason.
Misaligned SSE functions are only used on AMD Phenom CPUs
and the benefit is miniscule. They also require modifying
the MXCSR control register and by removing those functions
we can get rid of that complexity altogether.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Store XMM6 and XMM7 in the shadow space in functions that
clobbers them. This way we don't have to adjust the stack
pointer as often, reducing the number of instructions as
well as code size.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
SWAP with >=3 named (rather than numbered) args
PERMUTE followed by SWAP with 2 named args
used to produce the wrong permutation
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Now RET checks whether it immediately follows a branch, so the
programmer dosen't have to keep track of that condition. REP_RET
is still needed manually when it's a branch target, but that's
much rarer.
The implementation involves lots of spurious labels, but that's OK
because we strip them.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Prior to this on msvc/icl there was no handling of deprecated functions
and the deprecated warning was disabled.
After enabling there are a number of warnings relating to the CRT and
the use of the non-secure versions of several functions. Defining
_CRT_SECURE_NO_WARNINGS silences these warnings.
Signed-off-by: Martin Storsjö <martin@martin.st>
When compiling with --enable-small, ripemd.o will weigh a few kilobytes less than
it used to before the previous commit.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The benchmark tests the speed of the following algorithms:
MD5, SHA-1, SHA-256, SHA-512, RIPEMD-160, AES-128.
It can optionally be built to perform the same benchmark on
other crypto libraries, for comparison purposes.
The supported libraries are:
- crypto: OpenSSL's libcrypto;
- gcrypt: GnuTLS's libgcrypt;
- tomcrypt: LibTomCrypt
To enable them, use this syntax:
make VERSUS=crypto+gcrypt+tomcrypt tools/crypto_bench
They do not need to have been enabled in configure.
The pixel format descriptors are set to more or less arbitrary
values as bayer formats do not fit in the descriptors structure.
These values are currently not used for bayer formats and thus
do not matter.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>