The latest version of ACLE splits __ARM_FEATURE_CRYPTO into two defines
to reflect that, starting ARMv8.2, the cryptography extension can
include {AES,PMULL} and {SHA1,SHA256} separately.
Also standardize on __ARM_NEON, which is the recommended symbol from
ACLE, and the only one defined on non-Apple aarch64 targets. Digging
through GCC history, __ARM_NEON__ is a bit older. __ARM_NEON was added
in GCC's 9e94a7fc5ab770928b9e6a2b74e292d35b4c94da from 2012, part of GCC
4.8.0.
I suspect we can stop paying attention to __ARM_NEON__ at this point,
but I've left both working for now. __ARM_FEATURE_{AES,SHA2} is definite
too new to fully replace __ARM_FEATURE_CRYPTO.
Tested on Linux that -march=armv8-a+aes now also drops the fallback AES
code. Previously, we would pick up -march=armv8-a+crypto, but not
-march=armv8-a+aes. Also tested that, on an OPENSSL_STATIC_ARMCAP build,
-march=armv8-a+sha2 sets the SHA-1 and SHA-256 features.
Change-Id: I749bdbc501ba2da23177ddb823547efcd77e5c98
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50847
Reviewed-by: Adam Langley <agl@google.com>
These symbols were not marked OPENSSL_EXPORT, so they weren't really
usable externally anyway. They're also very sensitive to various build
configuration toggles, which don't always get reflected into projects
that include our headers. Move them to crypto/internal.h.
Change-Id: I79a1fcf0b24e398d75a9cc6473bae28ec85cb835
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50846
Reviewed-by: Adam Langley <agl@google.com>
The stack consumption of the HRSS functions is causing issues in
stack-constrained environments. Therefore allocate many variables on the
heap. This means that several HRSS_ functions now allocate, and thus can
fail, where they couldn't before. Callers that ignore the return value
and don't have crash-on-failure mallocs will still be safe, although
things will fail to decrypt later on.
Somehow, this actually makes key generation _faster_ on my machine. (I
don't know. Better alignment? Fewer L1 collisions?) The other operations
are slightly slower, as expected.
Before:
Did 17390 HRSS generate operations in 3054088us (5694.0 ops/sec)
Did 225000 HRSS encap operations in 3000512us (74987.2 ops/sec)
Did 87000 HRSS decap operations in 3014525us (28860.3 ops/sec)
After:
Did 21300 HRSS generate operations in 3026637us (7037.5 ops/sec)
Did 221000 HRSS encap operations in 3008911us (73448.5 ops/sec)
Did 84000 HRSS decap operations in 3007622us (27929.0 ops/sec)
Change-Id: I2312df8909af7d8d250c7c483c65038123f21ad9
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/48345
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Also use a slightly more conservative pattern. Instead of aligning the
pointer as a uintptr_t and casting back, compute the offset and advance
in pointer space. C guarantees that casting from pointer to uintptr_t
and back gives the same pointer, but general integer-to-pointer
conversions are generally implementation-defined. GCC does define it in
the useful way, but this makes fewer dependencies.
Change-Id: I70c7af735e892fe7a8333b78b39d7b1f3f1cdbef
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/48405
Reviewed-by: Adam Langley <alangley@gmail.com>