strcasecmp is locale-sensitive, which can cause some mishaps.
This CL should be a no-op, because this call is only used on Android,
and bionic's strcasecmp seems to be ASCII-only. But using
OPENSSL_strcasecmp everywhere is easier to reason about.
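For illustration, an ASCII-only comparison in the spirit of
OPENSSL_strcasecmp might look like the sketch below (illustrative, not
the actual implementation). The classic mishap is a Turkish locale,
where tolower('I') can yield the dotless ı rather than 'i'.
```
#include <stddef.h>

static int ascii_tolower(int c) {
  return ('A' <= c && c <= 'Z') ? c - 'A' + 'a' : c;
}

static int my_strcasecmp(const char *a, const char *b) {
  for (size_t i = 0;; i++) {
    const int aa = ascii_tolower((unsigned char)a[i]);
    const int bb = ascii_tolower((unsigned char)b[i]);
    if (aa != bb) {
      return aa < bb ? -1 : 1;
    }
    if (aa == 0) {  // both strings ended at the same point
      return 0;
    }
  }
}
```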
Change-Id: Iecf9bc4da1bb3a4ab87b1e8b1d7f6f6c6e44aceb
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/52305
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
The ARMv8 assembly code in this commit is mostly taken from OpenSSL's `ecp_nistz256-armv8.pl` at 19e277dd19/crypto/ec/asm/ecp_nistz256-armv8.pl (see Note 1), adapted to the implementation in p256-x86_64.c.
Most of the assembly functions in `crypto/fipsmodule/ec/asm/p256-x86_64-asm.pl` needed to support that code have analogous functions in the imported OpenSSL ARMv8 Perl assembly implementation, with the exception of:
- ecp_nistz256_select_w5
- ecp_nistz256_select_w7
An implementation for these functions was added.
Summary of modifications to the imported code:
* Renamed to `p256-armv8-asm.pl`
* Modified the location of `arm-xlate.pl` and `arm_arch.h`
* Replaced the `scatter-gather subroutines` with `select subroutines`. The `select subroutines` are implemented for ARMv8 similarly to their x86_64 counterparts, `ecp_nistz256_select_w5` and `ecp_nistz256_select_w7` (see the sketch after this list).
* `ecp_nistz256_add` is removed because it was conflicting during the static build with the function of the same name in p256-nistz.c. The latter calls another assembly function, `ecp_nistz256_point_add`.
* `__ecp_nistz256_add` renamed to `__ecp_nistz256_add_to` to avoid the conflict with the function `ecp_nistz256_add` during the static build.
* On l. 924, `add sp,sp,#256`, the constant, 32*(12-4), is computed rather than left for the assembler to evaluate.
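For reference, a C sketch of what a w5 select subroutine does
(simplified: the real table is indexed so that 0 selects the point at
infinity, and the assembly operates on vector registers):
```
#include <stdint.h>
#include <string.h>

// One P-256 Jacobian point: three 256-bit coordinates as 64-bit limbs.
typedef struct { uint64_t limbs[12]; } P256_POINT;

static void select_w5(P256_POINT *out, const P256_POINT table[16],
                      uint64_t index) {
  memset(out, 0, sizeof(*out));
  for (uint64_t j = 0; j < 16; j++) {
    // All-ones mask iff j == index, computed without a secret-dependent
    // branch. Every entry is read, so the memory access pattern is
    // independent of |index|, unlike a scatter-gather lookup.
    uint64_t eq = j ^ index;
    uint64_t mask = ((eq | (0 - eq)) >> 63) - 1;
    for (int i = 0; i < 12; i++) {
      out->limbs[i] |= table[j].limbs[i] & mask;
    }
  }
}
```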
Other modifications:
* `beeu_mod_inverse_vartime()` was implemented for AArch64 in `p256_beeu-armv8-asm.pl` similarly to its implementation in `p256_beeu-x86_64-asm.pl`.
* The files containing `p256-x86_64` in their name were renamed to `p256-nistz`, since the functions and tests defined in them now also run on ARMv8, if enabled.
* Updated `delocate.go` and `delocate.peg` to handle the offset calculation in the assembly instructions.
* Regenerated `delocate.peg.go`.
Notes:
1- The last commit in the history of the file is in master only; the previous commits are in OpenSSL 3.0.1.
2- This change focuses on AArch64 (64-bit architecture of ARMv8). It does not support ARMv4 or ARMv7.
Testing the performance on an ARMv8 platform using -DCMAKE_BUILD_TYPE=Release:
Before:
```
Did 2596 ECDH P-256 operations in 1093956us (2373.0 ops/sec)
Did 6996 ECDSA P-256 signing operations in 1044630us (6697.1 ops/sec)
Did 2970 ECDSA P-256 verify operations in 1084848us (2737.7 ops/sec)
```
After:
```
Did 6699 ECDH P-256 operations in 1091684us (6136.4 ops/sec)
Did 20000 ECDSA P-256 signing operations in 1012944us (19744.4 ops/sec)
Did 7051 ECDSA P-256 verify operations in 1060000us (6651.9 ops/sec)
```
Change-Id: I9fdef12db365967a9264b5b32c07967b55ea48bd
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51805
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
There are paperwork reasons why it's useful to use the same hash
function in all cases. Thus unify on SHA-256, because the contexts
where SHA-512 is faster are faster overall and thus less sensitive.
Change-Id: I7a782a3adba4ace3257313a24dc8bc213b9d64ec
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/52165
Reviewed-by: David Benjamin <davidben@google.com>
The files no longer need to be patched because fiat-crypto now has its
own copy of our value barrier. It does, however, require syncing our
NO_ASM define with fiat's.
fiat-crypto is now licensed under any of MIT, BSD 1-clause, or Apache 2.
I've stuck with the MIT one as that's what we were previously importing.
No measurable perf difference before/after this CL, with GCC or Clang on
x86_64.
Change-Id: I2939fd517de37aabdea3ead49150135200a1b112
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/52045
Reviewed-by: Adam Langley <agl@google.com>
Our FIPS module only claims support for RSA signing/verification, and
|RSA_generate_key_fips| already performs a sign/verify pair-wise
consistency test (PCT). For ECDSA, |EC_KEY_generate_fips| performs a
sign/verify PCT too. But when |EC_KEY_generate_fips| is used for key
agreement, a sign/verify PCT may not be the correct one.
The FIPS IG[1], page 60, says:
> Though not a CAST, a pairwise consistency test (PCT) shall be
> conducted for every generated public and private key pair for the
> applicable approved algorithm (per ISO/IEC 19790:2012 Section
> 7.10.3.3). To further clarify, at minimum, the PCT that is required by
> the underlying algorithm standard (e.g. SP 800-56Arev3 or SP
> 800-56Brev2) shall be performed.
SP 800-56Ar3, page 36, says:
> For an ECC key pair (d, Q): Use the private key, d, along with the
> generator G and other domain parameters associated with the key pair,
> to compute dG (according to the rules of elliptic-curve arithmetic).
> Compare the result to the public key, Q. If dG is not equal to Q, then
> the pair-wise consistency test fails
But |EC_KEY_generate_fips| has always done that via
|EC_KEY_check_key|. So I believe that |EC_KEY_generate_fips| works for
either case.
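For illustration, a sketch of that PCT in terms of BoringSSL's public
EC API (|EC_KEY_check_key| performs the equivalent internally; the
helper name is made up):
```
#include <openssl/ec.h>
#include <openssl/ec_key.h>

static int ecc_pairwise_consistency_test(const EC_KEY *key) {
  const EC_GROUP *group = EC_KEY_get0_group(key);
  const BIGNUM *d = EC_KEY_get0_private_key(key);
  const EC_POINT *Q = EC_KEY_get0_public_key(key);
  EC_POINT *dG = EC_POINT_new(group);
  const int ok =
      dG != NULL &&
      EC_POINT_mul(group, dG, d, NULL, NULL, NULL) &&  // dG = d*G
      EC_POINT_cmp(group, dG, Q, NULL) == 0;           // dG == Q?
  EC_POINT_free(dG);
  return ok;
}
```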
This change documents that.
[1] FIPS 140-3 IG dated 2022-03-14 and with SHA-256
2f232f7f5839e3263284d71c35771c9fdf2e505b02813be999377030c56b37e4
Change-Id: I4b4e2ed92ae3d59e2f2404c41694abeb3eb283f4
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51988
Reviewed-by: David Benjamin <davidben@google.com>
We need a function that returns a version that links to a certificate.
Previously we have used the git hash as the version of our modules but
the source cannot contain its own hash. Thus this change defines a new
format for FIPS module versions which will be filled in once we're ready
to define a version.
Change-Id: Ie4641945119106bc47e8da94ed8a45a86abb6f92
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51986
Reviewed-by: David Benjamin <davidben@google.com>
BN_mod_sqrt implements the Tonelli–Shanks algorithm, which requires a
prime modulus. It was written such that, given a composite modulus, it
would sometimes loop forever. This change fixes the algorithm to always
terminate. However, callers must still pass a prime modulus for the
function to have a defined output.
In OpenSSL, this loop resulted in a DoS vulnerability, CVE-2022-0778.
BoringSSL is mostly unaffected by this. In particular, this case is not
reachable in BoringSSL from certificate and other ASN.1 elliptic curve
parsing code. Any impact in BoringSSL is limited to:
- Callers of EC_GROUP_new_curve_GFp that take untrusted curve parameters
- Callers of BN_mod_sqrt that take untrusted moduli
This CL updates documentation of those functions to clarify that callers
should not pass attacker-controlled values. Even with the infinite loop
fixed, doing so breaks preconditions and will give undefined output.
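For callers that must take a modulus from an untrusted source, a
defensive sketch (assuming the caller supplies the BN_CTX):
```
#include <openssl/bn.h>

static BIGNUM *mod_sqrt_checked(const BIGNUM *a, const BIGNUM *p,
                                BN_CTX *ctx) {
  int is_prime = 0;
  if (!BN_primality_test(&is_prime, p, BN_prime_checks, ctx,
                         0 /* no trial division */, NULL) ||
      !is_prime) {
    return NULL;  // reject composite (or even) moduli up front
  }
  return BN_mod_sqrt(NULL, a, p, ctx);  // newly allocated on success
}
```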
Change-Id: I64dc1220aaaaafedba02d2ac0e4232a3a0648160
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51925
Reviewed-by: Adam Langley <agl@google.com>
Reviewed-by: Martin Kreichgauer <martinkr@google.com>
Commit-Queue: Adam Langley <agl@google.com>
NIST publishes the PDFs of the security policy documents (although the
latest one is still missing). We include the docx sources to help others
who might be doing a rebrand certification of BoringCrypto.
Change-Id: I5c1511d53ec1d09d257d3aab1301486c364b660b
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51505
Reviewed-by: David Benjamin <davidben@google.com>
On Arm, our CRYPTO_is_*_capable functions check the corresponding
preprocessor symbol. This allows us to automatically drop dynamic checks
and fallback code when some capability is always available.
This CL does the same on x86, as well as consolidates our
OPENSSL_ia32cap_P checks in one place. Since this abstraction is
incompatible with some optimizations we do around OPENSSL_ia32cap_get()
in the FIPS module, I've marked the symbol __attribute__((const)), which
is enough to make GCC and Clang do the optimizations for us. (We already
do the same to DEFINE_BSS_GET.)
Most x86 platforms support a much wider range of capabilities, so this
is usually a no-op. But, notably, all x86_64 Mac hardware has SSSE3
available, so this allows us to statically drop an AES implementation.
(On macOS with -Wl,-dead_strip, this seems to trim 35080 bytes from the
bssl binary.) Configs like -march=native can also drop a bunch of code.
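A sketch of the pattern, using the internal OPENSSL_ia32cap_get()
accessor (the helper name and structure here are illustrative):
```
#include <stdint.h>

// OPENSSL_ia32cap_get() comes from crypto/internal.h; word 1 holds ECX
// of CPUID leaf 1, whose bit 9 is SSSE3.
static inline int crypto_is_SSSE3_capable(void) {
#if defined(__SSSE3__)
  return 1;  // statically known; the fallback below is dead-stripped
#else
  return (OPENSSL_ia32cap_get()[1] & (1u << 9)) != 0;
#endif
}
```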
Update-Note: This CL may break build environments that incorrectly mark
some instruction as statically available. This is unlikely to happen
with vector instructions like AVX, where the compiler could freely emit
them anyway. However, instructions like AES-NI might be set incorrectly.
Change-Id: I44fd715c9887d3fda7cb4519c03bee4d4f2c7ea6
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51548
Reviewed-by: Adam Langley <agl@google.com>
x86_64-mont5.pl checks for both BMI1 and BMI2, because the MULX path
also uses the ANDN instruction. Some history here from upstream:
a5bb5bca52f57021a4017521c55a6b3590bbba7a, dated 2013-10-03, added the
MULX path to x86_64-mont5.pl. At the time, the cpuid check was
BMI2+ADX. (MULX comes from BMI2.)
37de2b5c1e370b493932552556940eb89922b027, dated 2013-10-09, made
BN_mod_exp_mont_consttime prefer the MULX mont5 code over the AVX2 rsaz
code, with a matching BMI2+ADX cpuid check.
8fc8f486f7fa098c9fbb6a6ae399e3c6856e0d87, dated 2016-01-25, tweaked some
code to use the ANDN instruction, from BMI1. Correspondingly, it changed
the cpuid check to be BMI1+BMI2+ADX. The BN_mod_exp_mont_consttime check
was left unchanged.
This CL fixes our version of the BN_mod_exp_mont_consttime check to
match the assembly, by also checking BMI1. (This should be a no-op.
Presumably any processor with BMI2 also has BMI1.)
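For reference, a sketch of the corrected check (helper name
illustrative; bit positions per the Intel SDM, where word 2 of
OPENSSL_ia32cap_P holds EBX of CPUID leaf 7):
```
#include <stdint.h>

static int mulx_path_ok(void) {
  // BMI1 is bit 3, BMI2 is bit 8, ADX is bit 19 of CPUID.7.0:EBX.
  const uint32_t kNeeded = (1u << 3) | (1u << 8) | (1u << 19);
  return (OPENSSL_ia32cap_get()[2] & kNeeded) == kNeeded;
}
```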
Change-Id: Ib0cacc7e2be840d970460eef4dd9ded7fb24231c
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51547
Reviewed-by: Adam Langley <agl@google.com>
While our CI machines don't have these instructions, Intel SDE covers
them. Benchmarks on an AMD EPYC machine (VM on Google Compute Engine):
Before:
Did 13619000 SHA-256 (16 bytes) operations in 3000147us (72.6 MB/sec)
Did 3728000 SHA-256 (256 bytes) operations in 3000566us (318.1 MB/sec)
Did 920000 SHA-256 (1350 bytes) operations in 3002829us (413.6 MB/sec)
Did 161000 SHA-256 (8192 bytes) operations in 3017473us (437.1 MB/sec)
Did 81000 SHA-256 (16384 bytes) operations in 3029284us (438.1 MB/sec)
After:
Did 25442000 SHA-256 (16 bytes) operations in 3000010us (135.7 MB/sec) [+86.8%]
Did 10706000 SHA-256 (256 bytes) operations in 3000171us (913.5 MB/sec) [+187.2%]
Did 3119000 SHA-256 (1350 bytes) operations in 3000470us (1403.3 MB/sec) [+239.3%]
Did 572000 SHA-256 (8192 bytes) operations in 3001226us (1561.3 MB/sec) [+257.2%]
Did 289000 SHA-256 (16384 bytes) operations in 3006936us (1574.7 MB/sec) [+259.4%]
Although we don't currently have unwind tests in CI, I ran the unwind
tests manually on the same VM. They pass, after adding in the missing
.cfi_startproc and .cfi_endproc lines.
Change-Id: I45b91819e7dcc31e63813843129afa146d0c9d47
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51546
Reviewed-by: Adam Langley <agl@google.com>
fips_break_test.h is a bad name because generate_build_files.py thinks
that it's a test file, which it is, but one that's needed in the main
build. Thanks to Svilen Kanev for noting this.
That header doesn't particularly carry its weight. The idea was that
rebuilding the break test wouldn't need to rebuild everything if that
logic was isolated in its own header. But we only have to rebuild once
now, so whatever. There's already a block of crypto/internal.h with very
similar stuff; it can go there.
Change-Id: Ifb479eafd4df9a7aac4804cae06ba87257c77fc3
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51485
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Change-Id: I6dabeb0a9090a4ddcafc88a3bc53b2c28c30f14a
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51465
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Change-Id: Ief328bb2a8b6264226a89233c9fba0e4621de9d7
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51425
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
All FIPS testing is done with ACVP now. We can delete all the CAVP
stuff.
Change-Id: I459873474e40b0371f9cf760090a130ef9a90a8c
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51330
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
FIPS validation requires showing that the continuous and start-up tests
are effective by breaking them. Traditionally BoringSSL used #defines
that tweaked the expected values. However, 140-3 now requires that the
inputs be changed, not the expected outputs.
Also, the number of tests is going to increase. Since slower platforms
already took too long to compile BoringSSL n times (once for each test
to break), we want something faster too.
Therefore all the known-answer tests (KATs) are changed such that a Go
program can find and replace the input value in order to break them.
Thus we only need to recompile once to disable the integrity test.
The runtime tests still need a #define to break, but that #define is now
put in a header file so that only the module need be recompiled, not
everything as in the previous system.
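A sketch of the pattern, with hypothetical names: the KAT input is a
distinctive constant that the Go program can locate in the compiled
module and overwrite in place:
```
#include <stdint.h>
#include <string.h>

#include <openssl/sha.h>

// "KAT-input-marker": unique bytes the break tool searches for.
static const uint8_t kKATInput[16] = {
    0x4b, 0x41, 0x54, 0x2d, 0x69, 0x6e, 0x70, 0x75,
    0x74, 0x2d, 0x6d, 0x61, 0x72, 0x6b, 0x65, 0x72,
};

static int sha256_kat(const uint8_t expected[SHA256_DIGEST_LENGTH]) {
  uint8_t digest[SHA256_DIGEST_LENGTH];
  SHA256(kKATInput, sizeof(kKATInput), digest);
  // Corrupting kKATInput in the binary makes this comparison fail.
  return memcmp(digest, expected, sizeof(digest)) == 0;
}
```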
Change-Id: Ib621198e6ad02253e29af0ccd978e3c3830ad54c
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51329
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Builds that compile the FIPS stuff separately don't get this header from
other files.
Change-Id: I8a1b30ae360b08d4f4b9f804cd234998889477bc
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51405
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
AS10.20 requires that the self-test for the integrity algorithm pass
before the integrity check itself. IG 10.3.A requires an HMAC self-test
now. Therefore run these tests before the integrity check.
Since we also need the ability to run all self-tests, both SHA
self-tests and the HMAC test are run again when running self-tests.
I'm assuming that they're so fast that it doesn't matter.
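The resulting start-up order, sketched with hypothetical function
names:
```
int kat_sha256(void);
int kat_sha512(void);
int kat_hmac_sha256(void);
int module_integrity_check(void);
int remaining_self_tests(void);

static int fips_power_on_self_tests(void) {
  // AS10.20 / IG 10.3.A: self-test the integrity algorithms first.
  if (!kat_sha256() || !kat_sha512() || !kat_hmac_sha256()) {
    return 0;
  }
  // Only then HMAC the module's own code and compare.
  if (!module_integrity_check()) {
    return 0;
  }
  // Running the full self-tests later repeats the SHA/HMAC KATs; they
  // are cheap enough that the duplication doesn't matter.
  return remaining_self_tests();
}
```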
Change-Id: I6b23b6fd3cb6107edd7420bc8680780719bd41d2
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51328
Reviewed-by: David Benjamin <davidben@google.com>
The word “calculated” is two letters longer than “expected” and it's
nice to line up the outputs.
Change-Id: Idac70e62d98fbe26c430f03f4643ba295e40853d
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51327
Reviewed-by: David Benjamin <davidben@google.com>
The provision of FIPS that allowed the tests to be skipped based on a
flag-file has been removed in 140-3. Therefore we expect to run the fast
KATs on start-up, but to defer the slower ones until the functionality in
question is first used. So this change splits off the fast KATs and
removes support for skipping KATs based on a flag-file.
Change-Id: Ib24cb1739cfef93e4a1349d786a0257ee1083cfb
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51326
Reviewed-by: David Benjamin <davidben@google.com>
This matches our other free functions.
Fixed: 473
Change-Id: Ie147995c2f5b429f78e95cfc9a08ed54181af94e
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51005
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
140-3 says
> the zeroisation of protected and unprotected SSPs
> shall be performed in the following scenarios:
> ...
> For temporary value(s) generated during the integrity test of the
> module’s software or firmware upon completion of the integrity test.
(IG 9.7.B)
Change-Id: I911f294860bf33b13b2c997fc633c9bda777fc48
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50945
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
GCC's __ARMEL__ and __ARMEB__ defines denote little- and big-endian arm,
respectively. They are not defined on aarch64, which instead uses
__AARCH64EL__ and __AARCH64EB__.
However, OpenSSL's assembly originally used the 32-bit defines on both
platforms, and arm_arch.h even defines __ARMEL__ and __ARMEB__ on aarch64. This is
less portable and can even interfere with other headers, which use
__ARMEL__ to detect little-endian arm. (Our own base.h believes
__ARMEL__ implies 32-bit arm. We just happen to check __AARCH64EL__
first. base.h is probably also always included before arm_arch.h.)
Over time, the aarch64 assembly has switched to the correct defines,
such as in 32bbb62ea634239e7cb91d6450ba23517082bab6. This commit
finishes the job.
(There is an even more official endianness detector, __ARM_BIG_ENDIAN in
the Arm C Language Extensions. But I've stuck with the GCC ones here as
that would be a larger change.)
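The portable detection order then looks like this sketch (macro names
illustrative):
```
#if defined(__AARCH64EL__)
#define TARGET_ARM64_LE 1  // 64-bit little-endian
#elif defined(__AARCH64EB__)
#define TARGET_ARM64_BE 1  // 64-bit big-endian
#elif defined(__ARMEL__)
#define TARGET_ARM32_LE 1  // only now can __ARMEL__ imply 32-bit arm
#elif defined(__ARMEB__)
#define TARGET_ARM32_BE 1
#endif
```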
See also https://github.com/openssl/openssl/pull/17373
Change-Id: Ic04ff85782e6599cdeaeb33d12c2fa8edc882224
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50848
Reviewed-by: Adam Langley <agl@google.com>
These symbols were not marked OPENSSL_EXPORT, so they weren't really
usable externally anyway. They're also very sensitive to various build
configuration toggles, which don't always get reflected into projects
that include our headers. Move them to crypto/internal.h.
Change-Id: I79a1fcf0b24e398d75a9cc6473bae28ec85cb835
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50846
Reviewed-by: Adam Langley <agl@google.com>
This imports the changes to sha512-armv8.pl from
upstream's af0fcf7b4668218b24d9250b95e0b96939ccb4d1.
Tweaks needed:
- Add an explicit .text because we put .LK$BITS in .rodata for XOM
- .LK$bits and code are in separate sections, so use adrp/add instead of
plain adr
- Where glibc needs feature flags to *enable* pthread_rwlock, Apple
interprets _XOPEN_SOURCE as a request to *disable* Apple extensions.
Tighten the condition on the _XOPEN_SOURCE check.
Added support for macOS and Linux, tested manually on an ARM Mac and a
VM, respectively. Fuchsia and Windows do not currently have APIs to
expose this bit, so I've left in TODOs. Benchmarks from an Apple M1 Max:
Before:
Did 4647000 SHA-512 (16 bytes) operations in 1000103us (74.3 MB/sec)
Did 1614000 SHA-512 (256 bytes) operations in 1000379us (413.0 MB/sec)
Did 439000 SHA-512 (1350 bytes) operations in 1001694us (591.6 MB/sec)
Did 76000 SHA-512 (8192 bytes) operations in 1011821us (615.3 MB/sec)
Did 39000 SHA-512 (16384 bytes) operations in 1024311us (623.8 MB/sec)
After:
Did 10369000 SHA-512 (16 bytes) operations in 1000088us (165.9 MB/sec) [+123.1%]
Did 3650000 SHA-512 (256 bytes) operations in 1000079us (934.3 MB/sec) [+126.2%]
Did 1029000 SHA-512 (1350 bytes) operations in 1000521us (1388.4 MB/sec) [+134.7%]
Did 175000 SHA-512 (8192 bytes) operations in 1001874us (1430.9 MB/sec) [+132.5%]
Did 89000 SHA-512 (16384 bytes) operations in 1010314us (1443.3 MB/sec) [+131.4%]
(This doesn't seem to change the overall SHA-256 vs SHA-512 performance
question on ARM, when hashing perf matters. SHA-256 on the same chip
gets up to 2454.6 MB/s.)
In terms of build coverage, for now, we'll have build coverage
everywhere and test coverage on Chromium, which runs this code on macOS
CI. We should request a macOS ARM64 bot for our standalone CI. Longer
term, we need a QEMU-based builder to test various features. QEMU seems
to have pretty good coverage of all this, which will at least give us
Linux.
I haven't added an OPENSSL_STATIC_ARMCAP_SHA512 for now. Instead, we
just look at the standard __ARM_FEATURE_SHA512 define. Strangely, the
corresponding -march tag is not sha512. Neither GCC nor Clang has
-march=armv8-a+sha512. Instead, -march=armv8-a+sha3 implies both
__ARM_FEATURE_SHA3 and __ARM_FEATURE_SHA512! Yet everything else seems
to describe the SHA512 extension as separate from SHA3.
https://developer.arm.com/architectures/system-architectures/software-standards/acle
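Putting the static and dynamic checks together, a sketch (helper name
illustrative; the sysctl string follows Apple's hw.optional
convention):
```
#include <stddef.h>
#include <stdint.h>
#if defined(__APPLE__)
#include <sys/sysctl.h>
#endif

static int sha512_hw_capable(void) {
#if defined(__ARM_FEATURE_SHA512)
  return 1;  // -march=armv8-a+sha3 (not +sha512!) implies this define
#elif defined(__APPLE__)
  int32_t value = 0;
  size_t len = sizeof(value);
  return sysctlbyname("hw.optional.armv8_2_sha512", &value, &len, NULL,
                      0) == 0 &&
         value != 0;
#else
  return 0;  // Linux reads HWCAP_SHA512 via getauxval(AT_HWCAP).
#endif
}
```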
Update-Note: Consumers with a different build setup may need to
limit -D_XOPEN_SOURCE=700 to Linux or non-Apple platforms. Otherwise,
<sys/types.h> won't define some typedef needed by <sys/sysctl.h>. If you
see a build error about u_char, etc., being undefined in some system
header, that is probably the cause.
Change-Id: Ia213d3796b84c71b7966bb68e0aec92e5d7d26f0
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50807
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
This imports 753316232243ccbf86b96c1c51ffcb41651d9ad5,
46f4e1bec51dc96fa275c168752aa34359d9ee51, and
32bbb62ea634239e7cb91d6450ba23517082bab6.
The last commit fixes a detection of big-endian aarch64 in the kernel,
which we do not support at all, but is imported to reduce the upstream
diff. Though it points out a messy part of arm_arch.h: __ARMEL__ and
__ARMEB__ are specific to 32-bit ARM. __AARCH64EB__ and __AARCH64EL__
are the 64-bit ones. But OpenSSL's arm_arch.h defines __ARME[LB]__ for
aarch64 and uses it in perlasm. We should fix the files upstream to
look at the aarch64 ones. (Indeed our own base.h assumes __ARMEL__
implies 32-bit ARM.)
Change-Id: I6c2241e103a97e8c3599cdfa43dcc6f30d4a2581
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50806
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
We currently have two aarch64 SHA-256 implementations: one using
general-purpose registers and one using the SHA-256 extensions.
Upstream's 866e505e0d663158b0fe63a7fb7455eebacc6470 added a NEON
version.
This CL syncs the transforms at the bottom of the file, to avoid
potential mistranslations in future imports. It doesn't change the
output for our current assembly.
This CL skips the NEON implementation itself for now. It only helps
processors without SHA-256 instructions. While Android does not
actually mandate the cryptography extensions on ARMv8, most devices
have them.
Additionally, this file does CPU dispatch in assembly, without taking
advantage of static information. We'd end up shipping both fallback
SHA-256 implementations. This is particularly silly because NEON is
mandatory in ARMv8-A anyway. (Does anyone build us on -R or -M? Probably
not?)
(If we later have a reason to import it, the binary size cost isn't that
significant. Moreover, the NEON fallback is actually slightly smaller
than the non-NEON fallback, so if we move CPU dispatch to C, importing
may even be worthwhile.)
Change-Id: I3c8ca6e77e4e6d1299f975c407cbcf4c9c240523
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50805
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
GCC has a warning that complains about even more type mismatches in
printf. Some of these are a bit messy and will be fixed in separate CLs.
This covers the easy ones.
The .*s stuff is unfortunate, but printf has no size_t-clean string
printer. ALPN protocol lengths are bound by uint8_t, so it doesn't
really matter.
The IPv6 printing one is obnoxious and arguably a false positive. It's
really a C language flaw: all types smaller than int get converted to
int when you do arithmetic. So in the example below, the shift doesn't
overflow, because the arithmetic happens over int, but the overall
result is then stored as an int rather than a uint16_t.
uint8_t a, b;
(a << 8) | b
On the one hand, this fixes a "missing" cast to uint16_t before the
shift. At the same time, the incorrect final type means passing it to
%x, which expects unsigned int. The compiler has forgotten this value
actually fits in uint16_t and flags a warning. Mitigate this by storing
in a uint16_t first.
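A minimal sketch of the mitigation:
```
#include <stdint.h>
#include <stdio.h>

void print_be16(uint8_t a, uint8_t b) {
  // The arithmetic still happens over int, but materializing the result
  // as uint16_t restores the width the compiler "forgot".
  uint16_t v = (uint16_t)((a << 8) | b);
  printf("%x\n", v);  // v promotes to int, representable in unsigned too
}
```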
The story doesn't quite end here. Arguments passed to variadic functions
go through integer promotion[0], so the argument is still passed to
snprintf as an int! But then va_arg allows for a signedness mismatch[1],
provided the value is representable in both types. The combination means
that %x, though actually paired with unsigned, also accept uint8_t and
uint16_t, because those are guaranteed to promote to an int that meets
[1]. GCC recognizes [1] applies here.
(There's also PRIx16, but that's a bit tedious to use and, in glibc, is
defined as plain "x" anyway.)
[0] https://en.cppreference.com/w/c/language/conversion#Default_argument_promotions
[1] https://en.cppreference.com/w/c/variadic/va_arg
Bug: 450
Change-Id: Ic1d41356755a18ab922956dd2e07b560470341f4
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50765
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
OpenSSL 1.1.0 made this structure opaque. I don't think we particularly
need to make it opaque, but external code uses it. Also add
RSA_test_flags.
Change-Id: I136d38e72ec4664c78f4d1720ec691f5760090c1
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50605
Reviewed-by: Adam Langley <agl@google.com>
The non-_ex EVP_CIPHER_CTX Final functions are a bit interesting. Unlike
EVP_DigestFinal(_ex), where the non-_ex version calls EVP_MD_CTX_cleanup
for you, the EVP_CIPHER_CTX ones do not automatically cleanup.
EVP_CipherFinal and EVP_CipherFinal_ex are identical in all releases
where they exist.
This appears to date to OpenSSL 0.9.7:
Prior to OpenSSL 0.9.7, EVP_MD_CTX and EVP_CIPHER_CTX did not use void*
data fields. Instead, they just had a union of context structures for
every algorithm OpenSSL implemented.
EVP_MD_CTX was truly cleanup-less. There were no EVP_MD_CTX_init or
EVP_MD_CTX_cleanup functions at all. EVP_DigestInit filled things in
without reference to the previous state. EVP_DigestFinal didn't cleanup
because there was nothing to cleanup.
EVP_CIPHER_CTX was also a union, but for some reason did include
EVP_CIPHER_CTX_init and EVP_CIPHER_CTX_cleanup. EVP_CIPHER_CTX_init
seemed to be optional: EVP_CipherInit with non-NULL EVP_CIPHER similarly
didn't reference the previous state. EVP_CipherFinal did not call
EVP_CIPHER_CTX_cleanup, but EVP_CIPHER_CTX_cleanup didn't do anything.
It called an optional cleanup hook on the EVP_CIPHER, but as far as I
can tell, no EVP_CIPHER implemented it.
Then OpenSSL 0.9.7 introduced ENGINE. The union didn't work anymore, so
EVP_MD_CTX and EVP_CIPHER_CTX contained void* with allocated
type-specific data. They introduced EVP_MD_CTX_init and
EVP_MD_CTX_cleanup. For (imperfect!) backwards compatibility,
EVP_DigestInit and EVP_DigestFinal transparently called init/cleanup for
you. EVP_DigestInit_ex and EVP_DigestFinal_ex became the more flexible
versions that left init/cleanup to the caller.
EVP_CIPHER_CTX got the same treatment with
EVP_CipherInit/EVP_CipherInit_ex, but *not*
EVP_CipherFinal/EVP_CipherFinal_ex; those two did the same thing as each other. The
history seems to be that 581f1c84940d77451c2592e9fa470893f6c3c3eb
introduced the Final/Final_ex split, with the former doing an
auto-cleanup, then 544a2aea4ba1fad76f0802fb70d92a5a8e6ad85a undid it.
Looks like the motivation is that EVP_CIPHER_CTX objects are often
reused to do multiple operations with a single key. But they missed that
the split functions are now unnecessary.
Amusingly, OpenSSL's documentation incorrectly said that EVP_CipherFinal
cleaned up after the call until it was fixed in
538860a3ce0b9fd142a7f1a62e597cccb74475d3. The fix says that some
releases cleaned up, but there were, as far as I can tell, no actual
releases with that behavior.
I've put the new Final functions in the deprecated section, purely
because there is no sense in recommending two different versions of the
same function to users, and Final_ex seems to be more popular. But there
isn't actually anything wrong with plain Final.
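For reference, a usage sketch against the public API; either Final
variant works, and cleanup is the caller's job either way:
```
#include <stdint.h>

#include <openssl/evp.h>

// |out| must have room for |in_len| plus one block of padding.
static int encrypt_once(const uint8_t key[16], const uint8_t iv[16],
                        const uint8_t *in, size_t in_len, uint8_t *out,
                        size_t *out_len) {
  EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
  int len1 = 0, len2 = 0;
  const int ok =
      ctx != NULL &&
      EVP_EncryptInit_ex(ctx, EVP_aes_128_cbc(), NULL, key, iv) &&
      EVP_EncryptUpdate(ctx, out, &len1, in, (int)in_len) &&
      // _ex or not, Final does not clean up |ctx|.
      EVP_EncryptFinal_ex(ctx, out + len1, &len2);
  EVP_CIPHER_CTX_free(ctx);  // cleanup is always the caller's job
  if (ok) {
    *out_len = (size_t)len1 + (size_t)len2;
  }
  return ok;
}
```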
Change-Id: Ic2bfda48fdcf30f292141add8c5f745348036852
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50485
Reviewed-by: Adam Langley <agl@google.com>
Although the compiler will hopefully optimize it out, this is
technically a VLA. The new Android NDK now warns about this.
Change-Id: Ib9f38dc73c40e90ab61105f29a635c453f1477a1
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50185
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
We have a ton of per-file rotation functions, often with generic names
that do not tell you whether they are uint32_t vs uint64_t, or rotl vs
rotr.
Additionally, (x >> r) | (x << (32 - r)) is UB at r = 0.
(x >> r) | (x << ((-r) & 31)) works for 0 <= r < 32, which is what
cast.c does. GCC and Clang recognize this pattern as a rotate, but MSVC
doesn't. MSVC does, however, provide functions for this.
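A sketch of the UB-free pattern (name illustrative; on MSVC one would
call _rotr instead):
```
#include <stdint.h>

static inline uint32_t rotr_u32(uint32_t x, int r) {
  // For 0 <= r < 32: at r = 0 both shift counts are zero, so neither
  // shift ever uses the undefined count of 32.
  return (x >> r) | (x << ((-r) & 31));
}
```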
We usually rotate by a non-zero constant, which makes this moot, but
rotation comes up often enough that it's worth extracting out. Some
particular changes to call out:
- I've switched sha256.c from rotl to rotr. There was a comment
explaining why it differed from the specification. Now that we have
both functions, it's simpler to just match the specification.
- I've dropped all the inline assembly from sha512.c. Compilers should
be able to recognize rotations in 2021.
Change-Id: Ia1030e8bfe94dad92514ed1c28777447c48b82f9
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/49765
Reviewed-by: Adam Langley <agl@google.com>
There is an obvious bug there: upon entry to 'vpaes_cbc_encrypt' the LR
may get signed (pointer authentication). However, on the 'cbc_abort'
path the LR is not going to be unsigned (authenticated) before 'ret' is
executed.
Found by manual code inspection.
Co-authored-by: Russ Butler <russ.butler@arm.com>
Change-Id: I646cdfaee28db59aafbbd412d4bb6ba022eff15b
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/49605
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
GCC 11.2.1 reportedly warns that CTR_DRBG_init may be passed an
uninitialized personalization buffer. This appears to be a false
positive, because personalization_len will be zero. But it's easy enough
to zero-initialize it, so silence the warning.
Bug: 432
Change-Id: I20f6b74e09f19962e8cae37d45090ff3d1c0215d
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/49245
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
The bulk of RSA_check_key is spent in bn_div_consttime, which is a naive
but constant-time long-division algorithm for the few places that divide
by a secret even divisor: RSA keygen and RSA import. RSA import is
somewhat performance-sensitive, so pick some low-hanging fruit:
The main observation is that, in all but one call site, the bit width of
the divisor is public. That means, for an N-bit divisor, we can skip the
first N-1 iterations of long division because an N-1-bit remainder
cannot exceed the N-bit divisor.
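A uint64_t sketch of the idea (the real code is constant-time over
BIGNUMs and masks the subtraction rather than branching on it):
```
#include <stdint.h>

// num mod div, where div_bits is a public bound with
// 2^(div_bits - 1) <= div < 2^div_bits and 1 <= div_bits <= 64.
static uint64_t mod_pub_divisor(uint64_t num, uint64_t div,
                                int div_bits) {
  int skip = div_bits - 1;  // iterations that provably cannot subtract
  // Load the top div_bits - 1 dividend bits at once; they are still
  // smaller than div, so no subtraction was missed.
  uint64_t rem = skip ? num >> (64 - skip) : 0;
  for (int i = 63 - skip; i >= 0; i--) {
    rem = (rem << 1) | ((num >> i) & 1);
    if (rem >= div) {
      rem -= div;
    }
  }
  return rem;
}
```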
One minor nuisance is bn_lcm_consttime, used in RSA keygen, which has a
case that does *not* have a public bit width. Applying the optimization
there would leak information. I've implemented this as an optional public
lower bound on num_bits(divisor), which all but that call fills in.
Before:
Did 5060 RSA 2048 private key parse operations in 1058526us (4780.2 ops/sec)
Did 1551 RSA 4096 private key parse operations in 1082343us (1433.0 ops/sec)
After:
Did 11532 RSA 2048 private key parse operations in 1084145us (10637.0 ops/sec) [+122.5%]
Did 3542 RSA 4096 private key parse operations in 1036374us (3417.7 ops/sec) [+138.5%]
Bug: b/192484677
Change-Id: I893ebb8886aeb8200a1a365673b56c49774221a2
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/49106
Reviewed-by: Adam Langley <agl@google.com>