strcasecmp is locale-sensitive, which can cause some mishaps.
This CL should be a no-op, because this call is only used on Android,
and bionic's strcasecmp seems to be ASCII-only. But using
OPENSSL_strcasecmp everywhere is easier to reason about.
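For illustration, an ASCII-only comparison in the spirit of
OPENSSL_strcasecmp might look like the sketch below (illustrative, not
the actual implementation). The classic mishap is a Turkish locale,
where tolower('I') can yield the dotless ı rather than 'i'.
```
#include <stddef.h>

static int ascii_tolower(int c) {
  return ('A' <= c && c <= 'Z') ? c - 'A' + 'a' : c;
}

static int my_strcasecmp(const char *a, const char *b) {
  for (size_t i = 0;; i++) {
    const int aa = ascii_tolower((unsigned char)a[i]);
    const int bb = ascii_tolower((unsigned char)b[i]);
    if (aa != bb) {
      return aa < bb ? -1 : 1;
    }
    if (aa == 0) {  // both strings ended at the same point
      return 0;
    }
  }
}
```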
Change-Id: Iecf9bc4da1bb3a4ab87b1e8b1d7f6f6c6e44aceb
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/52305
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
The ARMv8 assembly code in this commit is mostly taken from OpenSSL's `ecp_nistz256-armv8.pl` at 19e277dd19/crypto/ec/asm/ecp_nistz256-armv8.pl (see Note 1), adapted to the implementation in p256-x86_64.c.
Most of the assembly functions in `crypto/fipsmodule/ec/asm/p256-x86_64-asm.pl` needed to support that code have analogous functions in the imported OpenSSL ARMv8 Perl assembly implementation, with the exception of:
- ecp_nistz256_select_w5
- ecp_nistz256_select_w7
An implementation for these functions was added.
Summary of modifications to the imported code:
* Renamed to `p256-armv8-asm.pl`
* Modified the location of `arm-xlate.pl` and `arm_arch.h`
* Replaced the `scatter-gather subroutines` with `select subroutines`. The `select subroutines` are implemented for ARMv8 similarly to their x86_64 counterparts, `ecp_nistz256_select_w5` and `ecp_nistz256_select_w7` (see the sketch after this list).
* `ecp_nistz256_add` is removed because it was conflicting during the static build with the function of the same name in p256-nistz.c. The latter calls another assembly function, `ecp_nistz256_point_add`.
* `__ecp_nistz256_add` renamed to `__ecp_nistz256_add_to` to avoid the conflict with the function `ecp_nistz256_add` during the static build.
* On l. 924, `add sp,sp,#256`, the constant, 32*(12-4), is computed rather than left for the assembler to evaluate.
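For reference, a C sketch of what a w5 select subroutine does
(simplified: the real table is indexed so that 0 selects the point at
infinity, and the assembly operates on vector registers):
```
#include <stdint.h>
#include <string.h>

// One P-256 Jacobian point: three 256-bit coordinates as 64-bit limbs.
typedef struct { uint64_t limbs[12]; } P256_POINT;

static void select_w5(P256_POINT *out, const P256_POINT table[16],
                      uint64_t index) {
  memset(out, 0, sizeof(*out));
  for (uint64_t j = 0; j < 16; j++) {
    // All-ones mask iff j == index, computed without a secret-dependent
    // branch. Every entry is read, so the memory access pattern is
    // independent of |index|, unlike a scatter-gather lookup.
    uint64_t eq = j ^ index;
    uint64_t mask = ((eq | (0 - eq)) >> 63) - 1;
    for (int i = 0; i < 12; i++) {
      out->limbs[i] |= table[j].limbs[i] & mask;
    }
  }
}
```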
Other modifications:
* `beeu_mod_inverse_vartime()` was implemented for AArch64 in `p256_beeu-armv8-asm.pl` similarly to its implementation in `p256_beeu-x86_64-asm.pl`.
* The files containing `p256-x86_64` in their name were renamed to `p256-nistz`, since the functions and tests defined in them now also run on ARMv8, if enabled.
* Updated `delocate.go` and `delocate.peg` to handle the offset calculation in the assembly instructions.
* Regenerated `delocate.peg.go`.
Notes:
1- The last commit in the history of the file is in master only; the previous commits are in OpenSSL 3.0.1.
2- This change focuses on AArch64 (64-bit architecture of ARMv8). It does not support ARMv4 or ARMv7.
Testing the performance on an ARMv8 platform using -DCMAKE_BUILD_TYPE=Release:
Before:
```
Did 2596 ECDH P-256 operations in 1093956us (2373.0 ops/sec)
Did 6996 ECDSA P-256 signing operations in 1044630us (6697.1 ops/sec)
Did 2970 ECDSA P-256 verify operations in 1084848us (2737.7 ops/sec)
```
After:
```
Did 6699 ECDH P-256 operations in 1091684us (6136.4 ops/sec)
Did 20000 ECDSA P-256 signing operations in 1012944us (19744.4 ops/sec)
Did 7051 ECDSA P-256 verify operations in 1060000us (6651.9 ops/sec)
```
Change-Id: I9fdef12db365967a9264b5b32c07967b55ea48bd
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51805
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
There are paperwork reasons why it's useful to use the same hash
function in all cases. Thus unify on SHA-256, because the contexts
where SHA-512 is faster are faster overall and thus less sensitive.
Change-Id: I7a782a3adba4ace3257313a24dc8bc213b9d64ec
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/52165
Reviewed-by: David Benjamin <davidben@google.com>
The files no longer need to be patched because fiat-crypto now has its
own copy of our value barrier. It does, however, require syncing our
NO_ASM define with fiat's.
fiat-crypto is now licensed under any of MIT, BSD 1-clause, or Apache 2.
I've stuck with the MIT one as that's what we were previously importing.
No measurable perf difference before/after this CL, with GCC or Clang on
x86_64.
Change-Id: I2939fd517de37aabdea3ead49150135200a1b112
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/52045
Reviewed-by: Adam Langley <agl@google.com>
Our FIPS module only claims support for RSA signing/verification, and
|RSA_generate_key_fips| already performs a sign/verify pair-wise
consistency test (PCT). For ECDSA, |EC_KEY_generate_fips| performs a
sign/verify PCT too. But when |EC_KEY_generate_fips| is used for key
agreement, a sign/verify PCT may not be the correct one.
The FIPS IG[1], page 60, says:
> Though not a CAST, a pairwise consistency test (PCT) shall be
> conducted for every generated public and private key pair for the
> applicable approved algorithm (per ISO/IEC 19790:2012 Section
> 7.10.3.3). To further clarify, at minimum, the PCT that is required by
> the underlying algorithm standard (e.g. SP 800-56Arev3 or SP
> 800-56Brev2) shall be performed.
SP 800-56Ar3, page 36, says:
> For an ECC key pair (d, Q): Use the private key, d, along with the
> generator G and other domain parameters associated with the key pair,
> to compute dG (according to the rules of elliptic-curve arithmetic).
> Compare the result to the public key, Q. If dG is not equal to Q, then
> the pair-wise consistency test fails
But |EC_KEY_generate_fips| has always done that via
|EC_KEY_check_key|. So I believe that |EC_KEY_generate_fips| works for
either case.
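For illustration, a sketch of that PCT in terms of BoringSSL's public
EC API (|EC_KEY_check_key| performs the equivalent internally; the
helper name is made up):
```
#include <openssl/ec.h>
#include <openssl/ec_key.h>

static int ecc_pairwise_consistency_test(const EC_KEY *key) {
  const EC_GROUP *group = EC_KEY_get0_group(key);
  const BIGNUM *d = EC_KEY_get0_private_key(key);
  const EC_POINT *Q = EC_KEY_get0_public_key(key);
  EC_POINT *dG = EC_POINT_new(group);
  const int ok =
      dG != NULL &&
      EC_POINT_mul(group, dG, d, NULL, NULL, NULL) &&  // dG = d*G
      EC_POINT_cmp(group, dG, Q, NULL) == 0;           // dG == Q?
  EC_POINT_free(dG);
  return ok;
}
```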
This change documents that.
[1] FIPS 140-3 IG dated 2022-03-14 and with SHA-256
2f232f7f5839e3263284d71c35771c9fdf2e505b02813be999377030c56b37e4
Change-Id: I4b4e2ed92ae3d59e2f2404c41694abeb3eb283f4
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51988
Reviewed-by: David Benjamin <davidben@google.com>
We need a function that returns a version that links to a certificate.
Previously we have used the git hash as the version of our modules but
the source cannot contain its own hash. Thus this change defines a new
format for FIPS module versions which will be filled in once we're ready
to define a version.
Change-Id: Ie4641945119106bc47e8da94ed8a45a86abb6f92
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51986
Reviewed-by: David Benjamin <davidben@google.com>
BN_mod_sqrt implements the Tonelli–Shanks algorithm, which requires a
prime modulus. It was written such that, given a composite modulus, it
would sometimes loop forever. This change fixes the algorithm to always
terminate. However, callers must still pass a prime modulus for the
function to have a defined output.
In OpenSSL, this loop resulted in a DoS vulnerability, CVE-2022-0778.
BoringSSL is mostly unaffected by this. In particular, this case is not
reachable in BoringSSL from certificate and other ASN.1 elliptic curve
parsing code. Any impact in BoringSSL is limited to:
- Callers of EC_GROUP_new_curve_GFp that take untrusted curve parameters
- Callers of BN_mod_sqrt that take untrusted moduli
This CL updates documentation of those functions to clarify that callers
should not pass attacker-controlled values. Even with the infinite loop
fixed, doing so breaks preconditions and will give undefined output.
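For callers that must take a modulus from an untrusted source, a
defensive sketch (assuming the caller supplies the BN_CTX):
```
#include <openssl/bn.h>

static BIGNUM *mod_sqrt_checked(const BIGNUM *a, const BIGNUM *p,
                                BN_CTX *ctx) {
  int is_prime = 0;
  if (!BN_primality_test(&is_prime, p, BN_prime_checks, ctx,
                         0 /* no trial division */, NULL) ||
      !is_prime) {
    return NULL;  // reject composite (or even) moduli up front
  }
  return BN_mod_sqrt(NULL, a, p, ctx);  // newly allocated on success
}
```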
Change-Id: I64dc1220aaaaafedba02d2ac0e4232a3a0648160
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51925
Reviewed-by: Adam Langley <agl@google.com>
Reviewed-by: Martin Kreichgauer <martinkr@google.com>
Commit-Queue: Adam Langley <agl@google.com>
NIST publishes the PDFs of the security policy documents (although the
latest one is still missing). We include the docx sources to help others
who might be doing a rebrand certification of BoringCrypto.
Change-Id: I5c1511d53ec1d09d257d3aab1301486c364b660b
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51505
Reviewed-by: David Benjamin <davidben@google.com>
On Arm, our CRYPTO_is_*_capable functions check the corresponding
preprocessor symbol. This allows us to automatically drop dynamic checks
and fallback code when some capability is always available.
This CL does the same on x86, as well as consolidates our
OPENSSL_ia32cap_P checks in one place. Since this abstraction is
incompatible with some optimizations we do around OPENSSL_ia32cap_get()
in the FIPS module, I've marked the symbol __attribute__((const)), which
is enough to make GCC and Clang do the optimizations for us. (We already
do the same to DEFINE_BSS_GET.)
Most x86 platforms support a much wider range of capabilities, so this
is usually a no-op. But, notably, all x86_64 Mac hardware has SSSE3
available, so this allows us to statically drop an AES implementation.
(On macOS with -Wl,-dead_strip, this seems to trim 35080 bytes from the
bssl binary.) Configs like -march=native can also drop a bunch of code.
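A sketch of the pattern, using the internal OPENSSL_ia32cap_get()
accessor (the helper name and structure here are illustrative):
```
#include <stdint.h>

// OPENSSL_ia32cap_get() comes from crypto/internal.h; word 1 holds ECX
// of CPUID leaf 1, whose bit 9 is SSSE3.
static inline int crypto_is_SSSE3_capable(void) {
#if defined(__SSSE3__)
  return 1;  // statically known; the fallback below is dead-stripped
#else
  return (OPENSSL_ia32cap_get()[1] & (1u << 9)) != 0;
#endif
}
```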
Update-Note: This CL may break build environments that incorrectly mark
some instruction as statically available. This is unlikely to happen
with vector instructions like AVX, where the compiler could freely emit
them anyway. However, instructions like AES-NI might be set incorrectly.
Change-Id: I44fd715c9887d3fda7cb4519c03bee4d4f2c7ea6
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51548
Reviewed-by: Adam Langley <agl@google.com>
x86_64-mont5.pl checks for both BMI1 and BMI2, because the MULX path
also uses the ANDN instruction. Some history here from upstream:
a5bb5bca52f57021a4017521c55a6b3590bbba7a, dated 2013-10-03, added the
MULX path to x86_64-mont5.pl. At the time, the cpuid check was
BMI2+ADX. (MULX comes from BMI2.)
37de2b5c1e370b493932552556940eb89922b027, dated 2013-10-09, made
BN_mod_exp_mont_consttime prefer the MULX mont5 code over the AVX2 rsaz
code, with a matching BMI2+ADX cpuid check.
8fc8f486f7fa098c9fbb6a6ae399e3c6856e0d87, dated 2016-01-25, tweaked some
code to use the ANDN instruction, from BMI1. Correspondingly, it changed
the cpuid check to be BMI1+BMI2+ADX. The BN_mod_exp_mont_consttime check
was left unchanged.
This CL fixes our version of the BN_mod_exp_mont_consttime check to
match the assembly, by also checking BMI1. (This should be a no-op.
Presumably any processor with BMI2 also has BMI1.)
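For reference, a sketch of the corrected check (helper name
illustrative; bit positions per the Intel SDM, where word 2 of
OPENSSL_ia32cap_P holds EBX of CPUID leaf 7):
```
#include <stdint.h>

static int mulx_path_ok(void) {
  // BMI1 is bit 3, BMI2 is bit 8, ADX is bit 19 of CPUID.7.0:EBX.
  const uint32_t kNeeded = (1u << 3) | (1u << 8) | (1u << 19);
  return (OPENSSL_ia32cap_get()[2] & kNeeded) == kNeeded;
}
```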
Change-Id: Ib0cacc7e2be840d970460eef4dd9ded7fb24231c
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51547
Reviewed-by: Adam Langley <agl@google.com>
While our CI machines don't have these instructions, Intel SDE covers
them. Benchmarks on an AMD EPYC machine (VM on Google Compute Engine):
Before:
Did 13619000 SHA-256 (16 bytes) operations in 3000147us (72.6 MB/sec)
Did 3728000 SHA-256 (256 bytes) operations in 3000566us (318.1 MB/sec)
Did 920000 SHA-256 (1350 bytes) operations in 3002829us (413.6 MB/sec)
Did 161000 SHA-256 (8192 bytes) operations in 3017473us (437.1 MB/sec)
Did 81000 SHA-256 (16384 bytes) operations in 3029284us (438.1 MB/sec)
After:
Did 25442000 SHA-256 (16 bytes) operations in 3000010us (135.7 MB/sec) [+86.8%]
Did 10706000 SHA-256 (256 bytes) operations in 3000171us (913.5 MB/sec) [+187.2%]
Did 3119000 SHA-256 (1350 bytes) operations in 3000470us (1403.3 MB/sec) [+239.3%]
Did 572000 SHA-256 (8192 bytes) operations in 3001226us (1561.3 MB/sec) [+257.2%]
Did 289000 SHA-256 (16384 bytes) operations in 3006936us (1574.7 MB/sec) [+259.4%]
Although we don't currently have unwind tests in CI, I ran the unwind
tests manually on the same VM. They pass, after adding in the missing
.cfi_startproc and .cfi_endproc lines.
Change-Id: I45b91819e7dcc31e63813843129afa146d0c9d47
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51546
Reviewed-by: Adam Langley <agl@google.com>
fips_break_test.h is a bad name because generate_build_files.py thinks
that it's a test file, which it is, but one that's needed in the main
build. Thanks to Svilen Kanev for noting this.
That header doesn't particularly carry its weight. The idea was that
rebuilding the break test wouldn't need to rebuild everything if that
logic was isolated in its own header. But we only have to rebuild once
now, so whatever. There's already a block of crypto/internal.h with very
similar stuff; it can go there.
Change-Id: Ifb479eafd4df9a7aac4804cae06ba87257c77fc3
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51485
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Change-Id: I6dabeb0a9090a4ddcafc88a3bc53b2c28c30f14a
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51465
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Change-Id: Ief328bb2a8b6264226a89233c9fba0e4621de9d7
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51425
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
All FIPS testing is done with ACVP now. We can delete all the CAVP
stuff.
Change-Id: I459873474e40b0371f9cf760090a130ef9a90a8c
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51330
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
FIPS validation requires showing that the continuous and start-up tests
are effective by breaking them. Traditionally BoringSSL used #defines
that tweaked the expected values. However, 140-3 now requires that the
inputs be changed, not the expected outputs.
Also, the number of tests is going to increase. Since slower platforms
already took too long to compile BoringSSL n times (once for each test
to break), we want something faster too.
Therefore all the known-answer tests (KATs) are changed such that a Go
program can find and replace the input value in order to break them.
Thus we only need to recompile once to disable the integrity test.
The runtime tests still need a #define to break, but that #define is now
put in a header file so that only the module need be recompiled, not
everything as in the previous system.
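A sketch of the pattern, with hypothetical names: the KAT input is a
distinctive constant that the Go program can locate in the compiled
module and overwrite in place:
```
#include <stdint.h>
#include <string.h>

#include <openssl/sha.h>

// "KAT-input-marker": unique bytes the break tool searches for.
static const uint8_t kKATInput[16] = {
    0x4b, 0x41, 0x54, 0x2d, 0x69, 0x6e, 0x70, 0x75,
    0x74, 0x2d, 0x6d, 0x61, 0x72, 0x6b, 0x65, 0x72,
};

static int sha256_kat(const uint8_t expected[SHA256_DIGEST_LENGTH]) {
  uint8_t digest[SHA256_DIGEST_LENGTH];
  SHA256(kKATInput, sizeof(kKATInput), digest);
  // Corrupting kKATInput in the binary makes this comparison fail.
  return memcmp(digest, expected, sizeof(digest)) == 0;
}
```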
Change-Id: Ib621198e6ad02253e29af0ccd978e3c3830ad54c
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51329
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Builds that compile the FIPS stuff separately don't get this header from
other files.
Change-Id: I8a1b30ae360b08d4f4b9f804cd234998889477bc
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51405
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
AS10.20 requires that the self-test for the integrity algorithm pass
before the integrity check itself. IG 10.3.A requires an HMAC self-test
now. Therefore run these tests before the integrity check.
Since we also need the ability to run all self-tests, both SHA
self-tests and the HMAC test are run again when running self-tests.
I'm assuming that they're so fast that it doesn't matter.
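The resulting start-up order, sketched with hypothetical function
names:
```
int kat_sha256(void);
int kat_sha512(void);
int kat_hmac_sha256(void);
int module_integrity_check(void);
int remaining_self_tests(void);

static int fips_power_on_self_tests(void) {
  // AS10.20 / IG 10.3.A: self-test the integrity algorithms first.
  if (!kat_sha256() || !kat_sha512() || !kat_hmac_sha256()) {
    return 0;
  }
  // Only then HMAC the module's own code and compare.
  if (!module_integrity_check()) {
    return 0;
  }
  // Running the full self-tests later repeats the SHA/HMAC KATs; they
  // are cheap enough that the duplication doesn't matter.
  return remaining_self_tests();
}
```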
Change-Id: I6b23b6fd3cb6107edd7420bc8680780719bd41d2
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51328
Reviewed-by: David Benjamin <davidben@google.com>
The word “calculated” is two letters longer than “expected” and it's
nice to line up the outputs.
Change-Id: Idac70e62d98fbe26c430f03f4643ba295e40853d
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51327
Reviewed-by: David Benjamin <davidben@google.com>
The provision of FIPS that allowed the tests to be skipped based on a
flag-file has been removed in 140-3. Therefore we expect to run the fast
KATs on start-up, but to defer the slower ones until the functionality in
question is first used. So this change splits off the fast KATs and
removes support for skipping KATs based on a flag-file.
Change-Id: Ib24cb1739cfef93e4a1349d786a0257ee1083cfb
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51326
Reviewed-by: David Benjamin <davidben@google.com>
This matches our other free functions.
Fixed: 473
Change-Id: Ie147995c2f5b429f78e95cfc9a08ed54181af94e
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/51005
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
140-3 says
> the zeroisation of protected and unprotected SSPs
> shall be performed in the following scenarios:
> ...
> For temporary value(s) generated during the integrity test of the
> module’s software or firmware upon completion of the integrity test.
(IG 9.7.B)
Change-Id: I911f294860bf33b13b2c997fc633c9bda777fc48
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50945
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
GCC's __ARMEL__ and __ARMEB__ defines denote little- and big-endian arm,
respectively. They are not defined on aarch64, which instead uses
__AARCH64EL__ and __AARCH64EB__.
However, OpenSSL's assembly originally used the 32-bit defines on both
platforms, and arm_arch.h even defines __ARMEL__ and __ARMEB__ on aarch64. This is
less portable and can even interfere with other headers, which use
__ARMEL__ to detect little-endian arm. (Our own base.h believes
__ARMEL__ implies 32-bit arm. We just happen to check __AARCH64EL__
first. base.h is probably also always included before arm_arch.h.)
Over time, the aarch64 assembly has switched to the correct defines,
such as in 32bbb62ea634239e7cb91d6450ba23517082bab6. This commit
finishes the job.
(There is an even more official endianness detector, __ARM_BIG_ENDIAN in
the Arm C Language Extensions. But I've stuck with the GCC ones here as
that would be a larger change.)
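The portable detection order then looks like this sketch (macro names
illustrative):
```
#if defined(__AARCH64EL__)
#define TARGET_ARM64_LE 1  // 64-bit little-endian
#elif defined(__AARCH64EB__)
#define TARGET_ARM64_BE 1  // 64-bit big-endian
#elif defined(__ARMEL__)
#define TARGET_ARM32_LE 1  // only now can __ARMEL__ imply 32-bit arm
#elif defined(__ARMEB__)
#define TARGET_ARM32_BE 1
#endif
```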
See also https://github.com/openssl/openssl/pull/17373
Change-Id: Ic04ff85782e6599cdeaeb33d12c2fa8edc882224
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50848
Reviewed-by: Adam Langley <agl@google.com>
These symbols were not marked OPENSSL_EXPORT, so they weren't really
usable externally anyway. They're also very sensitive to various build
configuration toggles, which don't always get reflected into projects
that include our headers. Move them to crypto/internal.h.
Change-Id: I79a1fcf0b24e398d75a9cc6473bae28ec85cb835
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50846
Reviewed-by: Adam Langley <agl@google.com>
This imports the changes to sha512-armv8.pl from
upstream's af0fcf7b4668218b24d9250b95e0b96939ccb4d1.
Tweaks needed:
- Add an explicit .text because we put .LK$BITS in .rodata for XOM
- .LK$bits and code are in separate sections, so use adrp/add instead of
plain adr
- Where glibc needs feature flags to *enable* pthread_rwlock, Apple
interprets _XOPEN_SOURCE as a request to *disable* Apple extensions.
Tighten the condition on the _XOPEN_SOURCE check.
Added support for macOS and Linux, tested manually on an ARM Mac and a
VM, respectively. Fuchsia and Windows do not currently have APIs to
expose this bit, so I've left in TODOs. Benchmarks from an Apple M1 Max:
Before:
Did 4647000 SHA-512 (16 bytes) operations in 1000103us (74.3 MB/sec)
Did 1614000 SHA-512 (256 bytes) operations in 1000379us (413.0 MB/sec)
Did 439000 SHA-512 (1350 bytes) operations in 1001694us (591.6 MB/sec)
Did 76000 SHA-512 (8192 bytes) operations in 1011821us (615.3 MB/sec)
Did 39000 SHA-512 (16384 bytes) operations in 1024311us (623.8 MB/sec)
After:
Did 10369000 SHA-512 (16 bytes) operations in 1000088us (165.9 MB/sec) [+123.1%]
Did 3650000 SHA-512 (256 bytes) operations in 1000079us (934.3 MB/sec) [+126.2%]
Did 1029000 SHA-512 (1350 bytes) operations in 1000521us (1388.4 MB/sec) [+134.7%]
Did 175000 SHA-512 (8192 bytes) operations in 1001874us (1430.9 MB/sec) [+132.5%]
Did 89000 SHA-512 (16384 bytes) operations in 1010314us (1443.3 MB/sec) [+131.4%]
(This doesn't seem to change the overall SHA-256 vs SHA-512 performance
question on ARM, when hashing perf matters. SHA-256 on the same chip
gets up to 2454.6 MB/s.)
In terms of build coverage, for now, we'll have build coverage
everywhere and test coverage on Chromium, which runs this code on macOS
CI. We should request a macOS ARM64 bot for our standalone CI. Longer
term, we need a QEMU-based builder to test various features. QEMU seems
to have pretty good coverage of all this, which will at least give us
Linux.
I haven't added an OPENSSL_STATIC_ARMCAP_SHA512 for now. Instead, we
just look at the standard __ARM_FEATURE_SHA512 define. Strangely, the
corresponding -march tag is not sha512. Neither GCC nor Clang has
-march=armv8-a+sha512. Instead, -march=armv8-a+sha3 implies both
__ARM_FEATURE_SHA3 and __ARM_FEATURE_SHA512! Yet everything else seems
to describe the SHA512 extension as separate from SHA3.
https://developer.arm.com/architectures/system-architectures/software-standards/acle
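Putting the static and dynamic checks together, a sketch (helper name
illustrative; the sysctl string follows Apple's hw.optional
convention):
```
#include <stddef.h>
#include <stdint.h>
#if defined(__APPLE__)
#include <sys/sysctl.h>
#endif

static int sha512_hw_capable(void) {
#if defined(__ARM_FEATURE_SHA512)
  return 1;  // -march=armv8-a+sha3 (not +sha512!) implies this define
#elif defined(__APPLE__)
  int32_t value = 0;
  size_t len = sizeof(value);
  return sysctlbyname("hw.optional.armv8_2_sha512", &value, &len, NULL,
                      0) == 0 &&
         value != 0;
#else
  return 0;  // Linux reads HWCAP_SHA512 via getauxval(AT_HWCAP).
#endif
}
```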
Update-Note: Consumers with a different build setup may need to
limit -D_XOPEN_SOURCE=700 to Linux or non-Apple platforms. Otherwise,
<sys/types.h> won't define some typedef needed by <sys/sysctl.h>. If you
see a build error about u_char, etc., being undefined in some system
header, that is probably the cause.
Change-Id: Ia213d3796b84c71b7966bb68e0aec92e5d7d26f0
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50807
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
This imports 753316232243ccbf86b96c1c51ffcb41651d9ad5,
46f4e1bec51dc96fa275c168752aa34359d9ee51, and
32bbb62ea634239e7cb91d6450ba23517082bab6.
The last commit fixes a detection of big-endian aarch64 in the kernel,
which we do not support at all, but is imported to reduce the upstream
diff. Though it points out a messy part of arm_arch.h: __ARMEL__ and
__ARMEB__ are specific to 32-bit ARM. __AARCH64EB__ and __AARCH64EL__
are the 64-bit ones. But OpenSSL's arm_arch.h defines __ARME[LB]__ for
aarch64 and uses it in perlasm. We should fix the files upstream to
look at the aarch64 ones. (Indeed our own base.h assumes __ARMEL__
implies 32-bit ARM.)
Change-Id: I6c2241e103a97e8c3599cdfa43dcc6f30d4a2581
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50806
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
We currently have two aarch64 SHA-256 implementations: one using
general-purpose registers and one using the SHA-256 extensions.
Upstream's 866e505e0d663158b0fe63a7fb7455eebacc6470 added a NEON
version.
This CL syncs the transforms at the bottom of the file, to avoid
potential mistranslations in future imports. It doesn't change the
output for our current assembly.
This CL skips the NEON implementation itself for now. It only helps
processors without SHA-256 instructions. While Android does not
actually mandate the cryptography extensions on ARMv8, most devices
have them.
Additionally, this file does CPU dispatch in assembly, without taking
advantage of static information. We'd end up shipping both fallback
SHA-256 implementations. This is particularly silly because NEON is
mandatory in ARMv8-A anyway. (Does anyone build us on -R or -M? Probably
not?)
(If we later have a reason to import it, the binary size cost isn't that
significant. Moreover, the NEON fallback is actually slightly smaller
than the non-NEON fallback, so if we move CPU dispatch to C, importing
may even be worthwhile.)
Change-Id: I3c8ca6e77e4e6d1299f975c407cbcf4c9c240523
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50805
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
GCC has a warning that complains about even more type mismatches in
printf. Some of these are a bit messy and will be fixed in separate CLs.
This covers the easy ones.
The .*s stuff is unfortunate, but printf has no size_t-clean string
printer. ALPN protocol lengths are bound by uint8_t, so it doesn't
really matter.
The IPv6 printing one is obnoxious and arguably a false positive. It's
really a C language flaw: all types smaller than int get converted to
int when you do arithmetic. So in the example below, the shift doesn't
overflow, because the arithmetic happens over int, but the overall
result is then stored as an int rather than a uint16_t.
uint8_t a, b;
(a << 8) | b
On the one hand, this fixes a "missing" cast to uint16_t before the
shift. At the same time, the incorrect final type means passing it to
%x, which expects unsigned int. The compiler has forgotten this value
actually fits in uint16_t and flags a warning. Mitigate this by storing
in a uint16_t first.
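A minimal sketch of the mitigation:
```
#include <stdint.h>
#include <stdio.h>

void print_be16(uint8_t a, uint8_t b) {
  // The arithmetic still happens over int, but materializing the result
  // as uint16_t restores the width the compiler "forgot".
  uint16_t v = (uint16_t)((a << 8) | b);
  printf("%x\n", v);  // v promotes to int, representable in unsigned too
}
```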
The story doesn't quite end here. Arguments passed to variadic functions
go through integer promotion[0], so the argument is still passed to
snprintf as an int! But then va_arg allows for a signedness mismatch[1],
provided the value is representable in both types. The combination means
that %x, though actually paired with unsigned, also accept uint8_t and
uint16_t, because those are guaranteed to promote to an int that meets
[1]. GCC recognizes [1] applies here.
(There's also PRIx16, but that's a bit tedious to use and, in glibc, is
defined as plain "x" anyway.)
[0] https://en.cppreference.com/w/c/language/conversion#Default_argument_promotions
[1] https://en.cppreference.com/w/c/variadic/va_arg
Bug: 450
Change-Id: Ic1d41356755a18ab922956dd2e07b560470341f4
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50765
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
OpenSSL 1.1.0 made this structure opaque. I don't think we particularly
need to make it opaque, but external code uses it. Also add
RSA_test_flags.
Change-Id: I136d38e72ec4664c78f4d1720ec691f5760090c1
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50605
Reviewed-by: Adam Langley <agl@google.com>
The non-_ex EVP_CIPHER_CTX Final functions are a bit interesting. Unlike
EVP_DigestFinal(_ex), where the non-_ex version calls EVP_MD_CTX_cleanup
for you, the EVP_CIPHER_CTX ones do not automatically cleanup.
EVP_CipherFinal and EVP_CipherFinal_ex are identical in all releases
where they exist.
This appears to date to OpenSSL 0.9.7:
Prior to OpenSSL 0.9.7, EVP_MD_CTX and EVP_CIPHER_CTX did not use void*
data fields. Instead, they just had a union of context structures for
every algorithm OpenSSL implemented.
EVP_MD_CTX was truly cleanup-less. There were no EVP_MD_CTX_init or
EVP_MD_CTX_cleanup functions at all. EVP_DigestInit filled things in
without reference to the previous state. EVP_DigestFinal didn't cleanup
because there was nothing to cleanup.
EVP_CIPHER_CTX was also a union, but for some reason did include
EVP_CIPHER_CTX_init and EVP_CIPHER_CTX_cleanup. EVP_CIPHER_CTX_init
seemed to be optional: EVP_CipherInit with non-NULL EVP_CIPHER similarly
didn't reference the previous state. EVP_CipherFinal did not call
EVP_CIPHER_CTX_cleanup, but EVP_CIPHER_CTX_cleanup didn't do anything.
It called an optional cleanup hook on the EVP_CIPHER, but as far as I
can tell, no EVP_CIPHER implemented it.
Then OpenSSL 0.9.7 introduced ENGINE. The union didn't work anymore, so
EVP_MD_CTX and EVP_CIPHER_CTX contained void* with allocated
type-specific data. They introduced EVP_MD_CTX_init and
EVP_MD_CTX_cleanup. For (imperfect!) backwards compatibility,
EVP_DigestInit and EVP_DigestFinal transparently called init/cleanup for
you. EVP_DigestInit_ex and EVP_DigestFinal_ex became the more flexible
versions that left init/cleanup to the caller.
EVP_CIPHER_CTX got the same treatment with
EVP_CipherInit/EVP_CipherInit_ex, but *not*
EVP_CipherFinal/EVP_CipherFinal_ex; those two did the same thing as each other. The
history seems to be that 581f1c84940d77451c2592e9fa470893f6c3c3eb
introduced the Final/Final_ex split, with the former doing an
auto-cleanup, then 544a2aea4ba1fad76f0802fb70d92a5a8e6ad85a undid it.
Looks like the motivation is that EVP_CIPHER_CTX objects are often
reused to do multiple operations with a single key. But they missed that
the split functions are now unnecessary.
Amusingly, OpenSSL's documentation incorrectly said that EVP_CipherFinal
cleaned up after the call until it was fixed in
538860a3ce0b9fd142a7f1a62e597cccb74475d3. The fix says that some
releases cleaned up, but there were, as far as I can tell, no actual
releases with that behavior.
I've put the new Final functions in the deprecated section, purely
because there is no sense in recommending two different versions of the
same function to users, and Final_ex seems to be more popular. But there
isn't actually anything wrong with plain Final.
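For reference, a usage sketch against the public API; either Final
variant works, and cleanup is the caller's job either way:
```
#include <stdint.h>

#include <openssl/evp.h>

// |out| must have room for |in_len| plus one block of padding.
static int encrypt_once(const uint8_t key[16], const uint8_t iv[16],
                        const uint8_t *in, size_t in_len, uint8_t *out,
                        size_t *out_len) {
  EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
  int len1 = 0, len2 = 0;
  const int ok =
      ctx != NULL &&
      EVP_EncryptInit_ex(ctx, EVP_aes_128_cbc(), NULL, key, iv) &&
      EVP_EncryptUpdate(ctx, out, &len1, in, (int)in_len) &&
      // _ex or not, Final does not clean up |ctx|.
      EVP_EncryptFinal_ex(ctx, out + len1, &len2);
  EVP_CIPHER_CTX_free(ctx);  // cleanup is always the caller's job
  if (ok) {
    *out_len = (size_t)len1 + (size_t)len2;
  }
  return ok;
}
```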
Change-Id: Ic2bfda48fdcf30f292141add8c5f745348036852
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50485
Reviewed-by: Adam Langley <agl@google.com>
Although the compiler will hopefully optimize it out, this is
technically a VLA. The new Android NDK now warns about this.
Change-Id: Ib9f38dc73c40e90ab61105f29a635c453f1477a1
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50185
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
We have a ton of per-file rotation functions, often with generic names
that do not tell you whether they are uint32_t vs uint64_t, or rotl vs
rotr.
Additionally, (x >> r) | (x << (32 - r)) is UB at r = 0.
(x >> r) | (x << ((-r) & 31)) works for 0 <= r < 32, which is what
cast.c does. GCC and Clang recognize this pattern as a rotate, but MSVC
doesn't. MSVC does, however, provide functions for this.
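A sketch of the UB-free pattern (name illustrative; on MSVC one would
call _rotr instead):
```
#include <stdint.h>

static inline uint32_t rotr_u32(uint32_t x, int r) {
  // For 0 <= r < 32: at r = 0 both shift counts are zero, so neither
  // shift ever uses the undefined count of 32.
  return (x >> r) | (x << ((-r) & 31));
}
```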
We usually rotate by a non-zero constant, which makes this moot, but
rotation comes up often enough that it's worth extracting out. Some
particular changes to call out:
- I've switched sha256.c from rotl to rotr. There was a comment
explaining why it differed from the specification. Now that we have
both functions, it's simpler to just match the specification.
- I've dropped all the inline assembly from sha512.c. Compilers should
be able to recognize rotations in 2021.
Change-Id: Ia1030e8bfe94dad92514ed1c28777447c48b82f9
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/49765
Reviewed-by: Adam Langley <agl@google.com>
There is an obvious bug there: upon entry to 'vpaes_cbc_encrypt' the LR
may get signed (pointer authentication). However, on the 'cbc_abort'
path the LR is not going to be unsigned (authenticated) before 'ret' is
executed.
Found by manual code inspection.
Co-authored-by: Russ Butler <russ.butler@arm.com>
Change-Id: I646cdfaee28db59aafbbd412d4bb6ba022eff15b
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/49605
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
GCC 11.2.1 reportedly warns that CTR_DRBG_init may be passed an
uninitialized personalization buffer. This appears to be a false
positive, because personalization_len will be zero. But it's easy enough
to zero-initialize it, so silence the warning.
Bug: 432
Change-Id: I20f6b74e09f19962e8cae37d45090ff3d1c0215d
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/49245
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
The bulk of RSA_check_key is spent in bn_div_consttime, which is a naive
but constant-time long-division algorithm for the few places that divide
by a secret even divisor: RSA keygen and RSA import. RSA import is
somewhat performance-sensitive, so pick some low-hanging fruit:
The main observation is that, in all but one call site, the bit width of
the divisor is public. That means, for an N-bit divisor, we can skip the
first N-1 iterations of long division because an N-1-bit remainder
cannot exceed the N-bit divisor.
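A uint64_t sketch of the idea (the real code is constant-time over
BIGNUMs and masks the subtraction rather than branching on it):
```
#include <stdint.h>

// num mod div, where div_bits is a public bound with
// 2^(div_bits - 1) <= div < 2^div_bits and 1 <= div_bits <= 64.
static uint64_t mod_pub_divisor(uint64_t num, uint64_t div,
                                int div_bits) {
  int skip = div_bits - 1;  // iterations that provably cannot subtract
  // Load the top div_bits - 1 dividend bits at once; they are still
  // smaller than div, so no subtraction was missed.
  uint64_t rem = skip ? num >> (64 - skip) : 0;
  for (int i = 63 - skip; i >= 0; i--) {
    rem = (rem << 1) | ((num >> i) & 1);
    if (rem >= div) {
      rem -= div;
    }
  }
  return rem;
}
```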
One minor nuisance is bn_lcm_consttime, used in RSA keygen, which has a
case that does *not* have a public bit width. Applying the optimization
there would leak information. I've implemented this as an optional public
lower bound on num_bits(divisor), which all but that call fills in.
Before:
Did 5060 RSA 2048 private key parse operations in 1058526us (4780.2 ops/sec)
Did 1551 RSA 4096 private key parse operations in 1082343us (1433.0 ops/sec)
After:
Did 11532 RSA 2048 private key parse operations in 1084145us (10637.0 ops/sec) [+122.5%]
Did 3542 RSA 4096 private key parse operations in 1036374us (3417.7 ops/sec) [+138.5%]
Bug: b/192484677
Change-Id: I893ebb8886aeb8200a1a365673b56c49774221a2
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/49106
Reviewed-by: Adam Langley <agl@google.com>