We no longer need to define CRYPTO_MUTEX in public headers. This
simplifies a pile of things. First, we can now use pthread_rwlock_t
without any fuss, rather than trying to guess the size on glibc.
As a result, CRYPTO_MUTEX and CRYPTO_STATIC_MUTEX can be merged into one
type. We can almost do this to CRYPTO_refcount_t too. BIO is the one
straggler remaining.
Fixed: 325
Change-Id: Ie93c9f553c0f02ce594b959c041b00fc15ba51d2
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/60611
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Bob Beck <bbe@google.com>
If a process calls fork(), then the child process never forks again, the
child may wish to call RAND_enable_fork_unsafe_buffering(). However,
doing so exposes a bug: we assume that, if the flag is set, we don't
need to worry about fork-safety. But it is possible that the PRNG state
was cloned from another process which does not work.
Concretely, consider a zygote process, e.g. Chromium's. A zygote process
would retain fork-safety, but pass along its PRNG state to each of its
children. If the children never fork, they might disable fork-safety,
hitting this bug. (Chromium does not call this API. This is just a
hypothetical scenario.)
Fix this by reseeding whenever the fork-safety bit changes. This fix
does not strictly depend on the atomics work, but it causes us to
unconditionally sample rand_fork_unsafe_buffering_enabled(). This no
longer causes contention because it's just another atomic load.
This only affects systems without MADV_WIPEONFORK and without fast
RDRAND. If RDRAND is fast, we're always fork-safe and MADV_WIPEONFORK
allows us to efficiently detect forks.
Cq-Include-Trybots: luci.boringssl.try:linux_clang_rel_tsan
Change-Id: I6d0c471c62c951254faf85420a7dc3f4a9d65ee0
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59850
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
We don't take write locks in the PRNG, steady state, but we do take some
read locks: computing fork generation, reading the fork-unsafe buffering
flag, and a FIPS-only artifact of some global state clearing mess. That
last one is completely useless, but it's a consequence of FIPS's
understanding of process exit being comically inconsistent with reality.
Taking read locks is, in principle, parallel, but the cacheline write
causes some contention, even in newer glibcs with faster read locks. Fix
these:
- Use atomic reads to check the fork generation. We only need to lock
when we observe a fork.
- Replace the fork-unsafe buffering flag with an atomic altogether.
- Split state_clear_all_lock into a per-rand_thread_state lock. We still
need a read lock, but a completely uncontended one until process exit.
With many threads, this gives a significant perf boost.
x86_64, non-FIPS, Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz, 30 threads:
Before:
Did 45131875 RNG (16 bytes) operations in 300039649us (150419.7 ops/sec): 2.4 MB/s
Did 44089000 RNG (32 bytes) operations in 300053237us (146937.3 ops/sec): 4.7 MB/s
Did 43328000 RNG (256 bytes) operations in 300058423us (144398.5 ops/sec): 37.0 MB/s
Did 45857000 RNG (1350 bytes) operations in 300095943us (152807.8 ops/sec): 206.3 MB/s
Did 43249000 RNG (8192 bytes) operations in 300102698us (144114.0 ops/sec): 1180.6 MB/s
After:
Did 296204000 RNG (16 bytes) operations in 300009524us (987315.3 ops/sec): 15.8 MB/s
Did 311347000 RNG (32 bytes) operations in 300014396us (1037773.5 ops/sec): 33.2 MB/s
Did 295104000 RNG (256 bytes) operations in 300012657us (983638.5 ops/sec): 251.8 MB/s
Did 255721000 RNG (1350 bytes) operations in 300016481us (852356.5 ops/sec): 1150.7 MB/s
Did 103339000 RNG (8192 bytes) operations in 300040059us (344417.3 ops/sec): 2821.5 MB/s
(Smaller PRNG draws are more impacted because they spend less time in the
DRBG. But they're also more likely because you rarely need to pull 8K of
data out at once.)
x86_64, FIPS, Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz, 30 threads:
Before:
Did 29060000 RNG (16 bytes) operations in 300081190us (96840.5 ops/sec): 1.5 MB/s
Did 31882000 RNG (32 bytes) operations in 300118031us (106231.5 ops/sec): 3.4 MB/s
Did 30925000 RNG (256 bytes) operations in 300113646us (103044.3 ops/sec): 26.4 MB/s
Did 31969000 RNG (1350 bytes) operations in 300096688us (106529.0 ops/sec): 143.8 MB/s
Did 33434000 RNG (8192 bytes) operations in 300093240us (111412.0 ops/sec): 912.7 MB/s
After:
Did 299013000 RNG (16 bytes) operations in 300012167us (996669.6 ops/sec): 15.9 MB/s
Did 289788000 RNG (32 bytes) operations in 300014611us (965913.0 ops/sec): 30.9 MB/s
Did 298699000 RNG (256 bytes) operations in 300013443us (995618.7 ops/sec): 254.9 MB/s
Did 247061000 RNG (1350 bytes) operations in 300018215us (823486.7 ops/sec): 1111.7 MB/s
Did 100479000 RNG (8192 bytes) operations in 300037708us (334887.9 ops/sec): 2743.4 MB/s
On an M1 Pro, it's mostly a wash by default (fewer threads because this chip has fewer cores)
aarch64, M1 Pro, 8 threads:
Before:
Did 23218000 RNG (16 bytes) operations in 80009131us (290191.9 ops/sec): 4.6 MB/s
Did 23021000 RNG (256 bytes) operations in 80007544us (287735.4 ops/sec): 73.7 MB/s
Did 22853000 RNG (1350 bytes) operations in 80013184us (285615.4 ops/sec): 385.6 MB/s
Did 25407000 RNG (8192 bytes) operations in 80008371us (317554.3 ops/sec): 2601.4 MB/s
Did 22128000 RNG (16384 bytes) operations in 80013269us (276554.1 ops/sec): 4531.1 MB/s
After:
Did 23303000 RNG (16 bytes) operations in 80011433us (291245.9 ops/sec): 4.7 MB/s
Did 23072000 RNG (256 bytes) operations in 80008755us (288368.4 ops/sec): 73.8 MB/s
Did 22807000 RNG (1350 bytes) operations in 80013355us (285039.9 ops/sec): 384.8 MB/s
Did 23759000 RNG (8192 bytes) operations in 80010212us (296949.6 ops/sec): 2432.6 MB/s
Did 23193000 RNG (16384 bytes) operations in 80011537us (289870.7 ops/sec): 4749.2 MB/s
This is likely because, without RDRAND or MADV_WIPEONFORK, we draw from
the OS on every call. We're likely bottlenecked by getentropy, whether
it's some internal synchronization or syscall overherad. With
fork-unsafe buffering enabled, this change shows even more significant
wins on the M1 Pro.
aarch64, fork-unsafe buffering, M1 Pro, 8 threads:
Before:
Did 25727000 RNG (16 bytes) operations in 80010579us (321545.0 ops/sec): 5.1 MB/s
Did 25776000 RNG (32 bytes) operations in 80008587us (322165.4 ops/sec): 10.3 MB/s
Did 25780000 RNG (256 bytes) operations in 80006127us (322225.3 ops/sec): 82.5 MB/s
Did 33171250 RNG (1350 bytes) operations in 80002532us (414627.5 ops/sec): 559.7 MB/s
Did 54784000 RNG (8192 bytes) operations in 80005706us (684751.2 ops/sec): 5609.5 MB/s
After:
Did 573826000 RNG (16 bytes) operations in 80000668us (7172765.1 ops/sec): 114.8 MB/s
Did 571329000 RNG (32 bytes) operations in 80000423us (7141574.7 ops/sec): 228.5 MB/s
Did 435043750 RNG (256 bytes) operations in 80000214us (5438032.3 ops/sec): 1392.1 MB/s
Did 229536000 RNG (1350 bytes) operations in 80001888us (2869132.3 ops/sec): 3873.3 MB/s
Did 57253000 RNG (8192 bytes) operations in 80004974us (715618.0 ops/sec): 5862.3 MB/s
Note that, on hardware with RDRAND, the read lock in
rand_fork_unsafe_buffering_enabled() doesn't do much. But without
RDRAND, we hit that on every RAND_bytes call. More importantly, the
subsequent CL will fix a bug that will require us to hit it more
frequently.
I've removed the volatile on g_fork_detect_addr because I think we
didn't need it and this avoids thinking about the interaction between
volatile and atomics. The pointer is passed into madvise, so the
compiler knows the pointer escapes. For it to be invalid, the compiler
would need to go out of its way to model madvise as not remembering the
pointer, which would be incorrect of it for MADV_WIPEONFORK.
Bug: 570
Cq-Include-Trybots: luci.boringssl.try:linux_clang_rel_tsan
Change-Id: Ie6977acd1b8e7639aaa419cf6f4f5f0645bde9d1
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/59849
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
This reverts commit 8ce0e1c14e.
The original commit didn't work on Android because:
a) urandom_test didn't handle the fact that Android requires
getrandom() and will never fall back to /dev/urandom.
b) Android may open files in /dev/__properties__ which
confused urandom_test.
The original change is patchset 1 so the differences build on that.
Change-Id: Ib840ec20d60cb28d126d3d09271b18fbd9ec1371
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/53705
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Auto-Submit: Adam Langley <agl@google.com>
This reverts commit 4259ae8198.
Some Android builders perhaps lack getrandom support.
Change-Id: Ic7537c07dacb31a54adb453ddd5f82a789089eaf
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/53625
Auto-Submit: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
When seeding a DRBG for the first time we currently make two reads: one
to start the CRNGT and a second to read the actual seed. These reads can
be merged to save I/O.
Change-Id: I2a83edf7f3c8b9d6cebcde02195845be9fde19b2
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/52526
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
This flag is currently set if DRBG entropy is obtained from RDRAND. It
indicates that we should add kernel entropy when seeding the DRBG. But
this might be true for methods other than RDRAND in the future so this
change renames it accordingly.
Change-Id: I91826178a806e3c6dadebbb844358a7a12e0b09b
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/52525
Reviewed-by: David Benjamin <davidben@google.com>
These symbols were not marked OPENSSL_EXPORT, so they weren't really
usable externally anyway. They're also very sensitive to various build
configuration toggles, which don't always get reflected into projects
that include our headers. Move them to crypto/internal.h.
Change-Id: I79a1fcf0b24e398d75a9cc6473bae28ec85cb835
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/50846
Reviewed-by: Adam Langley <agl@google.com>
Trusty wants to seed from a different RNG than the one that supplies
per-draw entropy. This is a no-op change unless you're substituting in
your own implementations of these functions.
To see that it's a no-op in urandom.c, note that it only changes the
|seed| argument to |fill_with_entropy|. That causes the value of
|extra_getrandom_flags_for_seed_bss_get| to be ORed into the flags,
but that value will always be zero unless it's an Android FIPS build.
Change-Id: Ic8d954df3074559cbf1bfee1ae91a4a2b7e14d9d
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/48485
Reviewed-by: David Benjamin <davidben@google.com>
This improves compatibility with cryptography.io. cryptography.io
doesn't actually care what we return, since the code won't run, but
feigning success seems better than failure. If some application does try
to run this function and checks, returning an error will probably crash
it.
Change-Id: I7a8164753a2f1a7b31dbeb10c7030c5e5fea2bc9
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/46004
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
Rather than the FIPS module actively collecting entropy from the CPU or
OS, this change configures Android FIPS to passively receive entropy.
See FIPS IG 7.14 section two.
Change-Id: Ibfc5c5042e560718474b89970199d35b67c21296
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/44305
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>