Decompose RandenPoolSeedSeq from NonsecureURBGBase.
Adjust how the RandenPoolSeedSeq detects contiguous buffers passed to the generate function. Previously it made incorrect assumptions regarding the contiguous concept, which have been replaced with some type-based tests for a small number of known contiguous random access iterator types, including raw pointers.
PiperOrigin-RevId: 452564114
Change-Id: Idab1df9dd078d8e5c565c7fa7ccb9c0d3d392ad2
We were using `init_type`s for temp values that we would move into slots, but in this case, we need to have actual slots. We use node handles for managing slots outside of nodes.
Also, in btree::copy_or_move_values_in_order, pass the slots from the iterators rather than references to values. This allows for moving from map keys instead of copying for standard layout types.
In the test, fix a couple of ClangTidy warnings from missing includes and calling `new` instead of `make_unique`.
PiperOrigin-RevId: 452062967
Change-Id: I870e89ae1aa5b3cfa62ae6e75b73ffc3d52e731c
Timeouts were once necessary when the SpinLock Unlock used an atomic
store and could therefore have a race and a missed wakeup, however,
the Unlock path now uses an atomic exchange, so the missed wakeup
cannot happen.
Fixes#1179
PiperOrigin-RevId: 452047517
Change-Id: I844944879b51b7f7ddac148e063a376cddd0d05a
This fixes an overload that is ambiguous for some toolchains, because
0U does not always refer to a uint32_t (on some toolchains, uint32_t
is an unsigned long).
PiperOrigin-RevId: 451962182
Change-Id: Id13700817ea3eb6d04e2cc02f20726040eb447fb
Without the change absl-cpp build fails on this week's gcc-13 snapshot as:
/build/abseil-cpp/absl/strings/internal/str_format/extension.h:34:33: error: found ':' in nested-name-specifier, expected '::'
34 | enum class FormatConversionChar : uint8_t;
| ^
| ::
Benchmarks: https://pastebin.com/tZ7dr67W. Works well especially on smaller ranges.
After a week on spending optimizing NEON SIMD where I almost managed to make hash tables work with NEON SIMD without performance hits (still 1 cycle to optimize and I gave up a little), I found an interesting optimization for aarch64 to use cls instruction (count leading sign bits).
The loop has a property that ctrl_ group is not matched against count when the first slot is empty or deleted.
```
void skip_empty_or_deleted() {
while (IsEmptyOrDeleted(*ctrl_)) {
uint32_t shift = Group{ctrl_}.CountLeadingEmptyOrDeleted();
ctrl_ += shift;
slot_ += shift;
}
...
}
```
However, `kEmpty` and `kDeleted` have format of `1xxxxxx0` and `~ctrl & (ctrl >> 7)` always sets the lowest bit to 1.
In naive implementation, it does +1 to start counting zero bits, however, in aarch64 we may start counting one bits immediately. This saves 1 cycle and 5% of iteration performance.
Then it becomes hard to find a supported and sustainable C++ version of it.
`__clsll` is not supported by GCC and was supported only since clang 8, `__builtin_clrsb` is not producing optimal codegen for clang. `__rbit` is not supported by GCC and there is no intrinsic to do that, however, in clang we have `__builtin_bitreverse{32,64}`. For now I decided to enable this only for clang, only if they have appropriate builtins.
PiperOrigin-RevId: 451168570
Change-Id: I7e9256a60aecdc88ced4e6eb15ebc257281b6664
Previously was disabled on iPhone, but still enabled for macOS.
The unscaled cycle clock does not work correctly when run on a VM.
PiperOrigin-RevId: 449876559
Change-Id: I679ade90b43462e8d2794b1a2b32569d59029ed9
Add a new (internal) feature test macro to detect whether the wrappers are no-ops on a given platform.
Note that one-arg __builtin_prefetch(x) is equivalent to __builtin_prefetch(x, 0, 3), per `man BUILTIN_PREFETCH(3)` and gcc docs.
PiperOrigin-RevId: 449508660
Change-Id: I144e750205eec0c956d8dd62bc72e10bdb87c4f7
Both Mutex and CondVar signal PerThreadSem/Waiter after satisfying the wait condition,
as the result the waiting thread may return w/o waiting on the
PerThreadSem/Waiter at all. If the waiting thread then exits, it currently
destroys Waiter object. As the result Waiter::Post can be called on
already destroyed object.
PerThreadSem/Waiter must be type-stable after creation and must not be destroyed.
The futex-based implementation is the only one that is not affected by the bug
since there is effectively nothing to destroy (maybe only UBSan/ASan
could complain about calling methods on a destroyed object).
Here is the problematic sequence of events:
1: void Mutex::Block(PerThreadSynch *s) {
2: while (s->state.load(std::memory_order_acquire) == PerThreadSynch::kQueued) {
3: if (!DecrementSynchSem(this, s, s->waitp->timeout)) {
4: PerThreadSynch *Mutex::Wakeup(PerThreadSynch *w) {
5: ...
6: w->state.store(PerThreadSynch::kAvailable, std::memory_order_release);
7: IncrementSynchSem(this, w);
8: ...
9: }
Consider line 6 is executed, then line 2 observes kAvailable and
line 3 is not called. The thread executing Mutex::Block returns from
the method, acquires the mutex, releases the mutex, exits and destroys
PerThreadSem/Waiter.
Now Mutex::Wakeup resumes and executes line 7 on the destroyed object. Boom!
CondVar uses a similar pattern.
Moreover the semaphore-based Waiter implementation is not even destruction-safe
(the Waiter cannot be used to signal own destruction). So even if Mutex/CondVar
would always pair Waiter::Post with Waiter::Wait before destroying PerThreadSem/Waiter,
it would still be subject to use-after-free bug on the semaphore.
PiperOrigin-RevId: 449159939
Change-Id: I497134fa8b6ce1294a422827c5f0de0e897cea31
CondVar::WaitWithTimeout can live-lock when timeout is racing with Signal/SignalAll
and Signal/SignalAll thread is not scheduled due to priorities, affinity or other
scheduler artifacts. This could lead to stalls of up to tens of seconds in some cases.
PiperOrigin-RevId: 449159670
Change-Id: I64bbd277c1f91964cfba3306ba8a80eeadf85f64
Stop the absl::Cord destructor from running on the constinit cord in
CordTest.ConstinitConstructor. This allows inspecting the cord at any point
while the test is exiting, which is important for the semantics of the test.
PiperOrigin-RevId: 448327386
Change-Id: Icef9faa2b63f1f0ae60b3430dcf6184f5dead885
* Avoid warnings due to deprecation of volatile return types.
Also fix up optional_test.cc due to ABSL_USES_STD_OPTIONAL always being
false in its body.
Bug: chromium:1284275
PiperOrigin-RevId: 447796238
Change-Id: If050206c979c6c08af22e71ff0ea91e7f7932f0c
The old analysis viewed it as birthday attack, which asks how often
there are multiple values in the probe same probe sequence with the
same H2. In their own words, this analysis "breaks down" at around `n
= 12`.
Instead we can answer a simpler question, which is, if a probe
sequence examines `k` objects that are not what we are looking for,
what's the number of calls `==`? The expectation is simply `k/128`.
PiperOrigin-RevId: 446518063
Change-Id: Ie879bd4f6c97979822bc9d550b9e2503b1418c78