Changed Arena representation so that fusing links arenas together instead of blocks.

Previously when fusing, we would concatenate all blocks into a single list that lived in the arena root.  From then on, all arenas would add their blocks to this single unified list.

After this CL, arenas keep their distinct list of blocks even after being fused.  Instead of unifying the block list, fuse now puts the arenas themselves into a list, so all arenas in the fused group can be iterated over at any time.

This design makes it easier to keep each individual arena thread-compatible, because fuse and free are now the only mutating operations that touch state that is shared with the entire group.  Read-only operations like `SpaceAllocated()` also iterate the list of arenas, but in a read-only fashion.

(Note: we need tests for SpaceAllocated(), both single-threaded for correctness and multi-threaded for resilience to crashes and data races).

Performance of fuse regresses by 5-20%.  This is somewhat expected as we are performing more atomic operations during a fuse.

```
name                                           old cpu/op   new cpu/op   delta
BM_ArenaOneAlloc                                 18.4ns ± 6%  18.7ns ± 4%   +2.00%  (p=0.016 n=18+18)
BM_ArenaInitialBlockOneAlloc                     5.50ns ± 4%  6.57ns ± 4%  +19.42%  (p=0.000 n=16+17)
BM_ArenaFuseUnbalanced/2                         59.3ns ±10%  68.7ns ± 4%  +15.85%  (p=0.000 n=19+19)
BM_ArenaFuseUnbalanced/8                          479ns ± 5%   540ns ± 8%  +12.57%  (p=0.000 n=18+19)
BM_ArenaFuseUnbalanced/64                        4.50µs ± 4%  4.93µs ± 8%   +9.59%  (p=0.000 n=17+17)
BM_ArenaFuseUnbalanced/128                       9.24µs ± 3%  9.96µs ± 3%   +7.81%  (p=0.000 n=17+17)
BM_ArenaFuseBalanced/2                           63.3ns ±18%  71.0ns ± 4%  +12.14%  (p=0.000 n=19+18)
BM_ArenaFuseBalanced/8                            484ns ± 9%   543ns ±10%  +12.11%  (p=0.000 n=17+16)
BM_ArenaFuseBalanced/64                          4.50µs ± 6%  4.94µs ± 4%   +9.62%  (p=0.000 n=19+17)
BM_ArenaFuseBalanced/128                         9.20µs ± 4%  9.95µs ± 4%   +8.12%  (p=0.000 n=16+19)
BM_LoadAdsDescriptor_Upb<NoLayout>               5.50ms ± 8%  5.69ms ±17%     ~     (p=0.189 n=18+19)
BM_LoadAdsDescriptor_Upb<WithLayout>             6.10ms ± 5%  6.05ms ± 4%     ~     (p=0.258 n=17+18)
BM_LoadAdsDescriptor_Proto2<NoLayout>            11.9ms ±15%  11.6ms ± 5%     ~     (p=0.589 n=19+16)
BM_LoadAdsDescriptor_Proto2<WithLayout>          11.8ms ± 5%  12.4ms ±17%     ~     (p=0.604 n=16+20)
BM_Parse_Upb_FileDesc<UseArena, Copy>            12.1µs ± 8%  12.1µs ± 4%     ~     (p=1.000 n=18+18)
BM_Parse_Upb_FileDesc<UseArena, Alias>           11.8µs ±17%  11.1µs ± 4%     ~     (p=0.104 n=20+17)
BM_Parse_Upb_FileDesc<InitBlock, Copy>           12.0µs ± 5%  11.9µs ± 4%     ~     (p=0.134 n=18+19)
BM_Parse_Upb_FileDesc<InitBlock, Alias>          10.9µs ± 7%  11.0µs ± 4%     ~     (p=0.195 n=17+18)
BM_Parse_Proto2<FileDesc, NoArena, Copy>         24.2µs ± 4%  24.4µs ± 7%     ~     (p=0.767 n=18+18)
BM_Parse_Proto2<FileDesc, UseArena, Copy>        11.6µs ± 5%  11.6µs ± 4%     ~     (p=0.621 n=18+16)
BM_Parse_Proto2<FileDesc, InitBlock, Copy>       11.3µs ± 3%  11.3µs ± 3%     ~     (p=0.743 n=18+18)
BM_Parse_Proto2<FileDescSV, InitBlock, Alias>    12.7µs ± 8%  12.7µs ± 4%     ~     (p=0.988 n=18+19)
BM_SerializeDescriptor_Proto2                    5.77µs ± 5%  5.71µs ± 5%     ~     (p=0.433 n=17+17)
BM_SerializeDescriptor_Upb                       10.0µs ± 5%  10.1µs ± 7%     ~     (p=0.102 n=19+16)

name                                           old time/op             new time/op             delta
BM_ArenaOneAlloc                                 18.4ns ± 6%             18.8ns ± 4%   +1.98%        (p=0.019 n=18+18)
BM_ArenaInitialBlockOneAlloc                     5.51ns ± 4%             6.58ns ± 4%  +19.42%        (p=0.000 n=16+17)
BM_ArenaFuseUnbalanced/2                         59.5ns ±10%             68.9ns ± 4%  +15.83%        (p=0.000 n=19+19)
BM_ArenaFuseUnbalanced/8                          481ns ± 5%              541ns ± 8%  +12.54%        (p=0.000 n=18+19)
BM_ArenaFuseUnbalanced/64                        4.51µs ± 4%             4.94µs ± 8%   +9.53%        (p=0.000 n=17+17)
BM_ArenaFuseUnbalanced/128                       9.26µs ± 3%             9.98µs ± 3%   +7.79%        (p=0.000 n=17+17)
BM_ArenaFuseBalanced/2                           63.5ns ±19%             71.1ns ± 3%  +12.07%        (p=0.000 n=19+18)
BM_ArenaFuseBalanced/8                            485ns ± 9%              551ns ±20%  +13.47%        (p=0.000 n=17+17)
BM_ArenaFuseBalanced/64                          4.51µs ± 6%             4.95µs ± 4%   +9.62%        (p=0.000 n=19+17)
BM_ArenaFuseBalanced/128                         9.22µs ± 4%             9.97µs ± 4%   +8.12%        (p=0.000 n=16+19)
BM_LoadAdsDescriptor_Upb<NoLayout>               5.52ms ± 8%             5.72ms ±18%     ~           (p=0.199 n=18+19)
BM_LoadAdsDescriptor_Upb<WithLayout>             6.12ms ± 5%             6.07ms ± 4%     ~           (p=0.273 n=17+18)
BM_LoadAdsDescriptor_Proto2<NoLayout>            11.9ms ±15%             11.6ms ± 5%     ~           (p=0.589 n=19+16)
BM_LoadAdsDescriptor_Proto2<WithLayout>          11.9ms ± 5%             12.5ms ±18%     ~           (p=0.582 n=16+20)
BM_Parse_Upb_FileDesc<UseArena, Copy>            12.2µs ± 8%             12.1µs ± 3%     ~           (p=0.963 n=18+18)
BM_Parse_Upb_FileDesc<UseArena, Alias>           11.8µs ±17%             11.1µs ± 4%     ~           (p=0.104 n=20+17)
BM_Parse_Upb_FileDesc<InitBlock, Copy>           12.0µs ± 5%             11.9µs ± 4%     ~           (p=0.126 n=18+19)
BM_Parse_Upb_FileDesc<InitBlock, Alias>          11.0µs ± 6%             11.1µs ± 4%     ~           (p=0.195 n=17+18)
BM_Parse_Proto2<FileDesc, NoArena, Copy>         24.3µs ± 4%             24.5µs ± 6%     ~           (p=0.743 n=18+18)
BM_Parse_Proto2<FileDesc, UseArena, Copy>        11.7µs ± 5%             11.6µs ± 4%     ~           (p=0.574 n=18+16)
BM_Parse_Proto2<FileDesc, InitBlock, Copy>       11.3µs ± 3%             11.3µs ± 3%     ~           (p=0.743 n=18+18)
BM_Parse_Proto2<FileDescSV, InitBlock, Alias>    12.7µs ± 8%             12.7µs ± 4%     ~           (p=0.988 n=18+19)
BM_SerializeDescriptor_Proto2                    5.78µs ± 5%             5.73µs ± 5%     ~           (p=0.357 n=17+17)
BM_SerializeDescriptor_Upb                       10.0µs ± 5%             10.1µs ± 7%     ~           (p=0.117 n=19+16)

name                                           old allocs/op           new allocs/op           delta
BM_ArenaOneAlloc                                   1.00 ± 0%               1.00 ± 0%     ~     (all samples are equal)
BM_ArenaFuseUnbalanced/2                           2.00 ± 0%               2.00 ± 0%     ~     (all samples are equal)
BM_ArenaFuseUnbalanced/8                           8.00 ± 0%               8.00 ± 0%     ~     (all samples are equal)
BM_ArenaFuseUnbalanced/64                          64.0 ± 0%               64.0 ± 0%     ~     (all samples are equal)
BM_ArenaFuseUnbalanced/128                          128 ± 0%                128 ± 0%     ~     (all samples are equal)
BM_ArenaFuseBalanced/2                             2.00 ± 0%               2.00 ± 0%     ~     (all samples are equal)
BM_ArenaFuseBalanced/8                             8.00 ± 0%               8.00 ± 0%     ~     (all samples are equal)
BM_ArenaFuseBalanced/64                            64.0 ± 0%               64.0 ± 0%     ~     (all samples are equal)
BM_ArenaFuseBalanced/128                            128 ± 0%                128 ± 0%     ~     (all samples are equal)
BM_LoadAdsDescriptor_Upb<NoLayout>                6.08k ± 0%              6.05k ± 0%   -0.54%        (p=0.000 n=20+20)
BM_LoadAdsDescriptor_Upb<WithLayout>              6.39k ± 0%              6.36k ± 0%   -0.55%        (p=0.000 n=20+20)
BM_LoadAdsDescriptor_Proto2<NoLayout>             83.4k ± 0%              83.4k ± 0%     ~           (p=0.800 n=20+20)
BM_LoadAdsDescriptor_Proto2<WithLayout>           84.4k ± 0%              84.4k ± 0%     ~           (p=0.752 n=20+20)
BM_Parse_Upb_FileDesc<UseArena, Copy>              7.00 ± 0%               7.00 ± 0%     ~     (all samples are equal)
BM_Parse_Upb_FileDesc<UseArena, Alias>             7.00 ± 0%               7.00 ± 0%     ~     (all samples are equal)
BM_Parse_Proto2<FileDesc, NoArena, Copy>            765 ± 0%                765 ± 0%     ~     (all samples are equal)
BM_Parse_Proto2<FileDesc, UseArena, Copy>          8.00 ± 0%               8.00 ± 0%     ~     (all samples are equal)

name                                           old peak-mem(Bytes)/op  new peak-mem(Bytes)/op  delta
BM_ArenaOneAlloc                                    336 ± 0%                336 ± 0%     ~     (all samples are equal)
BM_ArenaFuseUnbalanced/2                            672 ± 0%                672 ± 0%     ~     (all samples are equal)
BM_ArenaFuseUnbalanced/8                          2.69k ± 0%              2.69k ± 0%     ~     (all samples are equal)
BM_ArenaFuseUnbalanced/64                         21.5k ± 0%              21.5k ± 0%     ~     (all samples are equal)
BM_ArenaFuseUnbalanced/128                        43.0k ± 0%              43.0k ± 0%     ~     (all samples are equal)
BM_ArenaFuseBalanced/2                              672 ± 0%                672 ± 0%     ~     (all samples are equal)
BM_ArenaFuseBalanced/8                            2.69k ± 0%              2.69k ± 0%     ~     (all samples are equal)
BM_ArenaFuseBalanced/64                           21.5k ± 0%              21.5k ± 0%     ~     (all samples are equal)
BM_ArenaFuseBalanced/128                          43.0k ± 0%              43.0k ± 0%     ~     (all samples are equal)
BM_LoadAdsDescriptor_Upb<NoLayout>                9.89M ± 0%              9.95M ± 0%   +0.65%        (p=0.000 n=20+20)
BM_LoadAdsDescriptor_Upb<WithLayout>              9.95M ± 0%             10.02M ± 0%   +0.70%        (p=0.000 n=20+20)
BM_LoadAdsDescriptor_Proto2<NoLayout>             6.62M ± 0%              6.62M ± 0%     ~           (p=0.800 n=20+20)
BM_LoadAdsDescriptor_Proto2<WithLayout>           6.66M ± 0%              6.66M ± 0%     ~           (p=0.752 n=20+20)
BM_Parse_Upb_FileDesc<UseArena, Copy>             36.5k ± 0%              36.5k ± 0%     ~     (all samples are equal)
BM_Parse_Upb_FileDesc<UseArena, Alias>            36.5k ± 0%              36.5k ± 0%     ~     (all samples are equal)
BM_Parse_Proto2<FileDesc, NoArena, Copy>          35.8k ± 0%              35.8k ± 0%     ~     (all samples are equal)
BM_Parse_Proto2<FileDesc, UseArena, Copy>         65.3k ± 0%              65.3k ± 0%     ~     (all samples are equal)

name                                           old speed               new speed               delta
BM_LoadAdsDescriptor_Upb<NoLayout>              138MB/s ± 7%            132MB/s ±15%     ~           (p=0.126 n=18+20)
BM_LoadAdsDescriptor_Upb<WithLayout>            124MB/s ± 5%            125MB/s ± 4%     ~           (p=0.258 n=17+18)
BM_LoadAdsDescriptor_Proto2<NoLayout>          63.9MB/s ±13%           65.2MB/s ± 5%     ~           (p=0.589 n=19+16)
BM_LoadAdsDescriptor_Proto2<WithLayout>        64.0MB/s ± 5%           61.3MB/s ±15%     ~           (p=0.604 n=16+20)
BM_Parse_Upb_FileDesc<UseArena, Copy>           620MB/s ± 8%            622MB/s ± 4%     ~           (p=1.000 n=18+18)
BM_Parse_Upb_FileDesc<UseArena, Alias>          644MB/s ±15%            679MB/s ± 4%     ~           (p=0.104 n=20+17)
BM_Parse_Upb_FileDesc<InitBlock, Copy>          627MB/s ± 4%            633MB/s ± 4%     ~           (p=0.134 n=18+19)
BM_Parse_Upb_FileDesc<InitBlock, Alias>         688MB/s ± 6%            682MB/s ± 4%     ~           (p=0.195 n=17+18)
BM_Parse_Proto2<FileDesc, NoArena, Copy>        310MB/s ± 4%            309MB/s ± 6%     ~           (p=0.767 n=18+18)
BM_Parse_Proto2<FileDesc, UseArena, Copy>       646MB/s ± 4%            649MB/s ± 4%     ~           (p=0.621 n=18+16)
BM_Parse_Proto2<FileDesc, InitBlock, Copy>      666MB/s ± 3%            666MB/s ± 3%     ~           (p=0.743 n=18+18)
BM_Parse_Proto2<FileDescSV, InitBlock, Alias>   592MB/s ± 7%            593MB/s ± 4%     ~           (p=0.988 n=18+19)
BM_SerializeDescriptor_Proto2                  1.30GB/s ± 5%           1.32GB/s ± 5%     ~           (p=0.433 n=17+17)
BM_SerializeDescriptor_Upb                      756MB/s ± 5%            745MB/s ± 6%     ~           (p=0.102 n=19+16)
```

PiperOrigin-RevId: 520144430
pull/13171/head
Joshua Haberman 2 years ago committed by Copybara-Service
parent bc7795702e
commit a73852630f
  1. 121
      upb/mem/arena.c
  2. 16
      upb/mem/arena_internal.h
  3. 25
      upb/port/atomic.h
  4. 4
      upb/port/def.inc
  5. 44
      upb/wire/decode.c

@ -45,7 +45,8 @@ static uintptr_t upb_cleanup_metadata(uint32_t* cleanup,
}
struct _upb_MemBlock {
struct _upb_MemBlock* next;
// Atomic only for the benefit of SpaceAllocated().
UPB_ATOMIC(_upb_MemBlock*) next;
uint32_t size;
uint32_t cleanups;
// Data follows.
@ -98,11 +99,13 @@ size_t upb_Arena_SpaceAllocated(upb_Arena* arena) {
arena = _upb_Arena_FindRoot(arena);
size_t memsize = 0;
_upb_MemBlock* block = arena->freelist;
while (block) {
memsize += sizeof(_upb_MemBlock) + block->size;
block = block->next;
while (arena != NULL) {
_upb_MemBlock* block = arena->blocks;
while (block != NULL) {
memsize += sizeof(_upb_MemBlock) + block->size;
block = upb_Atomic_Load(&block->next, memory_order_relaxed);
}
arena = upb_Atomic_Load(&arena->next, memory_order_relaxed);
}
return memsize;
@ -119,17 +122,14 @@ uint32_t upb_Arena_DebugRefCount(upb_Arena* a) {
return _upb_Arena_RefCountFromTagged(poc);
}
static void upb_Arena_addblock(upb_Arena* a, upb_Arena* root, void* ptr,
size_t size) {
static void upb_Arena_AddBlock(upb_Arena* a, void* ptr, size_t size) {
_upb_MemBlock* block = ptr;
/* The block is for arena |a|, but should appear in the freelist of |root|. */
block->next = root->freelist;
// Insert into linked list.
upb_Atomic_Init(&block->next, a->blocks);
block->size = (uint32_t)size;
block->cleanups = 0;
root->freelist = block;
a->last_size = block->size;
if (!root->freelist_tail) root->freelist_tail = block;
upb_Atomic_Store(&a->blocks, block, memory_order_relaxed);
a->head.ptr = UPB_PTR_AT(block, memblock_reserve, char);
a->head.end = UPB_PTR_AT(block, size, char);
@ -139,18 +139,20 @@ static void upb_Arena_addblock(upb_Arena* a, upb_Arena* root, void* ptr,
UPB_POISON_MEMORY_REGION(a->head.ptr, a->head.end - a->head.ptr);
}
static bool upb_Arena_Allocblock(upb_Arena* a, size_t size) {
upb_Arena* root = _upb_Arena_FindRoot(a);
size_t block_size = UPB_MAX(size, a->last_size * 2) + memblock_reserve;
_upb_MemBlock* block = upb_malloc(root->block_alloc, block_size);
static bool upb_Arena_AllocBlock(upb_Arena* a, size_t size) {
if (!a->block_alloc) return false;
_upb_MemBlock* last_block = upb_Atomic_Load(&a->blocks, memory_order_relaxed);
size_t last_size = last_block != NULL ? last_block->size : 128;
size_t block_size = UPB_MAX(size, last_size * 2) + memblock_reserve;
_upb_MemBlock* block = upb_malloc(a->block_alloc, block_size);
if (!block) return false;
upb_Arena_addblock(a, root, block, block_size);
upb_Arena_AddBlock(a, block, block_size);
return true;
}
void* _upb_Arena_SlowMalloc(upb_Arena* a, size_t size) {
if (!upb_Arena_Allocblock(a, size)) return NULL; /* Out of memory. */
if (!upb_Arena_AllocBlock(a, size)) return NULL; /* Out of memory. */
UPB_ASSERT(_upb_ArenaHas(a) >= size);
return upb_Arena_Malloc(a, size);
}
@ -172,11 +174,12 @@ static upb_Arena* arena_initslow(void* mem, size_t n, upb_alloc* alloc) {
a->block_alloc = alloc;
upb_Atomic_Init(&a->parent_or_count, _upb_Arena_TaggedFromRefcount(1));
a->freelist = NULL;
a->freelist_tail = NULL;
upb_Atomic_Init(&a->next, NULL);
upb_Atomic_Init(&a->tail, a);
upb_Atomic_Init(&a->blocks, NULL);
a->cleanup_metadata = upb_cleanup_metadata(NULL, false);
upb_Arena_addblock(a, a, mem, n);
upb_Arena_AddBlock(a, mem, n);
return a;
}
@ -202,37 +205,44 @@ upb_Arena* upb_Arena_Init(void* mem, size_t n, upb_alloc* alloc) {
a = UPB_PTR_AT(mem, n - sizeof(*a), upb_Arena);
a->block_alloc = alloc;
upb_Atomic_Init(&a->parent_or_count, _upb_Arena_TaggedFromRefcount(1));
a->last_size = UPB_MAX(128, n);
upb_Atomic_Init(&a->next, NULL);
upb_Atomic_Init(&a->tail, a);
upb_Atomic_Init(&a->blocks, NULL);
a->block_alloc = alloc;
a->head.ptr = mem;
a->head.end = UPB_PTR_AT(mem, n - sizeof(*a), char);
a->freelist = NULL;
a->freelist_tail = NULL;
a->cleanup_metadata = upb_cleanup_metadata(NULL, true);
return a;
}
static void arena_dofree(upb_Arena* a) {
_upb_MemBlock* block = a->freelist;
UPB_ASSERT(_upb_Arena_RefCountFromTagged(a->parent_or_count) == 1);
while (block) {
/* Load first since we are deleting block. */
_upb_MemBlock* next = block->next;
if (block->cleanups > 0) {
cleanup_ent* end = UPB_PTR_AT(block, block->size, void);
cleanup_ent* ptr = end - block->cleanups;
for (; ptr < end; ptr++) {
ptr->cleanup(ptr->ud);
while (a != NULL) {
// Load first since arena itself is likely from one of its blocks.
upb_Arena* next_arena =
(upb_Arena*)upb_Atomic_Load(&a->next, memory_order_acquire);
_upb_MemBlock* block = upb_Atomic_Load(&a->blocks, memory_order_relaxed);
while (block != NULL) {
/* Load first since we are deleting block. */
_upb_MemBlock* next_block =
upb_Atomic_Load(&block->next, memory_order_relaxed);
if (block->cleanups > 0) {
cleanup_ent* end = UPB_PTR_AT(block, block->size, void);
cleanup_ent* ptr = end - block->cleanups;
for (; ptr < end; ptr++) {
ptr->cleanup(ptr->ud);
}
}
}
upb_free(a->block_alloc, block);
block = next;
upb_free(a->block_alloc, block);
block = next_block;
}
a = next_arena;
}
}
@ -270,7 +280,7 @@ bool upb_Arena_AddCleanup(upb_Arena* a, void* ud, upb_CleanupFunc* func) {
uint32_t* cleanups = upb_cleanup_pointer(a->cleanup_metadata);
if (!cleanups || _upb_ArenaHas(a) < sizeof(cleanup_ent)) {
if (!upb_Arena_Allocblock(a, 128)) return false; /* Out of memory. */
if (!upb_Arena_AllocBlock(a, 128)) return false; /* Out of memory. */
UPB_ASSERT(_upb_ArenaHas(a) >= sizeof(cleanup_ent));
cleanups = upb_cleanup_pointer(a->cleanup_metadata);
}
@ -339,14 +349,6 @@ bool upb_Arena_Fuse(upb_Arena* a1, upb_Arena* a2) {
r2_poc = tmp_poc;
}
// r1 takes over r2's freelist, this must happen before we update
// refcounts since the refcount carriers the memory dependencies.
if (r2->freelist_tail) {
UPB_ASSERT(r2->freelist_tail->next == NULL);
r2->freelist_tail->next = r1->freelist;
r1->freelist = r2->freelist;
}
// The moment we install `r1` as the parent for `r2` all racing frees may
// immediately begin decrementing `r1`'s refcount. So we must install all the
// refcounts that we know about first to prevent a premature unref to zero.
@ -366,5 +368,26 @@ bool upb_Arena_Fuse(upb_Arena* a1, upb_Arena* a2) {
upb_Atomic_Sub(&r1->parent_or_count, ((uintptr_t)delta_refcount) << 1,
memory_order_release);
}
// Now append r2's linked list of arenas to r1's.
upb_Arena* r2_tail = upb_Atomic_Load(&r2->tail, memory_order_relaxed);
upb_Arena* r1_tail = upb_Atomic_Load(&r1->tail, memory_order_relaxed);
upb_Arena* r1_next = upb_Atomic_Load(&r1_tail->next, memory_order_relaxed);
while (r1_next != NULL) {
// r1->tail was stale. This can happen, but tail should always converge on
// the true tail.
r1_tail = r1_next;
r1_next = upb_Atomic_Load(&r1_tail->next, memory_order_relaxed);
}
upb_Arena* old_next =
upb_Atomic_Exchange(&r1_tail->next, r2, memory_order_relaxed);
// Once fuse/fuse races are allowed, it will need to be a CAS instead that
// handles this mismatch gracefully.
UPB_ASSERT(old_next == NULL);
upb_Atomic_Store(&r1->tail, r2_tail, memory_order_relaxed);
return true;
}

@ -45,7 +45,6 @@ struct upb_Arena {
/* Allocator to allocate arena blocks. We are responsible for freeing these
* when we are destroyed. */
upb_alloc* block_alloc;
uint32_t last_size;
/* When multiple arenas are fused together, each arena points to a parent
* arena (root points to itself). The root tracks how many live arenas
@ -55,10 +54,19 @@ struct upb_Arena {
* 0: pointer to parent
* 1: count, left shifted by one
*/
UPB_ATOMIC uintptr_t parent_or_count;
UPB_ATOMIC(uintptr_t) parent_or_count;
/* Linked list of blocks to free/cleanup. */
_upb_MemBlock *freelist, *freelist_tail;
// All nodes that are fused together are in a singly-linked list.
UPB_ATOMIC(upb_Arena*) next; // NULL at end of list.
// The last element of the linked list. This is present only as an
// optimization, so that we do not have to iterate over all members for every
// fuse. Only significant for an arena root. In other cases it is ignored.
UPB_ATOMIC(upb_Arena*) tail; // == self when no other list members.
// Linked list of blocks to free/cleanup. Atomic only for the benefit of
// upb_Arena_SpaceAllocated().
UPB_ATOMIC(_upb_MemBlock*) blocks;
};
UPB_INLINE bool _upb_Arena_IsTaggedRefcount(uintptr_t parent_or_count) {

@ -66,9 +66,7 @@ UPB_INLINE uintptr_t _upb_NonAtomic_ExchangeU(uintptr_t* addr, uintptr_t val) {
return ret;
}
// `addr` should logically be `void**`, but `void*` allows for more convenient
// implicit conversions.
UPB_INLINE void* _upb_NonAtomic_ExchangeP(void* addr, void* val) {
UPB_INLINE void* _upb_NonAtomic_ExchangeP(upb_Arena** addr, upb_Arena* val) {
void* ret;
memcpy(&ret, addr, sizeof(val));
memcpy(addr, &val, sizeof(val));
@ -78,7 +76,7 @@ UPB_INLINE void* _upb_NonAtomic_ExchangeP(void* addr, void* val) {
#define upb_Atomic_Exchange(addr, val, order) \
_Generic((val), \
uintptr_t: _upb_NonAtomic_ExchangeU, \
void*: _upb_NonAtomic_ExchangeP)(addr, val)
upb_Arena *: _upb_NonAtomic_ExchangeP)(addr, val)
UPB_INLINE bool _upb_NonAtomic_CompareExchangeStrongU(uintptr_t* addr,
uintptr_t* expected,
@ -92,11 +90,9 @@ UPB_INLINE bool _upb_NonAtomic_CompareExchangeStrongU(uintptr_t* addr,
}
}
// `addr` and `expected` should logically be `void**`, but `void*` allows for
// more convenient implicit conversions.
UPB_INLINE bool _upb_NonAtomic_CompareExchangeStrongP(void* addr,
void* expected,
void* desired) {
UPB_INLINE bool _upb_NonAtomic_CompareExchangeStrongP(upb_Arena** addr,
upb_Arena** expected,
upb_Arena* desired) {
if (memcmp(addr, expected, sizeof(desired)) == 0) {
memcpy(addr, &desired, sizeof(desired));
return true;
@ -106,11 +102,12 @@ UPB_INLINE bool _upb_NonAtomic_CompareExchangeStrongP(void* addr,
}
}
#define upb_Atomic_CompareExchangeStrong(addr, expected, desired, \
success_order, failure_order) \
_Generic((desired), \
uintptr_t: _upb_NonAtomic_CompareExchangeStrongU, \
void*: _upb_NonAtomic_CompareExchangeStrongP)(addr, expected, desired)
#define upb_Atomic_CompareExchangeStrong(addr, expected, desired, \
success_order, failure_order) \
_Generic((desired), \
uintptr_t: _upb_NonAtomic_CompareExchangeStrongU, \
upb_Arena *: _upb_NonAtomic_CompareExchangeStrongP)(addr, expected, \
desired)
#endif

@ -192,9 +192,9 @@
#ifdef __GNUC__
#define UPB_USE_C11_ATOMICS
#define UPB_ATOMIC _Atomic
#define UPB_ATOMIC(T) _Atomic(T)
#else
#define UPB_ATOMIC
#define UPB_ATOMIC(T) T
#endif
/* UPB_PTRADD(ptr, ofs): add pointer while avoiding "NULL + 0" UB */

@ -1282,9 +1282,11 @@ static upb_DecodeStatus upb_Decoder_Decode(upb_Decoder* const decoder,
UPB_ASSERT(decoder->status != kUpb_DecodeStatus_Ok);
}
arena->head.ptr = decoder->arena.head.ptr;
arena->head.end = decoder->arena.head.end;
_upb_MemBlock* blocks =
upb_Atomic_Load(&decoder->arena.blocks, memory_order_relaxed);
arena->head = decoder->arena.head;
arena->cleanup_metadata = decoder->arena.cleanup_metadata;
upb_Atomic_Store(&arena->blocks, blocks, memory_order_relaxed);
return decoder->status;
}
@ -1292,26 +1294,32 @@ upb_DecodeStatus upb_Decode(const char* buf, size_t size, void* msg,
const upb_MiniTable* l,
const upb_ExtensionRegistry* extreg, int options,
upb_Arena* arena) {
upb_Decoder state;
upb_Decoder decoder;
unsigned depth = (unsigned)options >> 16;
upb_EpsCopyInputStream_Init(&state.input, &buf, size,
upb_EpsCopyInputStream_Init(&decoder.input, &buf, size,
options & kUpb_DecodeOption_AliasString);
state.extreg = extreg;
state.unknown = NULL;
state.depth = depth ? depth : 64;
state.end_group = DECODE_NOGROUP;
state.options = (uint16_t)options;
state.missing_required = false;
state.arena.head = arena->head;
state.arena.last_size = arena->last_size;
state.arena.cleanup_metadata = arena->cleanup_metadata;
upb_Atomic_Init(&state.arena.parent_or_count,
_upb_Arena_TaggedFromPointer(arena));
state.status = kUpb_DecodeStatus_Ok;
return upb_Decoder_Decode(&state, buf, msg, l, arena);
decoder.extreg = extreg;
decoder.unknown = NULL;
decoder.depth = depth ? depth : 64;
decoder.end_group = DECODE_NOGROUP;
decoder.options = (uint16_t)options;
decoder.missing_required = false;
decoder.status = kUpb_DecodeStatus_Ok;
// Violating the encapsulation of the arena for performance reasons.
// This is a temporary arena that we swap into and swap out of when we are
// done. The temporary arena only needs to be able to handle allocation,
// not fuse or free, so it does not need many of the members to be initialized
// (particularly parent_or_count).
_upb_MemBlock* blocks = upb_Atomic_Load(&arena->blocks, memory_order_relaxed);
decoder.arena.head = arena->head;
decoder.arena.block_alloc = arena->block_alloc;
decoder.arena.cleanup_metadata = arena->cleanup_metadata;
upb_Atomic_Init(&decoder.arena.blocks, blocks);
return upb_Decoder_Decode(&decoder, buf, msg, l, arena);
}
#undef OP_FIXPCK_LG2

Loading…
Cancel
Save