_upb_Arena_FastMalloc() has been directly inlined
Arena::allocator() has been removed
mem_block is now _upb_MemBlock
some calls to upb_Arena_Alloc() (which is going away) have been changed to use upb_Arena_Malloc() instead
PiperOrigin-RevId: 489874377
Ref: https://github.com/protocolbuffers/protobuf/pull/10291
Ruby types defined though native extensions should register
a function that report their memory footprint in bytes.
This feature is used by various memory profiling tools.
some headers were not including port_def.inc
some headers were not declaring extern "C"
some headers were backing out of the above in the wrong order
PiperOrigin-RevId: 457391878
Lots of changes but it's all just moving things around.
Backward-compatible stub #include's have been provided for now.
upb_Arena/upb_Status have been split out from upb/upb.?
upb_Array/upb_Map/upb_MessageValue have been split out from upb/collections.?
upb_ExtensionRegistry has been split out from upb/msg.?
upb/decode_internal.h is now upb/internal/decode.h
upb/mini_table_accessors_internal.h is now upb/internal/mini_table_accessors.h
upb/table_internal.h is now upb/internal/table.h
upb/upb_internal.h is now upb/internal/upb.h
PiperOrigin-RevId: 456297617
An enum MiniDescriptor simply encodes a set of valid `int32_t` values, so that the protobuf parser can test whether a given enum value is known or not.
The format implemented here is novel and needs to be documented. In short, the format is:
1. base92 values 0-31: 5-bit mask indicating presence or absence of the next five enum values.
2. base92 values 60-91: varint indicating skip over a region of enum values.
Negative enum values are encoded as their `uint32_t` equivalent.
PiperOrigin-RevId: 442892799
Unfortunately a few of the Clang warnings did not have easy fixes:
../../../../ext/google/protobuf_c/ruby-upb.c: In function ‘fastdecode_err’:
../../../../ext/google/protobuf_c/ruby-upb.c:353:13: warning: function might be candidate for attribute ‘noreturn’ [-Wsuggest-attribute=noreturn]
353 | const char *fastdecode_err(upb_decstate *d) {
| ^~~~~~~~~~~~~~
../../../../ext/google/protobuf_c/ruby-upb.c: In function ‘_upb_decode’:
../../../../ext/google/protobuf_c/ruby-upb.c:867:30: warning: argument ‘buf’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
867 | bool _upb_decode(const char *buf, size_t size, void *msg,
I even tried to suppress the first error, but it still shows up.
There was a bug in our arena code where we assumed that
sizeof(upb_array) would be a multiple of 8. On i386 it was
not, and this was causing memory corruption on 32-bit builds.
* Added -Wextra and -Wshorten-64-to-32 and fixed resulting errors.
* Disable -Wshorten-32-to-64 since Kokoro is missing Clang.
* Fixed -Wextra warnings for gcc.
* Reordered UPB_UNUSED() to come after declarations.
* Added another -pedantic fix and log CC version.
* Fix compile error and conditionally run use_bazel.sh.
* Moved set -e after use_bazel.sh.
* Fixed typo in conditional.
* WIP.
* WIP.
* Tests are passing.
* Recover some perf: LIKELY doesn't propagate through functions. :(
* Added some more benchmarks.
* Simplify & optimize upb_arena_realloc().
* Only add owned blocks to the freelist.
* More optimization/simplification.
* Re-fixed the bug.
* Revert unintentional changes to parser.rl.
* Revert Lua changes for now.
* Revert the arena fuse changes for now.
* Added last_size to the arena representation.
* Re-applied Lua changes.
* Implemented upb_arena_fuse().
* Fix the compile by re-ordering statements.
* Improve comments.
* WIP.
* WIP.
* Tests are passing.
* Recover some perf: LIKELY doesn't propagate through functions. :(
* Added some more benchmarks.
* Simplify & optimize upb_arena_realloc().
* Only add owned blocks to the freelist.
* More optimization/simplification.
* Re-fixed the bug.
* Revert unintentional changes to parser.rl.
* Revert Lua changes for now.
* Revert the arena fuse changes for now.
* Added last_size to the arena representation.
* Fixed compile errors.
* Fixed compile error and changed benchmarks to do one allocation.
This makes both the C (.h) and C++ (.hpp) files read nicer
and keeps the core of upb C-only.
Existing users of the C++ wrappers will have to add manual
#includes of the .hpp files.
New code is smaller (in both source size and compiled size) and faster.
# Speed
The decoder speeds up on all machines I tested, though the amount of speedup varies. I was only able to test Intel CPUs.
### Linux Desktop
```
CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
OS: Linux
name old time/op new time/op delta
CreateArena 4.72ns ± 0% 4.93ns ± 0% +4.47% (p=0.000 n=11+11)
ParseDescriptor 12.4µs ± 1% 9.1µs ± 1% -26.65% (p=0.000 n=11+11)
```
### Mac Laptop
```
CPU: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
OS: macOS
name old time/op new time/op delta
CreateArena 5.33ns ± 3% 5.58ns ± 2% +4.69% (p=0.000 n=12+12)
ParseDescriptor 15.0µs ± 2% 11.9µs ± 2% -20.20% (p=0.000 n=12+12)
```
### Linux Workstation
```
CPU: Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
OS: Linux
name old time/op new time/op delta
CreateArena 5.29ns ± 0% 5.52ns ± 0% +4.37% (p=0.000 n=10+12)
ParseDescriptor 18.6µs ± 0% 16.4µs ± 0% -11.54% (p=0.000 n=12+12)
```
# Size
A few source files grow marginally because of some arena functionality moved inline. But `upb/decode.c` shrinks by 30% on Linux:
```
VM SIZE
--------------
+2.1% +283 upb/json_decode.c
+24% +205 upb/msg.c
+8.4% +115 upb/upb.c
+0.9% +28 upb/reflection.c
[ = ] 0 upb/def.c
[ = ] 0 upb/encode.c
[ = ] 0 upb/json_encode.c
[ = ] 0 upb/table.c
-30.3% -1.51Ki upb/decode.c
-0.7% -738 TOTAL
```