Eric Salo
111249d085
continue preparing def.c for the big split:
...
- opt_default is now kUpbDefOptDefault
- upb_AddDefCtx is now upb_DefBuilder
- shortdefname() is now _upb_DefBuilder_FullToShort()
- pack_def() is now _upb_DefBuilder_Pack()
- unpack_def() is now _upb_DefBuilder_Unpack()
- UPB_ASSERT() checks on def struct size moved up in the call chain
- remove/expand CHK_OOM
PiperOrigin-RevId: 471862679
2 years ago
Eric Salo
b987ef6249
use _upb_DefBuilder instead of _upb_AddDefCtx
...
(Because it is a better name and we can't properly bikeshed it without changing
it a few times first.)
Also removed it as an arg from a function that doesn't actually need it.
PiperOrigin-RevId: 470978322
2 years ago
Eric Salo
02ef81247a
rename some symtab functions to follow the upb style guide:
...
- check_ident() is now _upb_AddDefCtx_CheckIdent()
- makefullname() is now _upb_AddDefCtx_MakeFullName()
- symtab_add() is now _upb_AddDefCtx_Add()
- symtab_alloc() is now _upb_AddDefCtx_Alloc()
- symtab_errf() is now _upb_AddDefCtx_Errf()
- symtab_oomerr() is now _upb_AddDefCtx_OomErr()
PiperOrigin-RevId: 470806376
2 years ago
Eric Salo
250321e63b
more misc tweaks to def.c:
...
- renamed symtab_addctx as upb_AddDefCtx
- simplified _upb_DefPool_AddFile() slightly
- replaced upb_Status_setoom() with kOutOfMemory
- deleted all references to UPB_DEFTYPE_LAYOUT and UPB_DEFTYPE_FILE
PiperOrigin-RevId: 470592649
2 years ago
Joshua Haberman
15b2402144
Removed unused `_upb_DefPool_registerlayout()` function.
...
This function was introduced in https://github.com/protocolbuffers/upb/pull/426 but it appears it was never used. I am not sure what the purpose was, but in any case it is not needed.
With this function removed, we no longer need to tag pointers for the DefPool "files" table.
PiperOrigin-RevId: 470567000
2 years ago
Eric Salo
daac8dce8f
add more constructors to def.c
...
PiperOrigin-RevId: 470566093
2 years ago
Eric Salo
11aa037bfd
add some more accessor calls to def.c
...
PiperOrigin-RevId: 470404251
2 years ago
Protobuf Team Bot
52c9b98692
Resolve field name/accessor name conflicts.
...
PiperOrigin-RevId: 470335824
2 years ago
Eric Salo
5b46a55a46
try to be better about using accessors in def.c
...
There are many places within def.c where a function which implements a method
for struct A ends up directly accessing fields within struct B even though
accessor functions for these fields are defined. So, this is a first pass at
trying to clean that up a bit. Yes, we are adding function calls to a lot of
code paths by doing this but in the unlikely event that this adds unacceptable
overhead we can deal with it then.
PiperOrigin-RevId: 470321129
2 years ago
Eric Salo
1135746e42
start replumbing def.c
...
Give the def types their own array allocators and also implement some simple array constructors for the enum and enum value defs.
PiperOrigin-RevId: 470042521
2 years ago
Protobuf Team Bot
ce32d9d68f
Fix code generation for infinity default value on float/double fields.
...
PiperOrigin-RevId: 469843901
2 years ago
Eric Salo
f3316e2d7d
remove upb_String from the public tokenizer api
...
upb_String is a hack which exists because the original C++ tokenizer got to
assume the existence of C++ strings, so at least for now the C tokenizer needs
a rough equivalent. But this should be a purely internal implementation detail,
not part of the visible surface.
PiperOrigin-RevId: 469814074
2 years ago
Eric Salo
0013c936ef
add upb_Status to the tokenizer
...
PiperOrigin-RevId: 469721241
2 years ago
Eric Salo
c67021f84a
split out the json string-to-int functions for general use
...
PiperOrigin-RevId: 469509635
2 years ago
Eric Salo
33114209dc
simplify the tokenizer
...
- remove previous token from the public api
- remove upb_Token type
PiperOrigin-RevId: 469308543
2 years ago
Eric Salo
922a858e5c
clean up tokenizer options and defaults
...
- Disallow multiline strings.
- Disallow a letter immediately following a number without intervening whitespace.
- Replace distinct bool option flags with a single options int.
PiperOrigin-RevId: 467829817
2 years ago
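As a rough illustration of the "single options int" change in 922a858e5c, separate bool flags can be folded into one bitmask. The option names and helper below are hypothetical and are not taken from the upb tokenizer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical option bits, one bit per former bool flag.
 * These names are illustrative only, not upb's. */
enum {
  kTokOpt_AllowFAfterFloat = 1 << 0,
  kTokOpt_CommentsAsTokens = 1 << 1
};

/* Returns true if every bit of `flag` is set in `options`. */
static bool tok_has_option(int options, int flag) {
  assert(flag != 0);
  return (options & flag) == flag;
}
```

A caller would pass something like `kTokOpt_AllowFAfterFloat | kTokOpt_CommentsAsTokens` as one argument instead of two separate bools.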
Eric Salo
6861966501
first stab at a Tokenizer api
...
These functions are not yet part of the upb build but this is a good chunk of work so let's snapshot it now.
PiperOrigin-RevId: 467733791
2 years ago
Protobuf Team Bot
0c6531378d
Merge GetEnum into GetInt32. Rename SetEnum to SetEnumProto2 to be clear that upb only treats Proto2 enum as enum. Proto3 enums should use SetInt32.
...
PiperOrigin-RevId: 467000685
2 years ago
Eric Salo
3f4f7ab079
properly format extension names in text_encode()
...
extension message names are now enclosed within square brackets
PiperOrigin-RevId: 466499355
2 years ago
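A minimal sketch of the bracketing rule described in 3f4f7ab079, assuming a hypothetical helper that is handed the field's name and an is-extension flag. It is not the code inside text_encode().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: writes the text-format label for a field into buf.
 * Extension names are wrapped in square brackets; regular field names are
 * written unchanged. Returns the snprintf() result. */
static int format_field_label(char* buf, size_t size, const char* name,
                              bool is_extension) {
  assert(buf != NULL && name != NULL);
  if (is_extension) {
    return snprintf(buf, size, "[%s]", name);
  }
  return snprintf(buf, size, "%s", name);
}
```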
Protobuf Team Bot
e09d6fcb6d
Update mini table API comment
...
PiperOrigin-RevId: 463868386
2 years ago
Protobuf Team Bot
f034bba2ed
fixed formatting and parsing of negative durations between -1s and 0s
...
PiperOrigin-RevId: 462142321
2 years ago
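The corner case in f034bba2ed can be sketched as follows. For a duration between -1s and 0s the seconds part is 0, so the sign has to come from the nanos part. The function below is a hypothetical illustration, not upb's encoder, and for simplicity it always prints nine fractional digits.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats a duration as "<sec>.<9-digit nanos>s".
 * Assumes seconds and nanos never have opposite signs and |nanos| < 1e9.
 * Returns the snprintf() result. */
static int format_duration(char* buf, size_t size, int64_t seconds,
                           int32_t nanos) {
  assert(nanos > -1000000000 && nanos < 1000000000);
  assert(!(seconds > 0 && nanos < 0) && !(seconds < 0 && nanos > 0));
  /* Take the sign from either part, so that seconds == 0 with negative
   * nanos still prints a leading '-'. */
  const char* sign = (seconds < 0 || nanos < 0) ? "-" : "";
  uint64_t abs_sec = seconds < 0 ? 0 - (uint64_t)seconds : (uint64_t)seconds;
  uint32_t abs_nanos = nanos < 0 ? (uint32_t)(-nanos) : (uint32_t)nanos;
  return snprintf(buf, size, "%s%" PRIu64 ".%09" PRIu32 "s", sign, abs_sec,
                  abs_nanos);
}
```

A formatter that printed the signed seconds value directly would lose the sign of a duration such as seconds = 0, nanos = -500000000.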
Joshua Haberman
fcb5ef37f7
Fixed a bug in MiniTable construction for extensions. #fuzzing
...
We were failing to assign the f->presence field, which resulted in a read of uninitialized memory.
PiperOrigin-RevId: 462138061
2 years ago
Joshua Haberman
ececc21624
Fixed bug when parsing an unknown value in a proto2 enum extension. #fuzzing
...
Proto2 enum parsing is the only case where we have to look at the wire value (not merely the tag) to decide whether the field is known or unknown. If the value is unknown, we need to put the value in the Unknown Fields, but for an extension we no longer have easy access to the message, because for extensions we replace the `msg` pointer with a pointer to the extension. The bug occurred when we were treating the fake `upb_Message*` (which was actually a pointer to an extension) as a real `upb_Message*` that can have unknown fields.
This CL fixes the problem by preserving the true message pointer in `d->unknown_msg` when we are parsing an extension.
This also required fixing a bug in MiniTable building when fasttables are enabled. We need to set the table_mask to `-1` to disable fasttable parsing, not `0`.
This CL also appears to speed up upb parsing by roughly 8% (the BM_Parse_Upb_FileDesc rows below), for reasons that are not yet understood. Ideally we should be tracking parsing performance better over time, as it is possible this is merely regaining performance that was lost at a different time:
```
benchy --reference=srcfs third_party/upb/benchmarks:benchmark
10 / 10 [=================================================================================================================] 100.00% 2m32s
(Generated by http://go/benchy . Settings: --runs 5 --reference "srcfs")
name old cpu/op new cpu/op delta
BM_ArenaOneAlloc 23.9ns ± 6% 23.7ns ± 4% ~ (p=0.180 n=53+51)
BM_ArenaInitialBlockOneAlloc 7.62ns ± 4% 7.70ns ± 5% +0.99% (p=0.024 n=59+60)
BM_LoadAdsDescriptor_Upb<NoLayout> 6.60ms ±10% 6.57ms ± 8% ~ (p=0.607 n=47+54)
BM_LoadAdsDescriptor_Upb<WithLayout> 6.92ms ± 5% 6.88ms ± 8% ~ (p=0.257 n=54+54)
BM_LoadAdsDescriptor_Proto2<NoLayout> 14.2ms ± 8% 14.0ms ± 7% -1.38% (p=0.025 n=58+59)
BM_LoadAdsDescriptor_Proto2<WithLayout> 14.3ms ± 8% 14.2ms ± 8% -1.16% (p=0.031 n=58+57)
BM_Parse_Upb_FileDesc<UseArena, Copy> 15.9µs ± 4% 14.6µs ± 4% -7.85% (p=0.000 n=57+59)
BM_Parse_Upb_FileDesc<UseArena, Alias> 14.5µs ± 4% 13.3µs ± 5% -8.50% (p=0.000 n=57+60)
BM_Parse_Upb_FileDesc<InitBlock, Copy> 15.7µs ± 4% 14.4µs ± 5% -7.99% (p=0.000 n=59+60)
BM_Parse_Upb_FileDesc<InitBlock, Alias> 14.2µs ± 5% 13.0µs ± 4% -8.56% (p=0.000 n=57+58)
BM_Parse_Proto2<FileDesc, NoArena, Copy> 26.3µs ± 4% 26.2µs ± 4% ~ (p=0.195 n=55+53)
BM_Parse_Proto2<FileDesc, UseArena, Copy> 13.3µs ± 5% 13.2µs ± 4% ~ (p=0.085 n=59+59)
BM_Parse_Proto2<FileDesc, InitBlock, Copy> 12.9µs ± 4% 12.8µs ± 3% -0.66% (p=0.023 n=60+58)
BM_Parse_Proto2<FileDescSV, InitBlock, Alias> 10.9µs ± 6% 10.9µs ± 4% ~ (p=0.063 n=59+58)
BM_SerializeDescriptor_Proto2 7.57µs ± 6% 7.62µs ± 6% ~ (p=0.147 n=57+58)
BM_SerializeDescriptor_Upb 12.8µs ± 4% 12.8µs ± 4% ~ (p=0.163 n=59+56)
name old time/op new time/op delta
BM_ArenaOneAlloc 23.9ns ± 5% 23.7ns ± 4% ~ (p=0.172 n=53+51)
BM_ArenaInitialBlockOneAlloc 7.62ns ± 4% 7.70ns ± 5% +1.02% (p=0.017 n=59+60)
BM_LoadAdsDescriptor_Upb<NoLayout> 6.60ms ±10% 6.58ms ± 8% ~ (p=0.727 n=47+55)
BM_LoadAdsDescriptor_Upb<WithLayout> 6.92ms ± 5% 6.88ms ± 8% ~ (p=0.260 n=54+54)
BM_LoadAdsDescriptor_Proto2<NoLayout> 14.2ms ± 7% 14.0ms ± 7% -1.40% (p=0.019 n=58+59)
BM_LoadAdsDescriptor_Proto2<WithLayout> 14.3ms ± 8% 14.2ms ± 8% -1.13% (p=0.037 n=58+57)
BM_Parse_Upb_FileDesc<UseArena, Copy> 15.9µs ± 4% 14.6µs ± 3% -7.88% (p=0.000 n=57+59)
BM_Parse_Upb_FileDesc<UseArena, Alias> 14.5µs ± 4% 13.3µs ± 5% -8.46% (p=0.000 n=57+60)
BM_Parse_Upb_FileDesc<InitBlock, Copy> 15.7µs ± 4% 14.4µs ± 5% -7.99% (p=0.000 n=59+60)
BM_Parse_Upb_FileDesc<InitBlock, Alias> 14.2µs ± 5% 13.0µs ± 4% -8.56% (p=0.000 n=57+58)
BM_Parse_Proto2<FileDesc, NoArena, Copy> 26.3µs ± 4% 26.2µs ± 4% ~ (p=0.224 n=55+53)
BM_Parse_Proto2<FileDesc, UseArena, Copy> 13.3µs ± 5% 13.2µs ± 4% ~ (p=0.098 n=59+59)
BM_Parse_Proto2<FileDesc, InitBlock, Copy> 12.9µs ± 4% 12.8µs ± 3% -0.68% (p=0.015 n=60+58)
BM_Parse_Proto2<FileDescSV, InitBlock, Alias> 10.9µs ± 6% 10.9µs ± 4% ~ (p=0.052 n=59+58)
BM_SerializeDescriptor_Proto2 7.56µs ± 6% 7.62µs ± 6% ~ (p=0.111 n=58+58)
BM_SerializeDescriptor_Upb 12.8µs ± 4% 12.8µs ± 4% ~ (p=0.241 n=56+56)
name old allocs/op new allocs/op delta
BM_ArenaOneAlloc 1.00 ± 0% 1.00 ± 0% ~ (all samples are equal)
BM_ArenaInitialBlockOneAlloc 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_LoadAdsDescriptor_Upb<NoLayout> 5.98k ± 0% 5.98k ± 0% ~ (all samples are equal)
BM_LoadAdsDescriptor_Upb<WithLayout> 5.98k ± 0% 5.98k ± 0% ~ (all samples are equal)
BM_LoadAdsDescriptor_Proto2<NoLayout> 80.9k ± 0% 80.9k ± 0% ~ (all samples are equal)
BM_LoadAdsDescriptor_Proto2<WithLayout> 82.1k ± 0% 82.1k ± 0% ~ (all samples are equal)
BM_Parse_Upb_FileDesc<UseArena, Copy> 7.00 ± 0% 7.00 ± 0% ~ (all samples are equal)
BM_Parse_Upb_FileDesc<UseArena, Alias> 7.00 ± 0% 7.00 ± 0% ~ (all samples are equal)
BM_Parse_Upb_FileDesc<InitBlock, Copy> 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_Parse_Upb_FileDesc<InitBlock, Alias> 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_Parse_Proto2<FileDesc, NoArena, Copy> 765 ± 0% 765 ± 0% ~ (all samples are equal)
BM_Parse_Proto2<FileDesc, UseArena, Copy> 9.00 ± 0% 9.00 ± 0% ~ (all samples are equal)
BM_Parse_Proto2<FileDesc, InitBlock, Copy> 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_Parse_Proto2<FileDescSV, InitBlock, Alias> 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_SerializeDescriptor_Proto2 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_SerializeDescriptor_Upb 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
name old peak-mem(Bytes)/op new peak-mem(Bytes)/op delta
BM_ArenaOneAlloc 344 ± 0% 344 ± 0% ~ (all samples are equal)
BM_ArenaInitialBlockOneAlloc 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_LoadAdsDescriptor_Upb<NoLayout> 9.60M ± 0% 9.60M ± 0% ~ (all samples are equal)
BM_LoadAdsDescriptor_Upb<WithLayout> 9.68M ± 0% 9.68M ± 0% ~ (all samples are equal)
BM_LoadAdsDescriptor_Proto2<NoLayout> 6.41M ± 0% 6.41M ± 0% ~ (all samples are equal)
BM_LoadAdsDescriptor_Proto2<WithLayout> 6.44M ± 0% 6.44M ± 0% ~ (all samples are equal)
BM_Parse_Upb_FileDesc<UseArena, Copy> 36.5k ± 0% 36.5k ± 0% ~ (all samples are equal)
BM_Parse_Upb_FileDesc<UseArena, Alias> 36.5k ± 0% 36.5k ± 0% ~ (all samples are equal)
BM_Parse_Upb_FileDesc<InitBlock, Copy> 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_Parse_Upb_FileDesc<InitBlock, Alias> 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_Parse_Proto2<FileDesc, NoArena, Copy> 35.8k ± 0% 35.8k ± 0% ~ (all samples are equal)
BM_Parse_Proto2<FileDesc, UseArena, Copy> 40.7k ± 0% 40.7k ± 0% ~ (all samples are equal)
BM_Parse_Proto2<FileDesc, InitBlock, Copy> 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_Parse_Proto2<FileDescSV, InitBlock, Alias> 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_SerializeDescriptor_Proto2 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
BM_SerializeDescriptor_Upb 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
name old speed new speed delta
BM_LoadAdsDescriptor_Upb<NoLayout> 113MB/s ± 9% 113MB/s ± 8% ~ (p=0.712 n=47+55)
BM_LoadAdsDescriptor_Upb<WithLayout> 107MB/s ± 8% 108MB/s ± 8% ~ (p=0.200 n=55+54)
BM_LoadAdsDescriptor_Proto2<NoLayout> 52.5MB/s ± 8% 53.3MB/s ± 7% +1.51% (p=0.018 n=59+59)
BM_LoadAdsDescriptor_Proto2<WithLayout> 51.9MB/s ± 7% 52.4MB/s ± 8% +1.01% (p=0.050 n=58+58)
BM_Parse_Upb_FileDesc<UseArena, Copy> 473MB/s ± 4% 514MB/s ± 4% +8.52% (p=0.000 n=57+59)
BM_Parse_Upb_FileDesc<UseArena, Alias> 518MB/s ± 4% 566MB/s ± 5% +9.30% (p=0.000 n=57+60)
BM_Parse_Upb_FileDesc<InitBlock, Copy> 480MB/s ± 4% 521MB/s ± 5% +8.69% (p=0.000 n=59+60)
BM_Parse_Upb_FileDesc<InitBlock, Alias> 528MB/s ± 4% 578MB/s ± 4% +9.36% (p=0.000 n=57+58)
BM_Parse_Proto2<FileDesc, NoArena, Copy> 286MB/s ± 4% 287MB/s ± 4% ~ (p=0.195 n=55+53)
BM_Parse_Proto2<FileDesc, UseArena, Copy> 566MB/s ± 5% 570MB/s ± 4% ~ (p=0.085 n=59+59)
BM_Parse_Proto2<FileDesc, InitBlock, Copy> 583MB/s ± 5% 587MB/s ± 3% +0.64% (p=0.023 n=60+58)
BM_Parse_Proto2<FileDescSV, InitBlock, Alias> 688MB/s ± 6% 693MB/s ± 4% ~ (p=0.063 n=59+58)
BM_SerializeDescriptor_Proto2 995MB/s ± 6% 988MB/s ± 5% ~ (p=0.147 n=57+58)
BM_SerializeDescriptor_Upb 586MB/s ± 4% 589MB/s ± 4% ~ (p=0.163 n=59+56)
```
PiperOrigin-RevId: 462022073
2 years ago
Protobuf Team Bot
48d6764490
rolling back to fix some tests
...
PiperOrigin-RevId: 461922243
2 years ago
Protobuf Team Bot
470d6322c9
fixed formatting and parsing of negative durations between -1s and 0s
...
PiperOrigin-RevId: 461663523
2 years ago
Eric Salo
24f567b64a
clean up the public api for mini descriptors
...
push the gory details down into internal/ where they belong
PiperOrigin-RevId: 461264404
2 years ago
Eric Salo
28bc460dc9
create _upb_EnumDef_MiniDescriptor()
...
delete upb_EnumDef_IsSorted()
We now have a simple internal function for returning a mini descriptor directly from an enum def.
PiperOrigin-RevId: 461208352
2 years ago
Joshua Haberman
b0ed763a41
Fixed some corner cases around empty packages in upb.
...
1. upb now tolerates a present-but-empty package name in a `FileDescriptorProto`.
2. upb now always returns a non-NULL string for `upb_FileDef_Package()`. If the package was empty or missing, the returned string is zero-length. This better matches the proto2 behavior: `proto2::FileDescriptor::package()` always returns a package string, which may be empty.
PiperOrigin-RevId: 460831797
2 years ago
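The non-NULL contract described in b0ed763a41 can be illustrated with a small accessor. The struct and function names below are hypothetical stand-ins, not the real `upb_FileDef` layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a file def; package may be NULL if the
 * descriptor had no package. */
typedef struct {
  const char* package;
} FileInfo;

/* Always returns a valid string: zero-length when the package is missing. */
static const char* FileInfo_Package(const FileInfo* f) {
  assert(f != NULL);
  return f->package ? f->package : "";
}
```

Callers can then pass the result straight to string functions without a NULL check.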
Mike Kruskal
17b6451684
Bumping protobuf dependency to newer commit
...
PiperOrigin-RevId: 460811319
2 years ago
Eric Salo
eb66ab601f
update upb_Encode() to use arena api internally
...
PiperOrigin-RevId: 460758866
2 years ago
Eric Salo
9b3e87307d
upb: upb_EnumDefs are now built using mini descriptors
...
Added upb_EnumDef_IsSorted() as an optimization for presorted enum protos
PiperOrigin-RevId: 460564949
2 years ago
Eric Salo
3c295eccf9
clean up the mini descriptor code:
...
- Correctly set the modifier field for MessageDefs
- Add error handling
- Respect the provided arena, stop hardwiring the global alloc
- upb_MiniDescriptor_EncodeExtension() is now upb_MiniDescriptor_EncodeField()
- Make the plugin code a lot easier to read
PiperOrigin-RevId: 460482002
2 years ago
Eric Salo
410143b265
split out some unicode logic from the json decoder
...
Also fixed a bug in the json decoder which caused it to break on a code point value of exactly 0x10ffff
PiperOrigin-RevId: 459856813
2 years ago
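The 0x10ffff boundary mentioned in 410143b265 is the largest valid Unicode code point, so a range check must use `<=` rather than `<`. The encoder below is a generic sketch of that check, not the code that was split out of the json decoder. It does not reject surrogate values, which a JSON decoder would normally combine into pairs beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encodes code point cp as UTF-8 into out (room for 4 bytes required).
 * Returns the number of bytes written, or 0 if cp is above 0x10ffff. */
static int utf8_encode(uint32_t cp, char* out) {
  assert(out != NULL);
  if (cp <= 0x7f) {
    out[0] = (char)cp;
    return 1;
  }
  if (cp <= 0x7ff) {
    out[0] = (char)(0xc0 | (cp >> 6));
    out[1] = (char)(0x80 | (cp & 0x3f));
    return 2;
  }
  if (cp <= 0xffff) {
    out[0] = (char)(0xe0 | (cp >> 12));
    out[1] = (char)(0x80 | ((cp >> 6) & 0x3f));
    out[2] = (char)(0x80 | (cp & 0x3f));
    return 3;
  }
  if (cp <= 0x10ffff) { /* inclusive: 0x10ffff itself is valid */
    out[0] = (char)(0xf0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3f));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3f));
    out[3] = (char)(0x80 | (cp & 0x3f));
    return 4;
  }
  return 0;
}
```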
Protobuf Team Bot
1c13fd0686
first stab at a ZeroCopyStream api.
...
These functions are not yet part of the upb build but this is a good chunk of
work so let's snapshot it now.
PiperOrigin-RevId: 459156286
2 years ago
Protobuf Team Bot
8c44f04697
create and lock down upb/internal/array.h
...
Internal array functions are now implemented in upb/internal/array.c and declared in
upb/internal/array.h, which only has local visibility.
PiperOrigin-RevId: 458260144
2 years ago
Protobuf Team Bot
46e306bead
Move generator shared support code to common target.
...
PiperOrigin-RevId: 458257330
2 years ago
Joshua Haberman
125db89ff5
Added fuzz tests for mini table building and binary format parsing/serialization.
...
PiperOrigin-RevId: 458240180
2 years ago
Protobuf Team Bot
d44834063a
Add UPB_DEPRECATED macro to use for deprecated field code generation.
...
PiperOrigin-RevId: 457996196
2 years ago
Protobuf Team Bot
b6f862bf9f
Fix message clear not updating hasbit when message/group has presence. Add more tests.
...
PiperOrigin-RevId: 457822583
2 years ago
Protobuf Team Bot
da82d15714
Mark arena getter const.
...
PiperOrigin-RevId: 457763772
2 years ago
Protobuf Team Bot
49876f4633
Update sample of using upb_MtDataEncoder
...
PiperOrigin-RevId: 457501667
2 years ago
Protobuf Team Bot
0c78048723
clean the fences for the headers:
...
- some headers were not including port_def.inc
- some headers were not declaring extern "C"
- some headers were backing out of the above in the wrong order
PiperOrigin-RevId: 457391878
2 years ago
Protobuf Team Bot
ca08ff5b74
lock down upb/internal/decode.h
...
PiperOrigin-RevId: 457116753
2 years ago
Protobuf Team Bot
033859ff5d
rename internal/upb.h as internal/encode.h
...
add build target for upb/internal/encode.h and lock down its visibility
PiperOrigin-RevId: 457087638
2 years ago
Protobuf Team Bot
15596be402
move table.c into upb/internal
...
PiperOrigin-RevId: 457044228
2 years ago
Protobuf Team Bot
7975945e61
clean up the dependency graph some more
...
PiperOrigin-RevId: 456890270
2 years ago
Protobuf Team Bot
1695cb2788
rename the upb_Array 'len' field as 'size'
...
Now that 'size' has been renamed as 'capacity' we are free to rename 'len' as
'size', so upb_Array_Size() is actually returning the 'size' field.
PiperOrigin-RevId: 456865972
2 years ago
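The naming that 83f4988561 and 1695cb2788 arrive at (`size` for elements in use, `capacity` for elements allocated) can be sketched with a generic growable array. The type below is hypothetical and uses realloc() for brevity. It does not reflect the actual `upb_Array` layout or its allocation strategy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable int array illustrating the two field names. */
typedef struct {
  int* data;
  size_t size;     /* number of elements currently in use */
  size_t capacity; /* number of elements allocated */
} IntArray;

/* Returns the 'size' field, matching the accessor's name. */
static size_t IntArray_Size(const IntArray* a) { return a->size; }

/* Appends v, doubling capacity when full. Returns 1 on success, 0 on OOM. */
static int IntArray_Append(IntArray* a, int v) {
  assert(a->size <= a->capacity);
  if (a->size == a->capacity) {
    size_t new_cap = a->capacity ? a->capacity * 2 : 4;
    int* p = realloc(a->data, new_cap * sizeof(*p));
    if (!p) return 0;
    a->data = p;
    a->capacity = new_cap;
  }
  a->data[a->size++] = v;
  return 1;
}
```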
Protobuf Team Bot
e153b52394
split out upb_StringView from upb.h
...
PiperOrigin-RevId: 456858455
2 years ago
Protobuf Team Bot
8d0d13f2bc
Fix the dependency chain for internal/arena.h
...
Clean up a few other superfluous #include's and forward declarations
PiperOrigin-RevId: 456851942
2 years ago
Protobuf Team Bot
83f4988561
rename the upb_Array 'size' field as 'capacity'
...
The current field/function names for upb_Array are quite confusing.
We will fix them in two steps, this being the first step.
PiperOrigin-RevId: 456687224
2 years ago