This makes extension checking stricter in most cases. However, it also
fixes a bug with MessageSet where we were being too strict: MessageSet
allows larger extension numbers than normal extensions do.
In a big-endian system, the 64-bit value of 1 is represented as:
```
0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x1
```
However, when `d.int32_val` is used, the value is truncated to its
first four bytes:
```
0x0 0x0 0x0 0x0
```
As a result, the truncation drops the 1 and the value becomes 0. This
doesn't happen in a little-endian system, because the 1 is in the
lowest memory address, so truncating the value to 32 bits doesn't
change anything.
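A minimal, self-contained sketch of the pitfall (the union and its
member names below are only illustrative of the general shape of upb's
default-value storage, not its actual definition):
```
#include <stdint.h>
#include <stdio.h>

/* Illustrative union with hypothetical member names. */
typedef union {
  int64_t int64_val;
  int32_t int32_val;
} defaultval;

int main(void) {
  defaultval d;
  d.int64_val = 1;

  /* On little-endian, int32_val overlays the low-order (value-carrying)
   * bytes of the 64-bit value, so this prints 1.  On big-endian it
   * overlays the high-order (zero) bytes, so this prints 0. */
  printf("%d\n", (int)d.int32_val);
  return 0;
}
```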
Previously the DefToProto test was failing on a big-endian system
because this truncation caused the key to be incorrectly set to 0.
We now use the type-specific functions
(e.g. `upb_fielddef_defaultint32`) to do this conversion.
Closes https://github.com/protocolbuffers/upb/issues/442
The primary motivation for this change is to avoid referring to the
`upb_msglayout` object when we are trying to fetch the `upb_msglayout`
object for a sub-message. This will help pave the way for parsing
extensions. We also implement several optimizations so that we can
make this change without regressing performance.
Normally we compute the layout for a sub-message field like so:
```
const upb_msglayout *get_submsg_layout(const upb_msglayout *layout,
                                       const upb_msglayout_field *field) {
  return layout->submsgs[field->submsg_index];
}
```
The reason for this indirection is to avoid storing a pointer directly
in `upb_msglayout_field`, as that would double its size (from 12 to 24
bytes on 64-bit architectures), which is wasteful since the pointer is
only needed for message-typed fields.
However, `get_submsg_layout` as written above does not work for
extensions, because by their nature they have no entries in the
message's `layout->submsgs` array, and we want to avoid creating an
entire fake `upb_msglayout` for each such extension, since that would
also be wasteful.
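A rough sketch of the size tradeoff (member names, ordering, and the
exact sizes are approximations for illustration, not upb's definitions):
```
#include <stdint.h>

struct upb_msglayout;  /* opaque; only needed for the pointer variant */

/* Roughly the shape upb uses: a small index into the owning layout's
 * submsgs array keeps the struct at 12 bytes. */
typedef struct {
  uint32_t number;        /* field number */
  uint16_t offset;        /* offset of the field data in the message */
  int16_t  presence;      /* hasbit index or oneof case offset */
  uint16_t submsg_index;  /* index into layout->submsgs */
  uint8_t  descriptortype;
  uint8_t  label;
} field_with_index;       /* 12 bytes */

/* Storing the sub-message layout pointer inline would raise the struct's
 * alignment to pointer alignment and roughly double its size on 64-bit
 * targets, even though only message-typed fields need the pointer. */
typedef struct {
  uint32_t number;
  uint16_t offset;
  int16_t  presence;
  const struct upb_msglayout *submsg;
  uint8_t  descriptortype;
  uint8_t  label;
} field_with_pointer;     /* 24 bytes on typical 64-bit architectures */
```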
This change removes the dependency on `upb_msglayout` by passing down
the `submsgs` array instead:
```
const upb_msglayout *get_submsg_layout(const upb_msglayout *const *submsgs,
                                       const upb_msglayout_field *field) {
  return submsgs[field->submsg_index];
}
```
This will pave the way for parsing extensions, as we can more easily
create an alternative `submsgs` array for extension fields without
extra overhead or waste.
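For example, an extension could carry its own one-element `submsgs`
array (a hypothetical sketch, assuming the upb types and the
`get_submsg_layout()` helper above are in scope; `extension_layout` and
`get_extension_submsg()` are made-up names, not upb API):
```
/* Hypothetical: the extension's descriptor owns a one-element submsgs
 * array, so the same lookup works for regular fields and extensions
 * alike, with no fake upb_msglayout per extension. */
typedef struct {
  upb_msglayout_field field;        /* field.submsg_index == 0 */
  const upb_msglayout *submsgs[1];  /* the extension's message type */
} extension_layout;

const upb_msglayout *get_extension_submsg(const extension_layout *ext) {
  return get_submsg_layout(ext->submsgs, &ext->field);
}
```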
Along the way, several optimizations presented themselves that yield a
nice increase in performance:
1. Passing the parsed `wireval` by address instead of by value ended
up avoiding an expensive and useless stack copy (this is on Clang,
which was used for all measurements).
2. When field numbers are densely packed, we can find a field by number
with a single indexed lookup instead of a linear search. At codegen
time we can compute the maximum field number for which such an indexed
lookup is possible (optimizations 2-4 are sketched in code after this
list).
3. For fields that do require linear search, we can start the linear
search at the location where we found the previous field, taking
advantage of the fact that field numbers are generally increasing.
4. When the hasbit index is less than 32 (the common case) we can use
a less expensive code sequence to set it.
5. We check for the hasbit case before the oneof case, as optional
fields are more common than oneof fields.
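A simplified sketch of optimizations 2-4 (all names below are made up
for illustration and do not match upb's decoder internals; the hasbit
storage is assumed here to be an array of 32-bit words):
```
#include <stdint.h>
#include <string.h>

typedef struct {
  uint32_t number;  /* field number */
} field_t;

typedef struct {
  const field_t *fields;
  int field_count;
  /* Field numbers 1..dense_below are densely packed at the front of
   * `fields`, so fields[number - 1] is the entry for `number`. */
  uint32_t dense_below;
} layout_t;

/* (2) + (3): dense field numbers are resolved with one indexed lookup;
 * the rest fall back to a linear scan that resumes where the previous
 * lookup left off, since field numbers on the wire usually increase. */
const field_t *find_field(const layout_t *l, uint32_t number,
                          int *last_field_index) {
  int i = *last_field_index;

  if (l->field_count == 0) return NULL;

  if (number != 0 && number <= l->dense_below) {
    i = (int)number - 1;  /* direct index, no search */
  } else {
    int start = i;
    while (l->fields[i].number != number) {
      i = (i + 1 == l->field_count) ? 0 : i + 1;  /* wrap around */
      if (i == start) return NULL;                /* not present */
    }
  }

  *last_field_index = i;
  return &l->fields[i];
}

/* (4): when the hasbit index is below 32 (the common case), the bit is
 * known to live in the first 32-bit word, so no word offset needs to be
 * computed before setting it. */
void set_hasbit(void *hasbits, uint16_t hasbit) {
  uint32_t word;
  if (hasbit < 32) {
    memcpy(&word, hasbits, sizeof(word));
    word |= (uint32_t)1 << hasbit;
    memcpy(hasbits, &word, sizeof(word));
  } else {
    char *p = (char *)hasbits + (hasbit / 32) * sizeof(word);
    memcpy(&word, p, sizeof(word));
    word |= (uint32_t)1 << (hasbit % 32);
    memcpy(p, &word, sizeof(word));
  }
}
```
A caller would keep `last_field_index` across fields of the same
message, so messages whose fields arrive in ascending order mostly hit
the resume-from-previous fast path.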
Benchmark results indicate a 20% improvement in parse speed with a
small code size increase:
```
name old time/op new time/op delta
ArenaOneAlloc 21.3ns ± 0% 21.5ns ± 0% +0.96% (p=0.000 n=12+12)
ArenaInitialBlockOneAlloc 6.32ns ± 0% 6.32ns ± 0% +0.03% (p=0.000 n=12+10)
LoadDescriptor_Upb 53.5µs ± 1% 51.5µs ± 2% -3.70% (p=0.000 n=12+12)
LoadAdsDescriptor_Upb 2.78ms ± 2% 2.68ms ± 0% -3.57% (p=0.000 n=12+12)
LoadDescriptor_Proto2 240µs ± 0% 240µs ± 0% +0.12% (p=0.001 n=12+12)
LoadAdsDescriptor_Proto2 12.8ms ± 0% 12.7ms ± 0% -1.15% (p=0.000 n=12+10)
Parse_Upb_FileDesc<UseArena,Copy> 13.2µs ± 2% 10.7µs ± 0% -18.49% (p=0.000 n=10+12)
Parse_Upb_FileDesc<UseArena,Alias> 11.3µs ± 0% 9.6µs ± 0% -15.11% (p=0.000 n=12+11)
Parse_Upb_FileDesc<InitBlock,Copy> 12.7µs ± 0% 10.3µs ± 0% -19.00% (p=0.000 n=10+12)
Parse_Upb_FileDesc<InitBlock,Alias> 10.9µs ± 0% 9.2µs ± 0% -15.82% (p=0.000 n=12+12)
Parse_Proto2<FileDesc,NoArena,Copy> 29.4µs ± 0% 29.5µs ± 0% +0.61% (p=0.000 n=12+12)
Parse_Proto2<FileDesc,UseArena,Copy> 20.7µs ± 2% 20.6µs ± 2% ~ (p=0.260 n=12+11)
Parse_Proto2<FileDesc,InitBlock,Copy> 16.7µs ± 1% 16.7µs ± 0% -0.25% (p=0.036 n=12+10)
Parse_Proto2<FileDescSV,InitBlock,Alias> 16.5µs ± 0% 16.5µs ± 0% +0.20% (p=0.016 n=12+11)
SerializeDescriptor_Proto2 5.30µs ± 1% 5.36µs ± 1% +1.09% (p=0.000 n=12+11)
SerializeDescriptor_Upb 12.9µs ± 0% 13.0µs ± 0% +0.90% (p=0.000 n=12+11)
FILE SIZE VM SIZE
-------------- --------------
+1.5% +176 +1.6% +176 upb/decode.c
+1.8% +176 +1.9% +176 decode_msg
+0.4% +64 +0.4% +64 upb/def.c
+1.4% +64 +1.4% +64 _upb_symtab_addfile
+1.2% +48 +1.4% +48 upb/reflection.c
+15% +32 +18% +32 upb_msg_set
+2.9% +16 +3.1% +16 upb_msg_mutable
-9.3% -288 [ = ] 0 [Unmapped]
[ = ] 0 +0.2% +288 TOTAL
```
This brings upb into line with C++. PHP already checks this
internally, so this should not be an issue there. Ruby, on the other
hand, does not currently check this, so this change will cause our
Ruby implementation to reject some programs that would otherwise have
been accepted.