protobuf

Commit Graph

Author	SHA1	Message	Date
Joshua Haberman	0c541f3305	Single encode.	3 years ago
Joshua Haberman	72af9dc0cc	Switch to a single upb_Decode.	3 years ago
Joshua Haberman	499c2cc8b1	upb_extreg, upb_msg	3 years ago
Joshua Haberman	1c955f37ce	Mass API rename and clang-reformat (#485) * Wave 1: upb_fielddef. * upb_fielddef itself. * upb_oneofdef. * upb_msgdef. * ExtensionRange. * upb_enumdef * upb_enumvaldef * upb_filedef * upb_methoddef * upb_servicedef * upb_symtab * upb_defpool_init * upb_wellknown and upb_syntax_t * Some constants. * upb_status * upb_strview * upb_arena * upb.h constants * reflection * encode * JSON decode. * json encode. * msg_internal. * Formatted with clang-format. * Some naming fixups and comment reformatting. * More refinements. * A few more stragglers. * Fixed PyObject_HEAD with semicolon. Removed TODO entries.	3 years ago
Joshua Haberman	3921e02990	Fixed make_cmakelists.py.	3 years ago
Joshua Haberman	a0374b3b08	Added required field checking into the encoder.	3 years ago
Joshua Haberman	c59d8f8eb7	Addressed PR comments and fixed the broken test.	3 years ago
Joshua Haberman	58c1dbc11f	Addressed PR comments.	3 years ago
Joshua Haberman	3d437bbcab	Some pre-PR fixes.	3 years ago
Joshua Haberman	4307f5dbba	Fixed the CMake build and amalgamation.	3 years ago
Joshua Haberman	c755099a89	WIP.	3 years ago
Joshua Haberman	401e1747b5	Addressed PR feedback.	3 years ago
Joshua Haberman	b1bbbdd4e7	Addressed PR comments.	3 years ago
Joshua Haberman	ce012b7b55	Added support for extensions.	3 years ago
Joshua Haberman	7183780b60	Added a Valgrind test that works for Python!	3 years ago
Joshua Haberman	5d8c3db94f	Added copyright header and docs for python_headers().	3 years ago
Joshua Haberman	f098230df8	Exclude fuzz test from non-Clang compilers.	3 years ago
Joshua Haberman	fa4d70fad6	Restore CMake files, we're not ready to delete them yet.	3 years ago
Joshua Haberman	173554146f	Updated some docs and removed/rearranged some obsolete stuff.	3 years ago
Joshua Haberman	c4744c0b21	Updated generated files.	3 years ago
Joshua Haberman	91d506ac32	Ported ABSL's wyhash to C.	3 years ago
Joshua Haberman	6e53de4a03	Addressed PR comments.	4 years ago
Joshua Haberman	cdd6434a31	Introduced upb_extreg and plumbed it into decoder.	4 years ago
Joshua Haberman	58e158c6fa	Changed mini-table to use a custom "mode" instead of descriptor's "label."	4 years ago
Joshua Haberman	65d7b8ab0c	Optimized decoder and paved the way for parsing extensions.

The primary motivation for this change is to avoid referring to the `upb_msglayout` object when we are trying to fetch the `upb_msglayout` object for a sub-message. This will help pave the way for parsing extensions. We also implement several optimizations so that we can make this change without regressing performance.

Normally we compute the layout for a sub-message field like so:

```
const upb_msglayout *get_submsg_layout(
    const upb_msglayout *layout,
    const upb_msglayout_field *field) {
  return layout->submsgs[field->submsg_index];
}
```

The reason for this indirection is to avoid storing a pointer directly in `upb_msglayout_field`, as this would double its size (from 12 to 24 bytes on 64-bit architectures) which is wasteful as this pointer is only needed for message typed fields.

However `get_submsg_layout` as written above does not work for extensions, as they will not have entries in the message's `layout->submsgs` array by nature, and we want to avoid creating an entire fake `upb_msglayout` for each such extension since that would also be wasteful.

This change removes the dependency on `upb_msglayout` by passing down the `submsgs` array instead:

```
const upb_msglayout *get_submsg_layout(
    const upb_msglayout *const *submsgs,
    const upb_msglayout_field *field) {
  return submsgs[field->submsg_index];
}
```

This will pave the way for parsing extensions, as we can more easily create an alternative `submsgs` array for extension fields without extra overhead or waste.

Along the way several optimizations presented themselves that allow a nice increase in performance:

1. Passing the parsed `wireval` by address instead of by value ended up avoiding an expensive and useless stack copy (this is on Clang, which was used for all measurements).
2. When field numbers are densely packed, we can find a field by number with a single indexed lookup instead of linear search. At codegen time we can compute the maximum field number that will allow such an indexed lookup.
3. For fields that do require linear search, we can start the linear search at the location where we found the previous field, taking advantage of the fact that field numbers are generally increasing.
4. When the hasbit index is less than 32 (the common case) we can use a less expensive code sequence to set it.
5. We check for the hasbit case before the oneof case, as optional fields are more common than oneof fields.

Benchmark results indicate a 20% improvement in parse speed with a small code size increase:

```
name                                      old time/op  new time/op  delta
ArenaOneAlloc                             21.3ns ± 0%  21.5ns ± 0%   +0.96%  (p=0.000 n=12+12)
ArenaInitialBlockOneAlloc                 6.32ns ± 0%  6.32ns ± 0%   +0.03%  (p=0.000 n=12+10)
LoadDescriptor_Upb                        53.5µs ± 1%  51.5µs ± 2%   -3.70%  (p=0.000 n=12+12)
LoadAdsDescriptor_Upb                     2.78ms ± 2%  2.68ms ± 0%   -3.57%  (p=0.000 n=12+12)
LoadDescriptor_Proto2                      240µs ± 0%   240µs ± 0%   +0.12%  (p=0.001 n=12+12)
LoadAdsDescriptor_Proto2                  12.8ms ± 0%  12.7ms ± 0%   -1.15%  (p=0.000 n=12+10)
Parse_Upb_FileDesc<UseArena,Copy>         13.2µs ± 2%  10.7µs ± 0%  -18.49%  (p=0.000 n=10+12)
Parse_Upb_FileDesc<UseArena,Alias>        11.3µs ± 0%   9.6µs ± 0%  -15.11%  (p=0.000 n=12+11)
Parse_Upb_FileDesc<InitBlock,Copy>        12.7µs ± 0%  10.3µs ± 0%  -19.00%  (p=0.000 n=10+12)
Parse_Upb_FileDesc<InitBlock,Alias>       10.9µs ± 0%   9.2µs ± 0%  -15.82%  (p=0.000 n=12+12)
Parse_Proto2<FileDesc,NoArena,Copy>       29.4µs ± 0%  29.5µs ± 0%   +0.61%  (p=0.000 n=12+12)
Parse_Proto2<FileDesc,UseArena,Copy>      20.7µs ± 2%  20.6µs ± 2%     ~     (p=0.260 n=12+11)
Parse_Proto2<FileDesc,InitBlock,Copy>     16.7µs ± 1%  16.7µs ± 0%   -0.25%  (p=0.036 n=12+10)
Parse_Proto2<FileDescSV,InitBlock,Alias>  16.5µs ± 0%  16.5µs ± 0%   +0.20%  (p=0.016 n=12+11)
SerializeDescriptor_Proto2                5.30µs ± 1%  5.36µs ± 1%   +1.09%  (p=0.000 n=12+11)
SerializeDescriptor_Upb                   12.9µs ± 0%  13.0µs ± 0%   +0.90%  (p=0.000 n=12+11)

    FILE SIZE        VM SIZE
 --------------  --------------
  +1.5%    +176  +1.6%    +176    upb/decode.c
    +1.8%    +176  +1.9%    +176    decode_msg
  +0.4%     +64  +0.4%     +64    upb/def.c
    +1.4%     +64  +1.4%     +64    _upb_symtab_addfile
  +1.2%     +48  +1.4%     +48    upb/reflection.c
     +15%     +32   +18%     +32    upb_msg_set
    +2.9%     +16  +3.1%     +16    upb_msg_mutable
  -9.3%    -288  [ = ]       0    [Unmapped]
  [ = ]       0  +0.2%    +288    TOTAL
```
	4 years ago
Joshua Haberman	3881393907	Renamed .int.h to _internal.h, for greater clarity.	4 years ago
Joshua Haberman	823eb09694	Update all 2011 dates to 2021.	4 years ago
Joshua Haberman	5f74d43cf9	Re-add some comment text that was accidentally removed.	4 years ago
Joshua Haberman	e59d2c8fa7	Added license headers to all files.	4 years ago
Joshua Haberman	1674f28dd7	Put public message interface into msg.h and moved internal functions to msg.int.h.	4 years ago
Joshua Haberman	f5d2d55007	Deleted the legacy "Handlers" APIs. upb can finally be deserving of its name. This is possible now that all users have been migrated to the new upb_msg APIs.	4 years ago
Joshua Haberman	c7787cbaa1	Fixed a bunch of Clang warnings.

Unfortunately a few of the Clang warnings did not have easy fixes:

../../../../ext/google/protobuf_c/ruby-upb.c: In function ‘fastdecode_err’:
../../../../ext/google/protobuf_c/ruby-upb.c:353:13: warning: function might be candidate for attribute ‘noreturn’ [-Wsuggest-attribute=noreturn]
  353 | const char *fastdecode_err(upb_decstate *d) {
      |             ^~~~~~~~~~~~~~
../../../../ext/google/protobuf_c/ruby-upb.c: In function ‘_upb_decode’:
../../../../ext/google/protobuf_c/ruby-upb.c:867:30: warning: argument ‘buf’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
  867 | bool _upb_decode(const char *buf, size_t size, void *msg,

I even tried to suppress the first error, but it still shows up.	4 years ago
Joshua Haberman	5e550e88f8	Added API for getting fielddef default as a upb_msgval.	4 years ago
Joshua Haberman	5f8bb5de1d	Updated generated code.	4 years ago
Joshua Haberman	e9b79542ad	Added a BUILD file for wyhash. This will make the build more closely resemble the google3 build. The CMake output from this is a bit busted, but the build does succeed.	4 years ago
Joshua Haberman	6c16cba83f	Removed obsolete port.c file.	4 years ago
Joshua Haberman	43c207ea7e	Added CMake dummy rule.	4 years ago
Joshua Haberman	e8f9eac68c	Added #defines UPB_ENABLE_FASTTABLE and UPB_TRY_ENABLE_FASTTABLE. These control whether fasttable decoding is on.	4 years ago
Joshua Haberman	e86541ac1d	Fixed the build after the merge.	4 years ago
Joshua Haberman	7e5bd65098	Plumbed copts (including the crucial -std=c99) to upb_proto_library() aspect.	4 years ago
Joshua Haberman	8f3ee80d46	Drop C89/C90 support and MSVC prior to Visual Studio 2015. upb previously attempted to support C89 and pre-2015 versions of Visual Studio. This was to support older compilers with limited C99 support (particularly MSVC). But as of last August, even gRPC has dropped support for MSVC prior to 2015: `c87276d058`. Therefore it seems safe for upb to no longer attempt C89 support (we were already not truly C89 compliant, with our use of "bool"). We now explicitly require C99 or greater and MSVC 2015 or greater. This cleaned up port_def.inc a fair bit. I took the chance to also remove some obsolete macros.	4 years ago
Joshua Haberman	a274ad786a	Plumbed copts (including the crucial -std=c99) to upb_proto_library() aspect.	4 years ago
Joshua Haberman	efd576b698	Added -std=gnu99 for fastdecode and ran Buildifier.	4 years ago
Joshua Haberman	b928696942	A few more fixes, and test fastdecode under Kokoro.	4 years ago
Joshua Haberman	bd9f8f580d	Fixed a few bugs with the fast decoder. 1. For long tags we were putting table entries in the wrong slot. 2. For repeated strings, when the buffer flipped to no longer alias we were failing to notice and kept aliasing anyway.	4 years ago
Joshua Haberman	3eba47914b	Allocate hasbits and table slots in "hotness" order. Without a profile, we assume that fields with smaller numbers are hotter.	4 years ago
Joshua Haberman	021db6fcd5	Allow larger tags into the table if they are unique mod 31. Also fixed a bug with fixed packed in decode_fast.c.	4 years ago
Joshua Haberman	86d9908c55	Fastdecode support for packed fields. This is not very optimized yet. There is a lot of room to optimize it further.	4 years ago
Joshua Haberman	e3e797b680	Added fasttable support for oneofs.	4 years ago
Joshua Haberman	e2c709e047	Repeated string and primitive support. Much of the code was adapted from Gerben's code in: `6333031195`	4 years ago
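
The field lookup strategy described in commit 65d7b8ab0c (a single indexed lookup when field numbers are densely packed, otherwise a linear search that resumes where the previous field was found) might look roughly like the sketch below. This is only an illustration under stated assumptions: `sketch_field`, `sketch_layout`, `dense_below`, and `sketch_find_field` are hypothetical names invented here, not upb's actual structs or functions, and the real decoder may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not upb's real types. */
typedef struct {
  uint32_t number; /* field number from the .proto definition */
} sketch_field;

typedef struct {
  const sketch_field *fields; /* sorted by ascending field number */
  size_t field_count;
  /* Assumed invariant: fields[i].number == i + 1 for every i < dense_below,
   * and dense_below <= field_count. Computed ahead of time (e.g. codegen). */
  uint32_t dense_below;
} sketch_layout;

/* Finds a field by number, or returns NULL if the number is unknown.
 * *last_idx remembers the index of the previous hit so that the linear
 * scan can resume there on the next call. */
static const sketch_field *sketch_find_field(const sketch_layout *l,
                                             uint32_t number,
                                             size_t *last_idx) {
  assert(l != NULL && last_idx != NULL);
  if (number == 0 || l->field_count == 0) return NULL;

  /* Fast path: densely packed numbers map directly to an array index. */
  if (number <= l->dense_below) {
    *last_idx = number - 1;
    return &l->fields[number - 1];
  }

  /* Slow path: linear scan starting at the previous hit, wrapping around
   * so that out-of-order field numbers are still found. */
  size_t start = *last_idx < l->field_count ? *last_idx : 0;
  for (size_t n = 0; n < l->field_count; n++) {
    size_t i = start + n;
    if (i >= l->field_count) i -= l->field_count;
    if (l->fields[i].number == number) {
      *last_idx = i;
      return &l->fields[i];
    }
  }
  return NULL;
}
```

Because wire data usually arrives with increasing field numbers, resuming the scan at the previous hit tends to find the next field within a step or two; the wrap-around keeps the lookup correct when that assumption does not hold.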

61 Commits (205a7ea8b16bf430f2040e973f9add7c8f98f178)