protobuf

Commit Graph

Author	SHA1	Message	Date
Joshua Haberman	ded2e657a7	Added compatibility with old generated code. Until everyone can regenerate their code, we need to provide compatible semantics with the old generated code. Also fixed a bug where enums were allocated 8 bytes instead of 4.	4 years ago
Joshua Haberman	75edd3e59c	Changed to use table pairs, seems to ever-so-slightly regress.	4 years ago
Joshua Haberman	bca7edac8c	Cleaned up table compression a bit.	4 years ago
Joshua Haberman	a6dc88556d	Tables are compressed, but perf goes down to 2.44GB/s.	4 years ago
Joshua Haberman	a4966fd230	Added a few extra sanity checks.	4 years ago
Joshua Haberman	99acbe0da8	Fixed bug where submsg array could have excess elements. Before we were allocating an array element for every sub-message field, even if two different fields had messages of the same type.	4 years ago
Gerben Stavenga	3f719fa6b2	Bugfix: offsetting hasbits with 16 introduced a bug in calculating hasmasks. Removing extra <<16 shift in hasmask calculating and masking out the first 16 bits. This makes messages without hasbits work as well.	4 years ago
Gerben Stavenga	4053805759	Bugfixes	4 years ago
Joshua Haberman	71749b7caf	Implemented inline array allocation, and moved type->lg2 map to reflection.	4 years ago
Joshua Haberman	9557b97acc	Implemented inline array allocation, and moved type->lg2 map to reflection.	4 years ago
Gerben Stavenga	36662b3735	Refactor some code. I extracted some common code from all message field parsers into a tail-recursive function. Replaced the varint jmp table with a simple varint parse loop, which removes the stack frames. Also took care not to lose information in the repeated-message tag check. When written carefully, the checks and loads that already happen can be reused for tag dispatch when the tag is not the expected one.	4 years ago
Joshua Haberman	9938cf8f27	Put submsg_index directly in table data. Drop oneof support for now to focus.	4 years ago
Joshua Haberman	526e430794	I think this may have reached the optimization limit.	4 years ago
    -------------------------------------------------------------------------
    Benchmark                               Time       CPU   Iterations
    -------------------------------------------------------------------------
    BM_ArenaOneAlloc                       21 ns     21 ns     32994231
    BM_ArenaInitialBlockOneAlloc            6 ns      6 ns    116318005
    BM_ParseDescriptorNoHeap             3028 ns   3028 ns       231138   2.34354GB/s
    BM_ParseDescriptor                   3557 ns   3557 ns       196583   1.99498GB/s
    BM_ParseDescriptorProto2NoArena     33228 ns  33226 ns        21196   218.688MB/s
    BM_ParseDescriptorProto2WithArena   22863 ns  22861 ns        30666   317.831MB/s
    BM_SerializeDescriptorProto2         5444 ns   5444 ns       127368   1.30348GB/s
    BM_SerializeDescriptor              12509 ns  12508 ns        55816   580.914MB/s
    $ perf stat bazel-bin/benchmark --benchmark_filter=BM_ParseDescriptorNoHeap
    2020-10-08 14:07:06
    Running bazel-bin/benchmark
    Run on (72 X 3700 MHz CPU s)
    CPU Caches:
      L1 Data 32K (x36)
      L1 Instruction 32K (x36)
      L2 Unified 1024K (x36)
      L3 Unified 25344K (x2)
    ----------------------------------------------------------------
    Benchmark                      Time       CPU   Iterations
    ----------------------------------------------------------------
    BM_ParseDescriptorNoHeap    3071 ns   3071 ns       227743   2.31094GB/s
    Performance counter stats for 'bazel-bin/benchmark --benchmark_filter=BM_ParseDescriptorNoHeap':
              1,050.22 msec task-clock        #    0.978 CPUs utilized
                     4      context-switches  #    0.004 K/sec
                     0      cpu-migrations    #    0.000 K/sec
                   179      page-faults       #    0.170 K/sec
         3,875,796,334      cycles            #    3.690 GHz
        13,282,835,967      instructions      #    3.43  insn per cycle
         2,887,725,848      branches          # 2749.627 M/sec
             8,324,912      branch-misses     #    0.29% of all branches
           1.073924364 seconds time elapsed
           1.042806000 seconds user
           0.008021000 seconds sys
    Profile:
      23.96%  benchmark  benchmark          [.] upb_prm_1bt_max192b
      22.44%  benchmark  benchmark          [.] fastdecode_dispatch
      18.96%  benchmark  benchmark          [.] upb_pss_1bt
      14.20%  benchmark  benchmark          [.] upb_psv4_1bt
       8.33%  benchmark  benchmark          [.] upb_prm_1bt_max64b
       6.66%  benchmark  benchmark          [.] upb_prm_1bt_max128b
       1.29%  benchmark  benchmark          [.] upb_psm_1bt_max64b
       0.77%  benchmark  benchmark          [.] fastdecode_generic
       0.55%  benchmark  [kernel.kallsyms]  [k] smp_call_function_single
       0.42%  benchmark  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
       0.42%  benchmark  benchmark          [.] upb_psm_1bt_max256b
       0.31%  benchmark  benchmark          [.] upb_psb1_1bt
       0.21%  benchmark  benchmark          [.] upb_plv4_5bv
       0.14%  benchmark  benchmark          [.] upb_psb1_2bt
       0.12%  benchmark  benchmark          [.] decode_longvarint64
       0.08%  benchmark  [kernel.kallsyms]  [k] vsnprintf
       0.07%  benchmark  [kernel.kallsyms]  [k] _raw_spin_lock
       0.07%  benchmark  benchmark          [.] _upb_msg_new
       0.06%  benchmark  ld-2.31.so         [.] check_match
Joshua Haberman	52a0ed3891	Fixed a bug with tag number 15.	4 years ago
Joshua Haberman	9e5c5ce089	Optimized memset() with cutoff and fixed group & unknown message bugs.	4 years ago
Joshua Haberman	88b1ec7784	Table-driven supports repeated sub-messages.	4 years ago
Joshua Haberman	f173642db4	Handle non-repeated submessages.	4 years ago
Joshua Haberman	7ec2c52346	Donate/steal from arena to accelerate decoding.	4 years ago
Joshua Haberman	fac992db83	Cleanup for showing.	4 years ago
Joshua Haberman	438ecaeb5a	Give all field parsers a generic table entry.	4 years ago
Joshua Haberman	a77ea639d5	Verify UTF-8 when parsing proto3 string fields.	4 years ago
Joshua Haberman	8f11ec57d2	Applied changes from google3.	5 years ago
Joshua Haberman	b717575cef	Added -Wextra and -Wshorten-64-to-32 and fixed resulting errors. (#289)	5 years ago
    * Added -Wextra and -Wshorten-64-to-32 and fixed resulting errors.
    * Disable -Wshorten-32-to-64 since Kokoro is missing Clang.
    * Fixed -Wextra warnings for gcc.
    * Reordered UPB_UNUSED() to come after declarations.
    * Added another -pedantic fix and log CC version.
    * Fix compile error and conditionally run use_bazel.sh.
    * Moved set -e after use_bazel.sh.
    * Fixed typo in conditional.
Joshua Haberman	634d37515c	Bugfix for oneofs and added line/col info to JSON.	5 years ago
Joshua Haberman	543a0ce8f2	Fixes for PHP. (#286)	5 years ago
    - A new PHP-specific upb amalgamation. It contains everything related to upb_msg, but leaves out all of the old handlers-related interfaces and encoders/decoders.
    # Schema/Defs Changes
    - Changed `upb_fielddef_msgsubdef()` and `upb_fielddef_enumsubdef()` to return `NULL` instead of assert-failing if the field is not a message or enum.
    - Added `upb_msgdef_iswrapper()`, to test whether this is a wrapper well-known type.
    # Decoder
    - Decoder bugfix: when we parse a submessage inside a oneof, we need to clear out any previous data, so we don't misinterpret it as a pointer to an existing submessage.
    # JSON Decoder
    - Allowed well-known types at the top level to have their special processing.
    - Fixed a bug that could occur when parsing nested empty lists/objects, eg `[[]]`.
    - Made the "ignore unknown" option also be permissive about unknown enumerators by setting them to 0.
    # JSON Encoder
    - Allowed well-known types at the top level to have their special processing.
    - Removed all spaces after `:` and `,` characters, to match the old encoder and pass goldenfile tests.
    # Message / Reflection
    - Changed `upb_msg_hasoneof()` -> `upb_msg_whichoneof()`. The new function returns the `upb_fielddef*` of whichever oneof is set.
    - Implemented `upb_msg_clearfield()` and added/implemented `upb_msg_clear()`.
    - Added `upb_msg_discardunknown()`. Part of me thinks this should go in a util library instead of core reflection since it is a recursive algorithm.
    # Compiler
    - Always emit descriptors as an array instead of as a string, to avoid exceeding maximum string lengths. If this becomes a speed issue later we can go back to two separate paths.
Joshua Haberman	0842f88211	Support for proto3 optional. (#270)	5 years ago
    * Added support for proto3 optional to defs.
    * Added proto3 optional support. Untested at the moment.
    * Changes to support proto3 optional.
    * Fixed real oneof count for messages with no fields.
    * Fixed compile error and test.
    * Added comment about why I'm commenting out the assert.
Joshua Haberman	38a1045975	Added a has_foo() generated method for proto3 submessage fields. (#266) This is better than checking against NULL, because in the future unset fields will (probably) return a default instance instead of NULL.	5 years ago
Joshua Haberman	378cbbc3cc	Updated to new protobuf version, and added support for packed=false. (#264)	5 years ago
    * WIP.
    * Passes most tests.
    * A few fixes.
    * A few optimizations.
    * Some more optimization.
    * Update Protobuf to v3.11.4 and Abseil to LTS 2020-02-25
    * Use longjmp instead of explicit error checks at every level.
    * Used macros for better documentation of ops.
    * Fixed bug with map parsing. All tests are passing except a few conformance tests.
    * Fixed remaining bugs, all conformance tests pass. Also ported all of upb to a single UPB_PTR_AT() macro instead of having multiple .c files define their own.
    * Formatted with clang-format.
    * Fixes to compile on Linux.
    * A few more compile fixes.
    * Script to benchmark changes.
    * Fixed parenthesis bug in op calculation.
    * Updated generated descriptor files.
    * WIP.
    * Removed trailing enum to fix the Linux build.
    * Respect packed=false to fix conformance failures in new protobuf version.
    * Small simplification.
    * Fixes to decoder.
    * Removed stray comment.
    Co-authored-by: Yannic Bonenberger <contact@yannic-bonenberger.com>
Joshua Haberman	08b6d2d6fd	Rewrite of the decoder (#263)	5 years ago
    New code is smaller (in both source size and compiled size) and faster.
    # Speed
    The decoder speeds up on all machines I tested, though the amount of speedup varies. I was only able to test Intel CPUs.
    ### Linux Desktop
    ```
    CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
    OS: Linux
    name             old time/op  new time/op  delta
    CreateArena      4.72ns ± 0%  4.93ns ± 0%   +4.47%  (p=0.000 n=11+11)
    ParseDescriptor  12.4µs ± 1%   9.1µs ± 1%  -26.65%  (p=0.000 n=11+11)
    ```
    ### Mac Laptop
    ```
    CPU: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
    OS: macOS
    name             old time/op  new time/op  delta
    CreateArena      5.33ns ± 3%  5.58ns ± 2%   +4.69%  (p=0.000 n=12+12)
    ParseDescriptor  15.0µs ± 2%  11.9µs ± 2%  -20.20%  (p=0.000 n=12+12)
    ```
    ### Linux Workstation
    ```
    CPU: Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
    OS: Linux
    name             old time/op  new time/op  delta
    CreateArena      5.29ns ± 0%  5.52ns ± 0%   +4.37%  (p=0.000 n=10+12)
    ParseDescriptor  18.6µs ± 0%  16.4µs ± 0%  -11.54%  (p=0.000 n=12+12)
    ```
    # Size
    A few source files grow marginally because of some arena functionality moved inline. But `upb/decode.c` shrinks by 30% on Linux:
    ```
        VM SIZE
     --------------
      +2.1%    +283    upb/json_decode.c
       +24%    +205    upb/msg.c
      +8.4%    +115    upb/upb.c
      +0.9%     +28    upb/reflection.c
      [ = ]       0    upb/def.c
      [ = ]       0    upb/encode.c
      [ = ]       0    upb/json_encode.c
      [ = ]       0    upb/table.c
     -30.3% -1.51Ki    upb/decode.c
      -0.7%    -738    TOTAL
    ```
Joshua Haberman	b409f8cd85	Fixed code generator for upbdefs when a file has no messages.	5 years ago
Joshua Haberman	ca512852f3	Fixed parsing for string->double maps. (#243) Map parsing/serializing relies on map entries always having a predictable order. The code that generates layout was not respecting this in the case of string keys and primitive values.	5 years ago
Joshua Haberman	2a85bef825	Generated code interface for maps is complete, though not yet tested.	5 years ago
Joshua Haberman	382f92a87f	Maps encode and decode successfully!	5 years ago
Joshua Haberman	4c57b1fefd	More progress on Lua extension.	5 years ago
Joshua Haberman	5239655b99	WIP.	5 years ago
Joshua Haberman	23825332e1	WIP.	5 years ago
Joshua Haberman	dc58b657ee	New reflection API doesn't need types as parameters for map/array. All tests are passing again.	5 years ago
Joshua Haberman	c486da3970	WIP.	5 years ago
Joshua Haberman	ba0a2fb955	Compiles, doesn't work yet.	6 years ago
Joshua Haberman	c58541ea04	Added support for public dependencies.	6 years ago
Joshua Haberman	ef9499cb44	Migrate std::unordered_map -> absl::flat_hash_map.	6 years ago
Joshua Haberman	151ebc8a29	Fixed oneof case accessor to cast to enum for C++.	6 years ago
Josh Haberman	0c64c4b594	WIP.	6 years ago
Joshua Haberman	cf35baa1ad	Moved macros from upb.h to port_def.inc to avoid leaking them to users. (#160)	6 years ago
    * Use port_def.inc to prevent macros from leaking to users.
    * Added helpful comments to port_def.inc/port_undef.inc.
Josh Haberman	e86e198690	Changed enums to be open int32_t. This also fixes a bug where generated code wouldn't compile if the field's enum was defined in another file.	6 years ago
Josh Haberman	32e3f394b4	A few small API tweaks. - Foo_parsenew() -> Foo_parse(). - parse function takes plain (const char*, size_t) instead of upb_strview. The latter is mainly useful for strings inside message objects.	6 years ago
Joshua Haberman	cb26d883d1	WIP.	6 years ago
Josh Haberman	9ea6bb4678	Renamed upb_stringview -> upb_strview for C terseness.	6 years ago
Josh Haberman	aac4d03420	Standardize on package_name_Message_mutable_foo() for mutable accessors.	6 years ago
Joshua Haberman	0ce9b81815	Fixed bugs in array accessors.	6 years ago
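
Commit 36662b3735 above mentions replacing a varint jump table with "a simple varint parse loop". For readers unfamiliar with the idea, the following is a minimal sketch of such a loop for the protobuf base-128 varint encoding. It is not upb's code: the function name `sketch_decode_varint64` and its signature are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of upb's API.
 * Decodes one base-128 varint from [ptr, end) into *out.
 * Returns a pointer just past the varint, or NULL if the input is
 * truncated or the varint is longer than 10 bytes. */
static const uint8_t *sketch_decode_varint64(const uint8_t *ptr,
                                             const uint8_t *end,
                                             uint64_t *out) {
  uint64_t val = 0;
  assert(out != NULL);
  /* A 64-bit varint uses at most 10 bytes: shifts 0, 7, ..., 63. */
  for (int shift = 0; shift < 70; shift += 7) {
    if (ptr == end) return NULL; /* ran out of input mid-varint */
    uint8_t byte = *ptr++;
    /* The low 7 bits carry payload, least-significant group first. */
    val |= (uint64_t)(byte & 0x7f) << shift;
    /* A clear high bit marks the last byte of the varint. */
    if ((byte & 0x80) == 0) {
      *out = val;
      return ptr;
    }
  }
  return NULL; /* continuation bit still set after 10 bytes */
}
```

As a usage example, the two bytes `0xAC 0x02` decode to 300, and the function returns a pointer two bytes past the start of the buffer. A production decoder would typically special-case the one-byte form; this sketch keeps a single loop for clarity.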


287 Commits (3475ebec9484eb61e35c43c1aace669c4d16a25f)