Joshua Haberman
823eb09694
Update all 2011 dates to 2021.
4 years ago
Joshua Haberman
e59d2c8fa7
Added license headers to all files.
4 years ago
Joshua Haberman
1674f28dd7
Put public message interface into msg.h and moved internal functions to msg.int.h.
4 years ago
Esun Kim
9b020d8f65
Optimize calls to std::string::find() and friends for a single char.
4 years ago
Joshua Haberman
7a54a5f3d6
Split the code generators for .upb and .upbdefs.
Before there was a single code generator that generated both
.upb and .upbdefs, even though they are generated by different
rules. This worked fine as long as the codegen steps were
sandboxed, but if not it led to build errors.
Fixes https://github.com/protocolbuffers/upb/issues/354.
4 years ago
Joshua Haberman
65d166a6ba
Added API for copy vs. alias and added benchmarks to test both.
Benchmark output:
$ bazel-bin/benchmarks/benchmark '--benchmark_filter=BM_Parse'
2020-11-11 15:39:04
Running bazel-bin/benchmarks/benchmark
Run on (72 X 3700 MHz CPU s)
CPU Caches:
L1 Data 32K (x36)
L1 Instruction 32K (x36)
L2 Unified 1024K (x36)
L3 Unified 25344K (x2)
-------------------------------------------------------------------------------------
Benchmark                                             Time         CPU   Iterations
-------------------------------------------------------------------------------------
BM_Parse_Upb_FileDesc<UseArena, Copy>              4134 ns     4134 ns       168714 1.69152GB/s
BM_Parse_Upb_FileDesc<UseArena, Alias>             3487 ns     3487 ns       199509 2.00526GB/s
BM_Parse_Upb_FileDesc<InitBlock, Copy>             3727 ns     3726 ns       187581 1.87643GB/s
BM_Parse_Upb_FileDesc<InitBlock, Alias>            3110 ns     3110 ns       224970 2.24866GB/s
BM_Parse_Proto2<FileDesc, NoArena, Copy>          31132 ns    31132 ns        22437 229.995MB/s
BM_Parse_Proto2<FileDesc, UseArena, Copy>         21011 ns    21009 ns        33922 340.812MB/s
BM_Parse_Proto2<FileDesc, InitBlock, Copy>        17976 ns    17975 ns        38808 398.337MB/s
BM_Parse_Proto2<FileDescSV, InitBlock, Alias>     17357 ns    17356 ns        40244 412.539MB/s
4 years ago
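The copy-versus-alias distinction benchmarked in this commit can be pictured with a small sketch. The names below (`str_view`, `read_string`, `READ_COPY`, `READ_ALIAS`) are hypothetical and are not upb's API, and plain `malloc` stands in for an arena.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string view: pointer plus length, no NUL terminator. */
typedef struct {
  const char *data;
  size_t size;
} str_view;

typedef enum { READ_COPY, READ_ALIAS } read_mode;

/* Returns a view of `len` bytes at `src`.
 * READ_ALIAS: the view points into the input buffer (no allocation),
 *             so the input must outlive the parsed message.
 * READ_COPY:  the bytes are duplicated into fresh storage (malloc here,
 *             standing in for an arena); the caller frees view.data.
 * On allocation failure the returned view has data == NULL. */
static str_view read_string(const char *src, size_t len, read_mode mode) {
  str_view v = {src, len};
  assert(src != NULL);
  if (mode == READ_COPY) {
    char *copy = malloc(len ? len : 1);
    if (copy != NULL) memcpy(copy, src, len);
    v.data = copy;
  }
  return v;
}
```

Aliasing skips the allocation and the `memcpy`, which is consistent with the Alias rows being faster in the table above, at the cost of tying the message's lifetime to the input buffer.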
Joshua Haberman
a01f3e23a4
Fixes for google3 build, and exclude even more tests from macOS to avoid timeout.
4 years ago
Joshua Haberman
154f2c25f4
Added UTF-8 validation for proto3 string fields.
4 years ago
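For readers unfamiliar with what such validation involves, here is a minimal standalone validator. It is a sketch of the general technique, not the code from this commit; the function name `is_valid_utf8` is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if buf[0..len) is well-formed UTF-8: no overlong forms,
 * no UTF-16 surrogates (U+D800..U+DFFF), nothing above U+10FFFF. */
static bool is_valid_utf8(const unsigned char *buf, size_t len) {
  size_t i = 0;
  assert(buf != NULL || len == 0);
  while (i < len) {
    unsigned char b = buf[i];
    size_t extra;      /* number of continuation bytes expected */
    unsigned long cp;  /* decoded code point */
    unsigned long min; /* smallest code point allowed at this length */
    if (b < 0x80) { i++; continue; }
    else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
    else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
    else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
    else return false; /* stray continuation byte or invalid lead byte */
    if (len - i <= extra) return false; /* truncated sequence */
    for (size_t k = 1; k <= extra; k++) {
      unsigned char c = buf[i + k];
      if ((c & 0xC0) != 0x80) return false;
      cp = (cp << 6) | (c & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF) return false;
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;
    i += extra + 1;
  }
  return true;
}
```

A production decoder would likely add an ASCII fast path that checks several bytes at once; this version favors clarity.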
Joshua Haberman
e8f9eac68c
Added #defines UPB_ENABLE_FASTTABLE and UPB_TRY_ENABLE_FASTTABLE.
These control whether fasttable decoding is on.
4 years ago
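The commit message does not spell out how the two macros differ. One plausible arrangement, sketched below, treats `UPB_ENABLE_FASTTABLE` as "force on, fail the build if unsupported" and `UPB_TRY_ENABLE_FASTTABLE` as "on where supported". That reading, the platform test, and every `SKETCH_`/`sketch_` name are assumptions made for illustration only.

```c
#include <assert.h>

/* Assumption for this sketch: the fast decoder needs x86-64 and a
 * GCC-compatible compiler. The real support condition may differ. */
#if defined(__x86_64__) && defined(__GNUC__)
#define SKETCH_FASTTABLE_SUPPORTED 1
#else
#define SKETCH_FASTTABLE_SUPPORTED 0
#endif

#if defined(UPB_ENABLE_FASTTABLE)
/* Forced on: fail the build rather than silently fall back. */
#if !SKETCH_FASTTABLE_SUPPORTED
#error "fasttable requested on an unsupported platform"
#endif
#define SKETCH_FASTTABLE 1
#elif defined(UPB_TRY_ENABLE_FASTTABLE)
/* Best effort: on where supported, off elsewhere. */
#define SKETCH_FASTTABLE SKETCH_FASTTABLE_SUPPORTED
#else
#define SKETCH_FASTTABLE 0
#endif

/* Lets callers and tests observe which decoder was compiled in. */
static int sketch_fasttable_enabled(void) {
  assert(SKETCH_FASTTABLE == 0 || SKETCH_FASTTABLE == 1);
  return SKETCH_FASTTABLE;
}
```

Under this reading, a build would pass `-DUPB_ENABLE_FASTTABLE` for benchmarking, where a silent fallback would skew results, and `-DUPB_TRY_ENABLE_FASTTABLE` for portable releases.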
Joshua Haberman
bd9f8f580d
Fixed a few bugs with the fast decoder.
1. For long tags we were putting table entries in the wrong slot.
2. For repeated strings, when the buffer flipped to no longer alias we
were failing to notice and kept aliasing anyway.
4 years ago
Joshua Haberman
3eba47914b
Allocate hasbits and table slots in "hotness" order.
Without a profile, we assume that fields with smaller numbers
are hotter.
4 years ago
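The no-profile heuristic described in this commit amounts to sorting fields by field number and handing out positions in that order. A minimal sketch, with a hypothetical `field_info` record (upb's real layout structures differ):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical field record; upb's real layout structures differ. */
typedef struct {
  int number; /* field number from the .proto file */
  int hasbit; /* assigned below; 0 is the hottest position */
} field_info;

/* Orders fields by ascending field number, the no-profile heuristic. */
static int by_number(const void *a, const void *b) {
  int x = ((const field_info *)a)->number;
  int y = ((const field_info *)b)->number;
  return (x > y) - (x < y);
}

/* Sorts `fields` hottest-first and hands out hasbits in that order. */
static void assign_hasbits(field_info *fields, size_t n) {
  assert(fields != NULL || n == 0);
  if (n == 0) return;
  qsort(fields, n, sizeof *fields, by_number);
  for (size_t i = 0; i < n; i++) fields[i].hasbit = (int)i;
}
```

With a real profile, only the comparison function would need to change, for example to compare observed field frequencies instead of field numbers.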
Joshua Haberman
021db6fcd5
Allow larger tags into the table if they are unique mod 31.
Also fixed a bug with fixed packed in decode_fast.c.
4 years ago
Joshua Haberman
86d9908c55
Fastdecode support for packed fields.
This is not very optimized yet. There is a lot of room to
optimize it further.
4 years ago
Joshua Haberman
e3e797b680
Added fasttable support for oneofs.
4 years ago
Joshua Haberman
e2c709e047
Repeated string and primitive support.
Much of the code was adapted from Gerben's code in:
6333031195
4 years ago
Joshua Haberman
a345af9883
Added a codegen parameter for whether fasttables are generated or not.
Example:
$ CC=clang bazel build -c opt --copt=-g benchmarks:benchmark --//:fasttable_enabled=false
INFO: Build option --//:fasttable_enabled has changed, discarding analysis cache.
INFO: Analyzed target //benchmarks:benchmark (0 packages loaded, 913 targets configured).
INFO: Found 1 target...
Target //benchmarks:benchmark up-to-date:
bazel-bin/benchmarks/benchmark
INFO: Elapsed time: 0.760s, Critical Path: 0.58s
INFO: 7 processes: 1 internal, 6 linux-sandbox.
INFO: Build completed successfully, 7 total actions
$ bazel-bin/benchmarks/benchmark --benchmark_filter=BM_Parse_Upb
------------------------------------------------------------------------------
Benchmark                                      Time         CPU   Iterations
------------------------------------------------------------------------------
BM_Parse_Upb_FileDesc_WithArena               10985 ns    10984 ns        63567 651.857MB/s
BM_Parse_Upb_FileDesc_WithInitialBlock        10556 ns    10554 ns        66138 678.458MB/s
$ CC=clang bazel build -c opt --copt=-g benchmarks:benchmark --//:fasttable_enabled=true
INFO: Build option --//:fasttable_enabled has changed, discarding analysis cache.
INFO: Analyzed target //benchmarks:benchmark (0 packages loaded, 913 targets configured).
INFO: Found 1 target...
Target //benchmarks:benchmark up-to-date:
bazel-bin/benchmarks/benchmark
INFO: Elapsed time: 0.744s, Critical Path: 0.58s
INFO: 7 processes: 1 internal, 6 linux-sandbox.
INFO: Build completed successfully, 7 total actions
$ bazel-bin/benchmarks/benchmark --benchmark_filter=BM_Parse_Upb
------------------------------------------------------------------------------
Benchmark                                      Time         CPU   Iterations
------------------------------------------------------------------------------
BM_Parse_Upb_FileDesc_WithArena                3284 ns     3284 ns       213495 2.1293GB/s
BM_Parse_Upb_FileDesc_WithInitialBlock         2882 ns     2882 ns       243069 2.4262GB/s
Biggest unknown is whether this parameter should default to true or false.
4 years ago
Joshua Haberman
2c1664906a
Removed license comments and upb_amalgamation for google3.
4 years ago
Joshua Haberman
b7dc77415a
Added licenses() to all BUILD files.
4 years ago
Joshua Haberman
e3f41de6c7
Split monolithic BUILD file into many build files.
4 years ago
gerben-s
9e68ec033f
Add repeated varints and fixed parsers
4 years ago
Joshua Haberman
b9f1b67d07
Use quoted include.
4 years ago
Joshua Haberman
1c8c16b9b1
Use quoted include.
4 years ago
Joshua Haberman
c81113e60f
Added fallback code for when no enum matches.
4 years ago
Joshua Haberman
ded2e657a7
Added compatibility with old generated code.
Until everyone can regenerate their code, we need to provide
compatible semantics with the old generated code.
Also fixed a bug where enums were allocated 8 bytes instead
of 4.
4 years ago
Joshua Haberman
75edd3e59c
Changed to use table pairs; this seems to regress ever so slightly.
4 years ago
Joshua Haberman
bca7edac8c
Cleaned up table compression a bit.
4 years ago
Joshua Haberman
a6dc88556d
Tables are compressed, but perf goes down to 2.44GB/s.
4 years ago
Joshua Haberman
a4966fd230
Added a few extra sanity checks.
4 years ago
Joshua Haberman
99acbe0da8
Fixed bug where submsg array could have excess elements.
Before we were allocating an array element for every sub-message
field, even if two different fields had messages of the same type.
4 years ago
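The fix described in this commit boils down to allocating one array slot per distinct sub-message type rather than one per field. A minimal sketch of that deduplication, using a hypothetical helper `dedupe_submsgs` that compares types by pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Given the message type of each sub-message field, fills index[i] with
 * that field's slot in a deduplicated type array and writes the distinct
 * types to `unique`, which must have room for n entries.
 * Returns the number of distinct types. Types are compared by pointer. */
static size_t dedupe_submsgs(const void *const *types, size_t n,
                             const void **unique, size_t *index) {
  size_t count = 0;
  assert(n == 0 || (types && unique && index));
  for (size_t i = 0; i < n; i++) {
    size_t j = 0;
    while (j < count && unique[j] != types[i]) j++; /* linear search */
    if (j == count) unique[count++] = types[i];     /* first sighting */
    index[i] = j;
  }
  return count;
}
```

The linear search is quadratic in the worst case, which is usually acceptable for the number of sub-message fields in a single message.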
Gerben Stavenga
3f719fa6b2
Bugfix: offsetting hasbits by 16 introduced a bug in calculating hasmasks.
Fixed by removing the extra <<16 shift in the hasmask calculation and masking
out the first 16 bits. This makes messages without hasbits work as well.
4 years ago
Gerben Stavenga
4053805759
Bugfixes
4 years ago
Joshua Haberman
71749b7caf
Implemented inline array allocation, and moved type->lg2 map to reflection.
4 years ago
Joshua Haberman
9557b97acc
Implemented inline array allocation, and moved type->lg2 map to reflection.
4 years ago
Gerben Stavenga
36662b3735
Refactor some code. I extracted some common code from all message field
parsers into a tail-recursive function. Replaced the varint jmp table with
a simple varint parse loop, which removes the stack frames. Also took care
not to lose information in the repeated message tag check: when written
mindfully, the checks and loads that happen can be reused for tag dispatch
if the tag is not the expected one.
4 years ago
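A "simple varint parse loop" of the kind this commit mentions can be sketched as follows. This is a generic decoder for the protobuf base-128 varint encoding, not the code from the commit; the name `decode_varint64` and its signature are chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes one base-128 varint from [p, end). On success stores the value
 * in *out and returns the first byte past it; returns NULL if the input
 * is truncated or the varint runs past 10 bytes. */
static const uint8_t *decode_varint64(const uint8_t *p, const uint8_t *end,
                                      uint64_t *out) {
  uint64_t val = 0;
  assert(p != NULL && end != NULL && out != NULL);
  for (int shift = 0; shift < 70 && p < end; shift += 7) {
    uint8_t byte = *p++;
    val |= (uint64_t)(byte & 0x7F) << shift; /* low 7 bits are payload */
    if ((byte & 0x80) == 0) {                /* high bit clear: last byte */
      *out = val;
      return p;
    }
  }
  return NULL;
}
```

For example, the two bytes `0xAC 0x02` decode to 300: the first byte contributes 44 and the second contributes 2 shifted left by 7.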
Joshua Haberman
9938cf8f27
Put submsg_index directly in table data. Drop oneof support for now to focus.
4 years ago
Joshua Haberman
526e430794
I think this may have reached the optimization limit.
-------------------------------------------------------------------------
Benchmark                                 Time         CPU   Iterations
-------------------------------------------------------------------------
BM_ArenaOneAlloc                             21 ns       21 ns     32994231
BM_ArenaInitialBlockOneAlloc                  6 ns        6 ns    116318005
BM_ParseDescriptorNoHeap                   3028 ns     3028 ns       231138 2.34354GB/s
BM_ParseDescriptor                         3557 ns     3557 ns       196583 1.99498GB/s
BM_ParseDescriptorProto2NoArena           33228 ns    33226 ns        21196 218.688MB/s
BM_ParseDescriptorProto2WithArena         22863 ns    22861 ns        30666 317.831MB/s
BM_SerializeDescriptorProto2               5444 ns     5444 ns       127368 1.30348GB/s
BM_SerializeDescriptor                    12509 ns    12508 ns        55816 580.914MB/s
$ perf stat bazel-bin/benchmark --benchmark_filter=BM_ParseDescriptorNoHeap
2020-10-08 14:07:06
Running bazel-bin/benchmark
Run on (72 X 3700 MHz CPU s)
CPU Caches:
L1 Data 32K (x36)
L1 Instruction 32K (x36)
L2 Unified 1024K (x36)
L3 Unified 25344K (x2)
----------------------------------------------------------------
Benchmark                        Time         CPU   Iterations
----------------------------------------------------------------
BM_ParseDescriptorNoHeap          3071 ns     3071 ns       227743 2.31094GB/s
Performance counter stats for 'bazel-bin/benchmark --benchmark_filter=BM_ParseDescriptorNoHeap':
1,050.22 msec task-clock # 0.978 CPUs utilized
4 context-switches # 0.004 K/sec
0 cpu-migrations # 0.000 K/sec
179 page-faults # 0.170 K/sec
3,875,796,334 cycles # 3.690 GHz
13,282,835,967 instructions # 3.43 insn per cycle
2,887,725,848 branches # 2749.627 M/sec
8,324,912 branch-misses # 0.29% of all branches
1.073924364 seconds time elapsed
1.042806000 seconds user
0.008021000 seconds sys
Profile:
23.96% benchmark benchmark [.] upb_prm_1bt_max192b
22.44% benchmark benchmark [.] fastdecode_dispatch
18.96% benchmark benchmark [.] upb_pss_1bt
14.20% benchmark benchmark [.] upb_psv4_1bt
8.33% benchmark benchmark [.] upb_prm_1bt_max64b
6.66% benchmark benchmark [.] upb_prm_1bt_max128b
1.29% benchmark benchmark [.] upb_psm_1bt_max64b
0.77% benchmark benchmark [.] fastdecode_generic
0.55% benchmark [kernel.kallsyms] [k] smp_call_function_single
0.42% benchmark [kernel.kallsyms] [k] _raw_spin_lock_irqsave
0.42% benchmark benchmark [.] upb_psm_1bt_max256b
0.31% benchmark benchmark [.] upb_psb1_1bt
0.21% benchmark benchmark [.] upb_plv4_5bv
0.14% benchmark benchmark [.] upb_psb1_2bt
0.12% benchmark benchmark [.] decode_longvarint64
0.08% benchmark [kernel.kallsyms] [k] vsnprintf
0.07% benchmark [kernel.kallsyms] [k] _raw_spin_lock
0.07% benchmark benchmark [.] _upb_msg_new
0.06% benchmark ld-2.31.so [.] check_match
4 years ago
Joshua Haberman
52a0ed3891
Fixed a bug with tag number 15.
4 years ago
Joshua Haberman
9e5c5ce089
Optimized memset() with cutoff and fixed group & unknown message bugs.
4 years ago
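The commit does not describe the cutoff scheme. One common form of the optimization, sketched here as an assumption, clears a fixed number of bytes for small messages so the compiler can expand the `memset` inline, and falls back to a variable-size `memset` otherwise. The names `clear_msg` and `CLEAR_CUTOFF`, and the 64-byte value, are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum { CLEAR_CUTOFF = 64 }; /* hypothetical cutoff, in bytes */

/* Zeroes a freshly allocated message of `size` bytes.
 * Precondition (an assumption of this sketch): the allocation behind
 * `msg` is at least `capacity` bytes and capacity >= size, so when
 * capacity allows it the small case may clear a fixed CLEAR_CUTOFF bytes.
 * A constant-size memset is typically expanded inline by the compiler. */
static void clear_msg(void *msg, size_t size, size_t capacity) {
  assert(msg != NULL && size <= capacity);
  if (size <= CLEAR_CUTOFF && capacity >= CLEAR_CUTOFF) {
    memset(msg, 0, CLEAR_CUTOFF); /* fixed size: may overshoot `size` */
  } else {
    memset(msg, 0, size);         /* general path */
  }
}
```

The overshoot is only safe because the caller guarantees the extra bytes belong to the same allocation, which is easy to arrange when messages come from an arena with padded blocks.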
Joshua Haberman
88b1ec7784
Table-driven supports repeated sub-messages.
4 years ago
Joshua Haberman
f173642db4
Handle non-repeated submessages.
4 years ago
Joshua Haberman
7ec2c52346
Donate/steal from arena to accelerate decoding.
4 years ago
Joshua Haberman
fac992db83
Cleanup for showing.
4 years ago
Joshua Haberman
438ecaeb5a
Give all field parsers a generic table entry.
4 years ago
Joshua Haberman
a77ea639d5
Verify UTF-8 when parsing proto3 string fields.
5 years ago
Joshua Haberman
8f11ec57d2
Applied changes from google3.
5 years ago
Joshua Haberman
b717575cef
Added -Wextra and -Wshorten-64-to-32 and fixed resulting errors. (#289)
* Added -Wextra and -Wshorten-64-to-32 and fixed resulting errors.
* Disable -Wshorten-64-to-32 since Kokoro is missing Clang.
* Fixed -Wextra warnings for gcc.
* Reordered UPB_UNUSED() to come after declarations.
* Added another -pedantic fix and log CC version.
* Fix compile error and conditionally run use_bazel.sh.
* Moved set -e after use_bazel.sh.
* Fixed typo in conditional.
5 years ago
Joshua Haberman
634d37515c
Bugfix for oneofs and added line/col info to JSON.
5 years ago
Joshua Haberman
543a0ce8f2
Fixes for PHP. (#286)
- A new PHP-specific upb amalgamation. It contains everything related to upb_msg, but leaves out all of the old handlers-related interfaces and encoders/decoders.
# Schema/Defs Changes
- Changed `upb_fielddef_msgsubdef()` and `upb_fielddef_enumsubdef()` to return `NULL` instead of assert-failing if the field is not a message or enum.
- Added `upb_msgdef_iswrapper()`, to test whether this is a wrapper well-known type.
# Decoder
- Decoder bugfix: when we parse a submessage inside a oneof, we need to clear out any previous data, so we don't misinterpret it as a pointer to an existing submessage.
# JSON Decoder
- Allowed well-known types at the top level to have their special processing.
- Fixed a bug that could occur when parsing nested empty lists/objects, e.g. `[[]]`.
- Made the "ignore unknown" option also be permissive about unknown enumerators by setting them to 0.
# JSON Encoder
- Allowed well-known types at the top level to have their special processing.
- Removed all spaces after `:` and `,` characters, to match the old encoder and pass goldenfile tests.
# Message / Reflection
- Changed `upb_msg_hasoneof()` -> `upb_msg_whichoneof()`. The new function returns the `upb_fielddef*` of whichever oneof is set.
- Implemented `upb_msg_clearfield()` and added/implemented `upb_msg_clear()`.
- Added `upb_msg_discardunknown()`. Part of me thinks this should go in a util library instead of core reflection since it is a recursive algorithm.
# Compiler
- Always emit descriptors as an array instead of as a string, to avoid exceeding maximum string lengths. If this becomes a speed issue later we can go back to two separate paths.
5 years ago
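The Compiler note in this commit, emitting descriptors as an array rather than a string, can be illustrated with a toy payload. C99 only requires compilers to accept string literals of up to 4095 characters, while a brace-enclosed initializer is not subject to that limit. The bytes and the names `desc_as_string`, `desc_as_array`, and `forms_match` below are made up for illustration and are not real generated output.

```c
#include <assert.h>
#include <string.h>

/* String-literal form: compact to write, but a C99 compiler is only
 * required to accept literals of up to 4095 characters. The literal is
 * split so that "\x03" is not read as the longer hex escape "\x03f". */
static const char desc_as_string[] = "\x0a\x03" "foo";

/* Array form: one initializer per byte, so the string-literal limit
 * does not apply. Note there is no implicit trailing NUL here. */
static const char desc_as_array[] = {0x0a, 0x03, 'f', 'o', 'o'};

/* Both forms carry the same payload bytes. */
static int forms_match(void) {
  assert(sizeof desc_as_array == sizeof desc_as_string - 1);
  return memcmp(desc_as_string, desc_as_array, sizeof desc_as_array) == 0;
}
```

The array form produces larger source files and may compile more slowly, which matches the commit's remark about possibly returning to two separate paths if speed becomes an issue.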
Joshua Haberman
0842f88211
Support for proto3 optional. (#270)
* Added support for proto3 optional to defs.
* Added proto3 optional support. Untested at the moment.
* Changes to support proto3 optional.
* Fixed real oneof count for messages with no fields.
* Fixed compile error and test.
* Added comment about why I'm commenting out the assert.
5 years ago
Joshua Haberman
38a1045975
Added a has_foo() generated method for proto3 submessage fields. (#266)
This is better than checking against NULL, because in the future
unset fields will (probably) return a default instance instead of
NULL.
5 years ago
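The reasoning in this commit, that a dedicated presence check stays valid even if getters later return a default instance, can be sketched with hypothetical types. `Outer`, `Inner`, and the accessor names below are illustrative only; real generated code uses different names and a different in-memory layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int value; } Inner; /* hypothetical sub-message */
typedef struct {
  bool has_inner;     /* presence tracked separately from the pointer */
  const Inner *inner; /* NULL when unset in this sketch */
} Outer;              /* hypothetical message */

static const Inner kDefaultInner = {0};

/* Presence check that does not depend on what the getter returns. */
static bool Outer_has_inner(const Outer *msg) {
  assert(msg != NULL);
  return msg->has_inner;
}

/* A getter in the possible future style: never returns NULL. */
static const Inner *Outer_inner_or_default(const Outer *msg) {
  assert(msg != NULL);
  return msg->has_inner ? msg->inner : &kDefaultInner;
}
```

Code that tests `Outer_has_inner(msg)` keeps working under either getter style, whereas code that compares the getter's result against `NULL` would silently break once a default instance is returned.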