Joshua Haberman
a04627abc8
Added map sorting to binary and text encoders.
...
For the binary encoder, sorting is off by default.
For the text encoder, sorting is on by default.
Both defaults can be explicitly overridden.
This grows code size a bit. I think we could potentially
shave this (and other map-related code size) by having
the generated code inject a function pointer to the map-related
parsing/serialization code if maps are present.
  FILE SIZE        VM SIZE
--------------  --------------
  +86% +1.07Ki    +71%    +768    upb/msg.c
 [NEW]    +391   [NEW]    +344        _upb_mapsorter_pushmap
 [NEW]    +158   [NEW]    +112        _upb_mapsorter_cmpstr
 [NEW]    +111   [NEW]     +64        _upb_mapsorter_cmpbool
 [NEW]    +110   [NEW]     +64        _upb_mapsorter_cmpi32
 [NEW]    +110   [NEW]     +64        _upb_mapsorter_cmpi64
 [NEW]    +110   [NEW]     +64        _upb_mapsorter_cmpu32
 [NEW]    +110   [NEW]     +64        _upb_mapsorter_cmpu64
 -3.6%      -8   -4.3%      -8        _upb_map_new
 +9.5%    +464   +9.2%    +424    upb/text_encode.c
 [NEW]    +656   [NEW]    +616        txtenc_mapentry
  +15%     +32    +20%     +32        upb_text_encode
-20.1%    -224  -20.7%    -224        txtenc_msg
 +5.7%    +342   +5.3%    +296    upb/encode.c
 [NEW]    +344   [NEW]    +304        encode_mapentry
 [NEW]    +246   [NEW]    +208        upb_encode_ex
 [NEW]     +41   [NEW]     +16        upb_encode_ex.ch
 +0.7%      +8   +0.7%      +8        encode_scalar
 -1.0%     -32   -1.0%     -32        encode_message
 [DEL]     -38   [DEL]     -16        upb_encode.ch
 [DEL]    -227   [DEL]    -192        upb_encode
 +2.0%    +152   +2.2%    +152    upb/decode.c
  +44%    +128    +44%    +128        [section .rodata]
 +3.4%     +24   +3.4%     +24        _GLOBAL_OFFSET_TABLE_
 +0.6%    +107   +0.3%     +48    upb/def.c
 [NEW]    +100   [NEW]     +48        upb_fielddef_descriptortype
 +7.1%      +7   [ = ]       0        upb_fielddef_defaultint32
 +2.9%     +24   +2.9%     +24    [section .dynsym]
 +1.2%     +24   [ = ]       0    [section .symtab]
 +3.2%     +16   +3.2%     +16    [section .plt]
 [NEW]     +16   [NEW]     +16        memcmp@plt
 +0.5%     +16   +0.6%     +16    tests/conformance_upb.c
 +1.5%     +16   +1.6%     +16        DoTestIo
 +0.1%     +16   +0.1%     +16    upb/json_decode.c
 +0.4%     +16   +0.4%     +16        jsondec_wellknown
 +3.0%      +8   +3.0%      +8    [section .got.plt]
 +3.0%      +8   +3.0%      +8        _GLOBAL_OFFSET_TABLE_
 +1.6%      +7   +1.6%      +7    [section .dynstr]
 +1.8%      +4   +1.8%      +4    [section .hash]
 +0.5%      +3   +0.5%      +3    [LOAD #2 [RX]]
 +2.8%      +2   +2.8%      +2    [section .gnu.version]
-60.0% -1.74Ki   [ = ]       0    [Unmapped]
 +0.3%    +496   +1.4% +1.74Ki    TOTAL
4 years ago
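A minimal sketch of the idea behind the map-sorting commit above: hash maps iterate in an arbitrary order, so an encoder that wants stable output sorts the entries by key before emitting them. The names `entry_t`, `cmp_i32`, `order_entries`, and `ENCODE_SORT_MAPS` are hypothetical and are not upb's API. The size table lists one comparator per key type, and only an int32 comparator is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical map entry: an int32 key and an int32 value. */
typedef struct {
  int32_t key;
  int32_t val;
} entry_t;

/* Hypothetical encoder option: sort map entries only when requested. */
enum { ENCODE_SORT_MAPS = 1 };

/* qsort() comparator on the key. Uses comparisons rather than
 * subtraction so that it cannot overflow. */
static int cmp_i32(const void *a, const void *b) {
  int32_t x = ((const entry_t *)a)->key;
  int32_t y = ((const entry_t *)b)->key;
  return (x > y) - (x < y);
}

/* Puts the entries into emission order. When ENCODE_SORT_MAPS is not
 * set, the existing (hash table) order is left untouched. */
static void order_entries(entry_t *entries, size_t n, int options) {
  assert(entries != NULL || n == 0);
  if ((options & ENCODE_SORT_MAPS) && n > 1) {
    qsort(entries, n, sizeof(*entries), cmp_i32);
  }
}
```

Under this sketch, a binary encoder would pass `0` unless the caller opts in, and a text encoder would pass `ENCODE_SORT_MAPS` unless the caller opts out, which matches the defaults described in the commit message.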
Joshua Haberman
9abf8e043f
Clamp 32-bit varints to 5 bytes to fix a fuzz failure.
4 years ago
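The commit above does not say where the clamp was applied. As an illustration of the underlying constraint only, the sketch below decodes a varint into a 32-bit value and refuses to read more than 5 bytes, since 32 bits fit in at most five 7-bit groups. `decode_varint32` is a hypothetical helper, not a upb function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes a little-endian base-128 varint into a 32-bit value.
 * Reads at most 5 bytes. Returns the number of bytes consumed, or 0
 * if the input is truncated or the varint is longer than 5 bytes. */
static size_t decode_varint32(const uint8_t *buf, size_t len, uint32_t *out) {
  assert(buf != NULL && out != NULL);
  uint32_t val = 0;
  size_t max = len < 5 ? len : 5;
  for (size_t i = 0; i < max; i++) {
    /* Low 7 bits carry payload; bits shifted past bit 31 are dropped. */
    val |= (uint32_t)(buf[i] & 0x7f) << (7 * i);
    if ((buf[i] & 0x80) == 0) { /* Continuation bit clear: last byte. */
      *out = val;
      return i + 1;
    }
  }
  return 0;
}
```

Bounding the loop this way means malformed input with a long run of continuation bytes cannot drive the shift count past the width of the result type.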
Joshua Haberman
9c87f1168f
Added size benchmark for CODE_SIZE.
4 years ago
Joshua Haberman
358fa14d0e
Fixed headers and updated benchmark script.
4 years ago
Joshua Haberman
378a27b640
Force "size" to run locally.
4 years ago
Joshua Haberman
da48e01f05
More google3 fixes.
4 years ago
Joshua Haberman
d2446fd2db
Moved cc_api_version attribute to proto_library().
4 years ago
Joshua Haberman
4a84390c89
Added cc_proto_library() tweaks for google3.
4 years ago
Joshua Haberman
86f671d5fd
Fix for Darwin (output is different, but it won't error out).
4 years ago
Joshua Haberman
165e01ec6f
Fix for old Python versions.
4 years ago
Joshua Haberman
65d166a6ba
Added API for copy vs. alias and added benchmarks to test both.
...
Benchmark output:
$ bazel-bin/benchmarks/benchmark '--benchmark_filter=BM_Parse'
2020-11-11 15:39:04
Running bazel-bin/benchmarks/benchmark
Run on (72 X 3700 MHz CPU s)
CPU Caches:
L1 Data 32K (x36)
L1 Instruction 32K (x36)
L2 Unified 1024K (x36)
L3 Unified 25344K (x2)
-------------------------------------------------------------------------------------
Benchmark                                          Time       CPU  Iterations
-------------------------------------------------------------------------------------
BM_Parse_Upb_FileDesc<UseArena, Copy>           4134 ns   4134 ns      168714  1.69152GB/s
BM_Parse_Upb_FileDesc<UseArena, Alias>          3487 ns   3487 ns      199509  2.00526GB/s
BM_Parse_Upb_FileDesc<InitBlock, Copy>          3727 ns   3726 ns      187581  1.87643GB/s
BM_Parse_Upb_FileDesc<InitBlock, Alias>         3110 ns   3110 ns      224970  2.24866GB/s
BM_Parse_Proto2<FileDesc, NoArena, Copy>       31132 ns  31132 ns       22437  229.995MB/s
BM_Parse_Proto2<FileDesc, UseArena, Copy>      21011 ns  21009 ns       33922  340.812MB/s
BM_Parse_Proto2<FileDesc, InitBlock, Copy>     17976 ns  17975 ns       38808  398.337MB/s
BM_Parse_Proto2<FileDescSV, InitBlock, Alias>  17357 ns  17356 ns       40244  412.539MB/s
4 years ago
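The difference between the Copy and Alias rows above can be sketched as follows. When aliasing, a parsed string field points directly into the input buffer, so the input must outlive the message. When copying, the bytes are duplicated into arena memory. All names here (`strview_t`, `arena_t`, `arena_alloc`, `read_string`) are hypothetical stand-ins and are not the API the commit added.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical non-owning string view. */
typedef struct {
  const char *data;
  size_t size;
} strview_t;

/* Tiny bump allocator standing in for an arena. */
typedef struct {
  char *ptr;
  char *end;
} arena_t;

static void *arena_alloc(arena_t *a, size_t n) {
  if ((size_t)(a->end - a->ptr) < n) return NULL;
  void *ret = a->ptr;
  a->ptr += n;
  return ret;
}

/* Stores an n-byte string field read from `in` into `out`.
 * alias != 0: point into the input buffer (no copy; the caller must
 *             keep the input alive as long as the message).
 * alias == 0: copy the bytes into the arena.
 * Returns 1 on success, 0 if the arena is out of space. */
static int read_string(const char *in, size_t n, int alias, arena_t *a,
                       strview_t *out) {
  assert(in != NULL && a != NULL && out != NULL);
  if (alias) {
    out->data = in;
  } else {
    char *copy = arena_alloc(a, n);
    if (copy == NULL) return 0;
    memcpy(copy, in, n);
    out->data = copy;
  }
  out->size = n;
  return 1;
}
```

Aliasing skips both the allocation and the `memcpy`, which is consistent with the Alias rows being faster in the benchmark output, at the cost of tying the message's lifetime to the input buffer.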
Joshua Haberman
881ddac7fe
Also use .format() for gen_synthetic_protos.py.
4 years ago
Joshua Haberman
8b7dabe1a2
Use format() instead of string interpolation, for old Python versions.
4 years ago
Joshua Haberman
8e08282c3b
Removed unused small.proto.
4 years ago
Joshua Haberman
0f79d47215
Added missing lite binaries to size_data.txt.
4 years ago
Joshua Haberman
555fbbc0bc
Size benchmarks are working pretty well.
4 years ago
Joshua Haberman
e5bdfba92c
Removed accidentally-added .orig file.
4 years ago
Joshua Haberman
1eb7bd39e7
Some formatting fixes.
4 years ago
Joshua Haberman
4bd34da105
WIP.
4 years ago
Joshua Haberman
7b4e376f79
Switch unordered_set -> absl::flat_hash_set.
4 years ago
Joshua Haberman
fe62fc83e1
Removed obsolete includes in benchmark.
4 years ago
Joshua Haberman
5b1f0d86a1
For Kokoro, only build/test -m32 on Linux.
...
Also fixed a bunch of bugs found by gcc's -fanalyzer.
4 years ago
Joshua Haberman
64abb5eb11
Amalgamation no longer bundles wyhash, but #includes it.
...
Also fixed a few spelling mistakes.
4 years ago
Joshua Haberman
5ec1d39224
Avoid building .pb.cc for ads protos, as C++ takes forever to compile.
4 years ago
Joshua Haberman
c3b5637646
Added benchmark for loading ads descriptor.
...
Generally this seems to track the speed of loading descriptor.proto.
----------------------------------------------------------------------------------------------------
Benchmark                           Time          CPU  Iterations
----------------------------------------------------------------------------------------------------
BM_LoadDescriptor_Upb           59091 ns     59086 ns       11747  121.182MB/s
BM_LoadAdsDescriptor_Upb      4218587 ns   4218582 ns         166  120.544MB/s
BM_LoadDescriptor_Proto2       241083 ns    241049 ns        2903  29.7043MB/s
BM_LoadAdsDescriptor_Proto2  13442631 ns  13442099 ns          52  34.8975MB/s
4 years ago
Joshua Haberman
723cd8ffc1
Added wyhash code and LICENSE, and removed temporary benchmark.
4 years ago
Joshua Haberman
154f2c25f4
Added UTF-8 validation for proto3 string fields.
4 years ago
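proto3 requires `string` fields to contain valid UTF-8. The commit above does not show how upb performs the check, so the following is a plain, unoptimized validator written only to illustrate what such a check has to reject: stray continuation bytes, truncated sequences, overlong encodings, surrogates, and code points above U+10FFFF. `utf8_valid` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if p[0..len) is well-formed UTF-8. */
static bool utf8_valid(const uint8_t *p, size_t len) {
  assert(p != NULL || len == 0);
  size_t i = 0;
  while (i < len) {
    uint8_t b = p[i];
    size_t n;     /* Number of continuation bytes expected. */
    uint32_t cp;  /* Code point being assembled. */
    uint32_t min; /* Smallest code point allowed for this length. */
    if (b < 0x80) {
      i++;
      continue;
    } else if ((b & 0xE0) == 0xC0) {
      n = 1; cp = b & 0x1F; min = 0x80;
    } else if ((b & 0xF0) == 0xE0) {
      n = 2; cp = b & 0x0F; min = 0x800;
    } else if ((b & 0xF8) == 0xF0) {
      n = 3; cp = b & 0x07; min = 0x10000;
    } else {
      return false; /* Stray continuation byte or invalid lead byte. */
    }
    if (len - i <= n) return false; /* Truncated sequence. */
    for (size_t k = 1; k <= n; k++) {
      if ((p[i + k] & 0xC0) != 0x80) return false;
      cp = (cp << 6) | (uint32_t)(p[i + k] & 0x3F);
    }
    if (cp < min) return false;                     /* Overlong encoding. */
    if (cp > 0x10FFFF) return false;                /* Beyond Unicode. */
    if (cp >= 0xD800 && cp <= 0xDFFF) return false; /* Surrogate. */
    i += n + 1;
  }
  return true;
}
```

A decoder would run a check like this on each proto3 string field and fail the parse when it returns false.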
Joshua Haberman
d81ba58215
Optimized short string copying.
...
This sped up the alias=false case:
Before:
------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------
BM_Parse_Upb_FileDesc_WithInitialBlock 4562 ns 4562 ns 153251 1.53276GB/s
Performance counter stats for 'bazel-bin/benchmarks/benchmark --benchmark_filter=BM_Parse_Upb_FileDesc_WithInitialBlock':
1,216.65 msec task-clock # 0.936 CPUs utilized
6 context-switches # 0.005 K/sec
0 cpu-migrations # 0.000 K/sec
200 page-faults # 0.164 K/sec
4,490,925,650 cycles # 3.691 GHz
16,516,403,731 instructions # 3.68 insn per cycle
2,828,536,650 branches # 2324.861 M/sec
5,425,830 branch-misses # 0.19% of all branches
1.300178903 seconds time elapsed
1.211475000 seconds user
0.072207000 seconds sys
After:
------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------
BM_Parse_Upb_FileDesc_WithInitialBlock 3587 ns 3587 ns 195749 1.94935GB/s
Performance counter stats for 'bazel-bin/benchmarks/benchmark --benchmark_filter=BM_Parse_Upb_FileDesc_WithInitialBlock':
1,109.69 msec task-clock # 0.930 CPUs utilized
5 context-switches # 0.005 K/sec
0 cpu-migrations # 0.000 K/sec
198 page-faults # 0.178 K/sec
4,094,010,257 cycles # 3.689 GHz
15,672,677,812 instructions # 3.83 insn per cycle
2,589,291,160 branches # 2333.346 M/sec
3,306,386 branch-misses # 0.13% of all branches
1.193221789 seconds time elapsed
1.102538000 seconds user
0.072166000 seconds sys
4 years ago
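The commit above does not describe the optimization itself. One common way to speed up short copies, shown here purely as an assumption about the general technique, is to replace a variable-length `memcpy` with a fixed-size one when the string is short and both buffers are known to have enough slack. `SLOP` and `copy_short` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical guarantee: at least SLOP bytes are readable at src and
 * writable at dst, regardless of the string's real length. */
enum { SLOP = 16 };

/* Copies n bytes from src to dst. For n <= SLOP a constant-size copy
 * is used, which compilers typically turn into a few wide loads and
 * stores instead of a call with a runtime length. Bytes past n in dst
 * are overwritten with whatever follows the string in src, so dst must
 * not hold live data there. The buffers must not overlap. */
static void copy_short(char *dst, const char *src, size_t n) {
  assert(dst != NULL && src != NULL);
  if (n <= SLOP) {
    memcpy(dst, src, SLOP);
  } else {
    memcpy(dst, src, n);
  }
}
```

The trade-off is that the caller has to maintain the padding invariant on both sides; without it, the fixed-size copy would read or write out of bounds.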
Joshua Haberman
8a3470c543
WIP.
4 years ago
Joshua Haberman
a7e2e8338d
Fixed benchmark script.
4 years ago
Joshua Haberman
2c1664906a
Removed license comments and upb_amalgamation for google3.
4 years ago
Joshua Haberman
b7dc77415a
Added licenses() to all BUILD files.
4 years ago
Joshua Haberman
e3f41de6c7
Split monolithic BUILD file into many build files.
4 years ago
Josh Haberman
56913be6bb
Removed obsolete benchmarks/ and examples/ directories.
10 years ago
Josh Haberman
0fd2f83088
Sync to internal Google development.
11 years ago
Josh Haberman
26d98ca94f
Merge from Google-internal development:
...
- rewritten decoder; interpreted decoder is bytecode-based,
JIT decoder no longer falls back to the interpreter.
- C++ improvements: C++11-compatible iterators, upb::reffed_ptr
for RAII refcounting, better upcast/downcast support.
- removed the gross upb_value abstraction from public upb.h.
11 years ago
Josh Haberman
bada1e94f4
Merge from Google-internal development.
...
- Better error reporting for upb::Def setters.
- error reporting for upb::Handlers setters.
- made the start/endmsg handlers a little less special-cased.
12 years ago
Josh Haberman
ee3a3191cd
Updated benchmarks to new API.
12 years ago
Joshua Haberman
622481990b
Updated benchmarks to new APIs.
12 years ago
Josh Haberman
7d3e2bd2c4
Sync with 8 months of Google-internal development.
...
Many things have changed and been simplified.
The memory-management story for upb_def and upb_handlers
is much more robust; upb_def and upb_handlers should be
fairly stable interfaces now. There is still much work
to do for the runtime component (upb_sink).
12 years ago
Joshua Haberman
cca4818eb7
Sync from internal Google development.
13 years ago
Joshua Haberman
86bad61b76
Sync from internal Google development.
...
Many improvements, too many to mention. One significant
perf regression warrants investigation:
omitfp.parsetoproto2_googlemessage1.upb_jit: 343 -> 252 (-26.53)
plain.parsetoproto2_googlemessage1.upb_jit: 334 -> 251 (-24.85)
A 25% regression for this benchmark is bad, but since I don't think
any fundamental design issue caused it, I'm going to
go ahead with the commit anyway. I can investigate and fix it later.
Other benchmarks were neutral or showed slight improvement.
13 years ago
Joshua Haberman
1b9b6bd1ad
Fixed the open-source build.
13 years ago
Joshua Haberman
1bcab1377d
Sync with internal Google development.
...
This breaks the open-source build; I will
follow up with a change to fix it.
13 years ago
Joshua Haberman
b5f5ee867e
Refinement of upb_bytesrc interface.
...
Added a upb_byteregion that tracks a region of
the input buffer; decoders use this instead of
using a upb_bytesrc directly. upb_byteregion
is also used as the way of passing a string to
a upb_handlers callback. This symmetry makes
decoders compose better; if you want to take
a parsed string and decode it as something else,
you can take the string directly from the callback
and feed it as input to another parser.
A commented-out version of a pinning interface
is present; I decline to actually implement it
(and accept its extra complexity) until/unless
it is clear that it is actually a win. But it
is included as a proof-of-concept, to show that
it fits well with the existing interface.
13 years ago
Joshua Haberman
621c0cdcb5
Const invasion: large parts of upb made const-correct.
13 years ago
Joshua Haberman
4a8b9be46c
Header cleanup, clarify/correct comments for interfaces.
13 years ago
Joshua Haberman
521ac7a89a
Refined upb_status.
13 years ago
Joshua Haberman
e8796beffc
Add comment clarifying that the proto2 benchmark is ugly and temporary.
13 years ago
Joshua Haberman
adb6580d97
Let the JIT emit hasbit-setting code in addition to calling a callback.
...
This leads to a major (20-40%) improvement in the parsetoproto2
benchmark with small messages. We now are faster than proto2 in all
apples-to-apples comparisons, at least given the (admittedly
limited) set of benchmarks in this source tree.
13 years ago
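For readers unfamiliar with hasbits: a hasbit is one bit per field, stored in the message, that records whether the field was present in the input. The sketch below shows the store in plain C. It does not reproduce the JIT, which per the commit above emits the equivalent machine code inline in addition to calling a callback. `msg_t`, `set_hasbit`, `has_field`, and `store_id` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical message layout: a hasbit array followed by one field. */
typedef struct {
  uint8_t hasbits[4];
  int32_t id;
} msg_t;

/* Marks field number `idx` (0-based hasbit index) as present. */
static void set_hasbit(msg_t *m, unsigned idx) {
  assert(idx < 8 * sizeof(m->hasbits));
  m->hasbits[idx / 8] |= (uint8_t)(1u << (idx % 8));
}

/* Returns true if field `idx` has been marked present. */
static bool has_field(const msg_t *m, unsigned idx) {
  assert(idx < 8 * sizeof(m->hasbits));
  return ((m->hasbits[idx / 8] >> (idx % 8)) & 1u) != 0;
}

/* What a value callback would do for the `id` field: store the value
 * and record presence. Here `id` is assumed to use hasbit index 0. */
static void store_id(msg_t *m, int32_t v) {
  m->id = v;
  set_hasbit(m, 0);
}
```

Because the hasbit write is a single OR against a known byte offset, it is cheap to emit directly in generated code.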