Joshua Haberman
|
970c645140
|
Fixes for google3 (layering check and formatting).
|
3 years ago |
Joshua Haberman
|
8405436044
|
Addressed PR comments.
|
3 years ago |
Joshua Haberman
|
c944638b9b
|
Changed benchmarks:benchmark to cc_test(), so it can be run with "bazel test".
|
3 years ago |
Joshua Haberman
|
6af01a26a0
|
Ran clang-format.
|
3 years ago |
Joshua Haberman
|
4c9891bf3e
|
Renamed LoadDefInit_NoLayout() to LoadDefInit_BuildLayout().
This will clarify that the function should go with the
WithLayout benchmarks.
|
3 years ago |
Joshua Haberman
|
205a7ea8b1
|
Added a variant of the def-loading benchmark that builds layout dynamically.
This will help us evaluate the performance impacts (if any) that the
new mini-table building code will have.
|
3 years ago |
Joshua Haberman
|
1c955f37ce
|
Mass API rename and clang-reformat (#485)
* Wave 1: upb_fielddef.
* upb_fielddef itself.
* upb_oneofdef.
* upb_msgdef.
* ExtensionRange.
* upb_enumdef.
* upb_enumvaldef.
* upb_filedef.
* upb_methoddef.
* upb_servicedef.
* upb_symtab.
* upb_defpool_init.
* upb_wellknown and upb_syntax_t
* Some constants.
* upb_status
* upb_strview
* upb_arena
* upb.h constants
* reflection
* encode
* JSON decode.
* JSON encode.
* msg_internal.
* Formatted with clang-format.
* Some naming fixups and comment reformatting.
* More refinements.
* A few more stragglers.
* Fixed PyObject_HEAD with semicolon. Removed TODO entries.
|
3 years ago |
Joshua Haberman
|
50978256b9
|
Properly byte-swap fixed packed fields.
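A minimal sketch of the idea, not upb's actual code: protobuf fixed32 values are little-endian on the wire, and a packed field stores them back to back, so a big-endian host that copies the payload verbatim must swap every element. The helper names below (`decode_packed_fixed32`, `swap32`) are hypothetical and exist only for illustration.

```python
import struct


def decode_packed_fixed32(payload):
    """Decode a packed fixed32 payload; '<' forces little-endian wire order."""
    if len(payload) % 4:
        raise ValueError("packed fixed32 payload must be a multiple of 4 bytes")
    count = len(payload) // 4
    return list(struct.unpack("<%dI" % count, payload))


def swap32(value):
    """Reverse the byte order of one unsigned 32-bit integer."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


# Three fixed32 values as they would appear on the wire.
wire = struct.pack("<3I", 1, 2, 0x01020304)

# What a big-endian host would see after a raw copy with no swap.
raw_big_endian_view = list(struct.unpack(">3I", wire))

# Swapping each element recovers the intended values.
fixed = [swap32(v) for v in raw_big_endian_view]
```

The point of the sketch is that the swap has to be applied per element across the whole packed run, not once per field.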
|
3 years ago |
Joshua Haberman
|
4abe724dde
|
A few more fixes.
|
3 years ago |
Joshua Haberman
|
7907ed913b
|
Expanded the test to cover packed fields also.
|
3 years ago |
Joshua Haberman
|
2e1502a637
|
Set benchmark baseline back to master.
|
3 years ago |
Joshua Haberman
|
77c0381013
|
Interleave benchmark results.
|
3 years ago |
Joshua Haberman
|
eabb77458a
|
Fixes to make upb's tests compatible with a minimal Docker container.
|
3 years ago |
Joshua Haberman
|
c5d6ec737e
|
Removed unnecessary dependency.
|
3 years ago |
Joshua Haberman
|
41bfbca375
|
Updated ads benchmark to v7 as v5 no longer exists upstream.
|
3 years ago |
Joshua Haberman
|
cdd6434a31
|
Introduced upb_extreg and plumbed it into decoder.
|
4 years ago |
Joshua Haberman
|
823eb09694
|
Update all 2011 dates to 2021.
|
4 years ago |
Joshua Haberman
|
e59d2c8fa7
|
Added license headers to all files.
|
4 years ago |
Joshua Haberman
|
c358829c76
|
Now that handlers are gone, cleaned up table to use arenas exclusively.
Also cleaned up some cruft from table.
|
4 years ago |
Joshua Haberman
|
a04627abc8
|
Added map sorting to binary and text encoders.
For the binary encoder, sorting is off by default.
For the text encoder, sorting is on by default.
Both defaults can be explicitly overridden.
This grows code size a bit. I think we could potentially
shave this (and other map-related code size) by having
the generated code inject a function pointer to the map-related
parsing/serialization code if maps are present.
   FILE SIZE        VM SIZE
--------------  --------------
  +86% +1.07Ki    +71%    +768    upb/msg.c
 [NEW]    +391   [NEW]    +344        _upb_mapsorter_pushmap
 [NEW]    +158   [NEW]    +112        _upb_mapsorter_cmpstr
 [NEW]    +111   [NEW]     +64        _upb_mapsorter_cmpbool
 [NEW]    +110   [NEW]     +64        _upb_mapsorter_cmpi32
 [NEW]    +110   [NEW]     +64        _upb_mapsorter_cmpi64
 [NEW]    +110   [NEW]     +64        _upb_mapsorter_cmpu32
 [NEW]    +110   [NEW]     +64        _upb_mapsorter_cmpu64
 -3.6%      -8   -4.3%      -8        _upb_map_new
 +9.5%    +464   +9.2%    +424    upb/text_encode.c
 [NEW]    +656   [NEW]    +616        txtenc_mapentry
  +15%     +32    +20%     +32        upb_text_encode
-20.1%    -224  -20.7%    -224        txtenc_msg
 +5.7%    +342   +5.3%    +296    upb/encode.c
 [NEW]    +344   [NEW]    +304        encode_mapentry
 [NEW]    +246   [NEW]    +208        upb_encode_ex
 [NEW]     +41   [NEW]     +16        upb_encode_ex.ch
 +0.7%      +8   +0.7%      +8        encode_scalar
 -1.0%     -32   -1.0%     -32        encode_message
 [DEL]     -38   [DEL]     -16        upb_encode.ch
 [DEL]    -227   [DEL]    -192        upb_encode
 +2.0%    +152   +2.2%    +152    upb/decode.c
  +44%    +128    +44%    +128        [section .rodata]
 +3.4%     +24   +3.4%     +24        _GLOBAL_OFFSET_TABLE_
 +0.6%    +107   +0.3%     +48    upb/def.c
 [NEW]    +100   [NEW]     +48        upb_fielddef_descriptortype
 +7.1%      +7   [ = ]       0        upb_fielddef_defaultint32
 +2.9%     +24   +2.9%     +24    [section .dynsym]
 +1.2%     +24   [ = ]       0    [section .symtab]
 +3.2%     +16   +3.2%     +16    [section .plt]
 [NEW]     +16   [NEW]     +16        memcmp@plt
 +0.5%     +16   +0.6%     +16    tests/conformance_upb.c
 +1.5%     +16   +1.6%     +16        DoTestIo
 +0.1%     +16   +0.1%     +16    upb/json_decode.c
 +0.4%     +16   +0.4%     +16        jsondec_wellknown
 +3.0%      +8   +3.0%      +8    [section .got.plt]
 +3.0%      +8   +3.0%      +8        _GLOBAL_OFFSET_TABLE_
 +1.6%      +7   +1.6%      +7    [section .dynstr]
 +1.8%      +4   +1.8%      +4    [section .hash]
 +0.5%      +3   +0.5%      +3    [LOAD #2 [RX]]
 +2.8%      +2   +2.8%      +2    [section .gnu.version]
-60.0% -1.74Ki   [ = ]       0    [Unmapped]
 +0.3%    +496   +1.4% +1.74Ki    TOTAL
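To illustrate what map sorting buys, here is a rough sketch of a text-style encoder with a sort switch. It is not upb's API: `text_encode_map` and its `sort_keys` parameter are hypothetical names, and the output syntax only approximates protobuf text format.

```python
import json


def text_encode_map(field_name, entries, sort_keys=True):
    """Render map entries one per line, optionally ordered by key."""
    items = list(entries.items())
    if sort_keys:
        # Sorting by key makes the output independent of insertion order.
        items.sort(key=lambda kv: kv[0])
    return "\n".join(
        "{} {{ key: {} value: {} }}".format(field_name, json.dumps(key), value)
        for key, value in items
    )


sorted_out = text_encode_map("m", {"b": 2, "a": 1})
unsorted_out = text_encode_map("m", {"b": 2, "a": 1}, sort_keys=False)
```

With sorting on, two maps holding the same entries always serialize identically, which is what makes text output diffable. Skipping the sort avoids the extra work, which matches the message's choice of off-by-default for the binary encoder.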
|
4 years ago |
Joshua Haberman
|
9abf8e043f
|
Clamp 32-bit varints to 5 bytes to fix a fuzz failure.
|
4 years ago |
Joshua Haberman
|
9c87f1168f
|
Added size benchmark for CODE_SIZE.
|
4 years ago |
Joshua Haberman
|
358fa14d0e
|
Fixed headers and updated benchmark script.
|
4 years ago |
Joshua Haberman
|
378a27b640
|
Force "size" to run locally.
|
4 years ago |
Joshua Haberman
|
da48e01f05
|
More google3 fixes.
|
4 years ago |
Joshua Haberman
|
d2446fd2db
|
Moved cc_api_version attribute to proto_library().
|
4 years ago |
Joshua Haberman
|
4a84390c89
|
Added cc_proto_library() tweaks for google3.
|
4 years ago |
Joshua Haberman
|
86f671d5fd
|
Fix for Darwin (output is different, but it won't error out).
|
4 years ago |
Joshua Haberman
|
165e01ec6f
|
Fix for old Python versions.
|
4 years ago |
Joshua Haberman
|
65d166a6ba
|
Added API for copy vs. alias and added benchmarks to test both.
Benchmark output:
$ bazel-bin/benchmarks/benchmark '--benchmark_filter=BM_Parse'
2020-11-11 15:39:04
Running bazel-bin/benchmarks/benchmark
Run on (72 X 3700 MHz CPU s)
CPU Caches:
L1 Data 32K (x36)
L1 Instruction 32K (x36)
L2 Unified 1024K (x36)
L3 Unified 25344K (x2)
-------------------------------------------------------------------------------------
Benchmark                                           Time       CPU  Iterations
-------------------------------------------------------------------------------------
BM_Parse_Upb_FileDesc<UseArena, Copy>            4134 ns   4134 ns      168714  1.69152GB/s
BM_Parse_Upb_FileDesc<UseArena, Alias>           3487 ns   3487 ns      199509  2.00526GB/s
BM_Parse_Upb_FileDesc<InitBlock, Copy>           3727 ns   3726 ns      187581  1.87643GB/s
BM_Parse_Upb_FileDesc<InitBlock, Alias>          3110 ns   3110 ns      224970  2.24866GB/s
BM_Parse_Proto2<FileDesc, NoArena, Copy>        31132 ns  31132 ns       22437  229.995MB/s
BM_Parse_Proto2<FileDesc, UseArena, Copy>       21011 ns  21009 ns       33922  340.812MB/s
BM_Parse_Proto2<FileDesc, InitBlock, Copy>      17976 ns  17975 ns       38808  398.337MB/s
BM_Parse_Proto2<FileDescSV, InitBlock, Alias>   17357 ns  17356 ns       40244  412.539MB/s
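The copy/alias distinction can be sketched as follows. This is an illustration under assumptions, not the API the commit added: `read_bytes_field` and its `alias` flag are hypothetical. Copying gives the parsed value its own storage; aliasing returns a view into the input buffer, which skips the copy but ties the value's lifetime and contents to that buffer.

```python
def read_bytes_field(buf, start, length, alias=False):
    """Return a string field's bytes either as a copy or as a view into buf."""
    view = memoryview(buf)[start:start + length]
    if alias:
        return view       # shares storage with buf; no bytes are copied
    return bytes(view)    # independent copy that outlives changes to buf


# A length-prefixed payload: one length byte followed by the data.
buf = bytearray(b"\x05hello world")
copied = read_bytes_field(buf, 1, 5)
aliased = read_bytes_field(buf, 1, 5, alias=True)

# Mutating the input is visible through the alias but not through the copy.
buf[1] = ord("H")
```

The benchmark rows above show the same trade-off in numbers: the Alias variants run faster than their Copy counterparts.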
|
4 years ago |
Joshua Haberman
|
881ddac7fe
|
Also use .format() for gen_synthetic_protos.py.
|
4 years ago |
Joshua Haberman
|
8b7dabe1a2
|
Use format() instead of string interpolation, for old Python versions.
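A small sketch of the portability point, assuming "string interpolation" here means f-strings, which need Python 3.6 or newer. `str.format()` works on older interpreters as well. The `field_decl` helper and the generated line are hypothetical and are not taken from the actual script.

```python
def field_decl(index):
    """Build one synthetic proto field declaration using str.format()."""
    # Equivalent f-string (Python 3.6+ only):
    #   f"optional int32 field{index} = {index};"
    return "optional int32 field{0} = {0};".format(index)


decls = [field_decl(i) for i in range(1, 4)]
```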
|
4 years ago |
Joshua Haberman
|
8e08282c3b
|
Removed unused small.proto.
|
4 years ago |
Joshua Haberman
|
0f79d47215
|
Added missing lite binaries to size_data.txt.
|
4 years ago |
Joshua Haberman
|
555fbbc0bc
|
Size benchmarks are working pretty well.
|
4 years ago |
Joshua Haberman
|
e5bdfba92c
|
Removed accidentally-added .orig file.
|
4 years ago |
Joshua Haberman
|
1eb7bd39e7
|
Some formatting fixes.
|
4 years ago |
Joshua Haberman
|
4bd34da105
|
WIP.
|
4 years ago |
Joshua Haberman
|
7b4e376f79
|
Switch unordered_set -> absl::flat_hash_set.
|
4 years ago |
Joshua Haberman
|
fe62fc83e1
|
Removed obsolete includes in benchmark.
|
4 years ago |
Joshua Haberman
|
5b1f0d86a1
|
For Kokoro, only build/test -m32 on Linux.
Also fixed a bunch of bugs found by gcc's -fanalyzer.
|
4 years ago |
Joshua Haberman
|
64abb5eb11
|
Amalgamation no longer bundles wyhash, but #includes it.
Also fixed a few spelling mistakes.
|
4 years ago |
Joshua Haberman
|
5ec1d39224
|
Avoid building .pb.cc for ads protos, as C++ takes forever to compile.
|
4 years ago |
Joshua Haberman
|
c3b5637646
|
Added benchmark for loading ads descriptor.
Generally this seems to track the speed of loading descriptor.proto.
----------------------------------------------------------------------------------------------------
Benchmark                           Time          CPU  Iterations
----------------------------------------------------------------------------------------------------
BM_LoadDescriptor_Upb           59091 ns     59086 ns       11747  121.182MB/s
BM_LoadAdsDescriptor_Upb      4218587 ns   4218582 ns         166  120.544MB/s
BM_LoadDescriptor_Proto2       241083 ns    241049 ns        2903  29.7043MB/s
BM_LoadAdsDescriptor_Proto2  13442631 ns  13442099 ns          52  34.8975MB/s
|
4 years ago |
Joshua Haberman
|
723cd8ffc1
|
Added wyhash code and LICENSE, and removed temporary benchmark.
|
4 years ago |
Joshua Haberman
|
154f2c25f4
|
Added UTF-8 validation for proto3 string fields.
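For context, proto3 requires `string` fields to hold valid UTF-8, so a decoder has to reject malformed byte sequences. The sketch below shows the check using Python's strict decoder. It says nothing about how upb implements validation, and `is_valid_utf8` is a hypothetical helper.

```python
def is_valid_utf8(data):
    """Return True if data is well-formed UTF-8 (strict decoding)."""
    try:
        data.decode("utf-8")  # errors="strict" is the default
    except UnicodeDecodeError:
        return False
    return True


ok = is_valid_utf8("héllo".encode("utf-8"))
bad_byte = is_valid_utf8(b"\xff\xfe")
overlong = is_valid_utf8(b"\xc0\x80")  # overlong encoding of U+0000
```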
|
4 years ago |
Joshua Haberman
|
d81ba58215
|
Optimized short string copying.
This sped up the alias=false case:
Before:
------------------------------------------------------------------------------
Benchmark                                   Time       CPU  Iterations
------------------------------------------------------------------------------
BM_Parse_Upb_FileDesc_WithInitialBlock   4562 ns   4562 ns      153251  1.53276GB/s
Performance counter stats for 'bazel-bin/benchmarks/benchmark --benchmark_filter=BM_Parse_Upb_FileDesc_WithInitialBlock':
        1,216.65 msec task-clock        #    0.936 CPUs utilized
               6      context-switches  #    0.005 K/sec
               0      cpu-migrations    #    0.000 K/sec
             200      page-faults       #    0.164 K/sec
   4,490,925,650      cycles            #    3.691 GHz
  16,516,403,731      instructions      #     3.68 insn per cycle
   2,828,536,650      branches          # 2324.861 M/sec
       5,425,830      branch-misses     #    0.19% of all branches
     1.300178903 seconds time elapsed
     1.211475000 seconds user
     0.072207000 seconds sys
After:
------------------------------------------------------------------------------
Benchmark                                   Time       CPU  Iterations
------------------------------------------------------------------------------
BM_Parse_Upb_FileDesc_WithInitialBlock   3587 ns   3587 ns      195749  1.94935GB/s
Performance counter stats for 'bazel-bin/benchmarks/benchmark --benchmark_filter=BM_Parse_Upb_FileDesc_WithInitialBlock':
        1,109.69 msec task-clock        #    0.930 CPUs utilized
               5      context-switches  #    0.005 K/sec
               0      cpu-migrations    #    0.000 K/sec
             198      page-faults       #    0.178 K/sec
   4,094,010,257      cycles            #    3.689 GHz
  15,672,677,812      instructions      #     3.83 insn per cycle
   2,589,291,160      branches          # 2333.346 M/sec
       3,306,386      branch-misses     #    0.13% of all branches
     1.193221789 seconds time elapsed
     1.102538000 seconds user
     0.072166000 seconds sys
|
4 years ago |
Joshua Haberman
|
8a3470c543
|
WIP.
|
4 years ago |
Joshua Haberman
|
a7e2e8338d
|
Fixed benchmark script.
|
4 years ago |
Joshua Haberman
|
2c1664906a
|
Removed license comments and upb_amalgamation for google3.
|
4 years ago |