Joshua Haberman
e59d2c8fa7
Added license headers to all files.
4 years ago
Joshua Haberman
a04627abc8
Added map sorting to binary and text encoders.
...
For the binary encoder, sorting is off by default.
For the text encoder, sorting is on by default.
Both defaults can be explicitly overridden.
This grows code size a bit. I think we could potentially
shave this (and other map-related code size) by having
the generated code inject a function pointer to the map-related
parsing/serialization code if maps are present.
FILE SIZE VM SIZE
-------------- --------------
+86% +1.07Ki +71% +768 upb/msg.c
[NEW] +391 [NEW] +344 _upb_mapsorter_pushmap
[NEW] +158 [NEW] +112 _upb_mapsorter_cmpstr
[NEW] +111 [NEW] +64 _upb_mapsorter_cmpbool
[NEW] +110 [NEW] +64 _upb_mapsorter_cmpi32
[NEW] +110 [NEW] +64 _upb_mapsorter_cmpi64
[NEW] +110 [NEW] +64 _upb_mapsorter_cmpu32
[NEW] +110 [NEW] +64 _upb_mapsorter_cmpu64
-3.6% -8 -4.3% -8 _upb_map_new
+9.5% +464 +9.2% +424 upb/text_encode.c
[NEW] +656 [NEW] +616 txtenc_mapentry
+15% +32 +20% +32 upb_text_encode
-20.1% -224 -20.7% -224 txtenc_msg
+5.7% +342 +5.3% +296 upb/encode.c
[NEW] +344 [NEW] +304 encode_mapentry
[NEW] +246 [NEW] +208 upb_encode_ex
[NEW] +41 [NEW] +16 upb_encode_ex.ch
+0.7% +8 +0.7% +8 encode_scalar
-1.0% -32 -1.0% -32 encode_message
[DEL] -38 [DEL] -16 upb_encode.ch
[DEL] -227 [DEL] -192 upb_encode
+2.0% +152 +2.2% +152 upb/decode.c
+44% +128 +44% +128 [section .rodata]
+3.4% +24 +3.4% +24 _GLOBAL_OFFSET_TABLE_
+0.6% +107 +0.3% +48 upb/def.c
[NEW] +100 [NEW] +48 upb_fielddef_descriptortype
+7.1% +7 [ = ] 0 upb_fielddef_defaultint32
+2.9% +24 +2.9% +24 [section .dynsym]
+1.2% +24 [ = ] 0 [section .symtab]
+3.2% +16 +3.2% +16 [section .plt]
[NEW] +16 [NEW] +16 memcmp@plt
+0.5% +16 +0.6% +16 tests/conformance_upb.c
+1.5% +16 +1.6% +16 DoTestIo
+0.1% +16 +0.1% +16 upb/json_decode.c
+0.4% +16 +0.4% +16 jsondec_wellknown
+3.0% +8 +3.0% +8 [section .got.plt]
+3.0% +8 +3.0% +8 _GLOBAL_OFFSET_TABLE_
+1.6% +7 +1.6% +7 [section .dynstr]
+1.8% +4 +1.8% +4 [section .hash]
+0.5% +3 +0.5% +3 [LOAD #2 [RX]]
+2.8% +2 +2.8% +2 [section .gnu.version]
-60.0% -1.74Ki [ = ] 0 [Unmapped]
+0.3% +496 +1.4% +1.74Ki TOTAL
4 years ago
Joshua Haberman
65d166a6ba
Added API for copy vs. alias and added benchmarks to test both.
...
Benchmark output:
$ bazel-bin/benchmarks/benchmark '--benchmark_filter=BM_Parse'
2020-11-11 15:39:04
Running bazel-bin/benchmarks/benchmark
Run on (72 X 3700 MHz CPU s)
CPU Caches:
L1 Data 32K (x36)
L1 Instruction 32K (x36)
L2 Unified 1024K (x36)
L3 Unified 25344K (x2)
-------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------------------------
BM_Parse_Upb_FileDesc<UseArena, Copy> 4134 ns 4134 ns 168714 1.69152GB/s
BM_Parse_Upb_FileDesc<UseArena, Alias> 3487 ns 3487 ns 199509 2.00526GB/s
BM_Parse_Upb_FileDesc<InitBlock, Copy> 3727 ns 3726 ns 187581 1.87643GB/s
BM_Parse_Upb_FileDesc<InitBlock, Alias> 3110 ns 3110 ns 224970 2.24866GB/s
BM_Parse_Proto2<FileDesc, NoArena, Copy> 31132 ns 31132 ns 22437 229.995MB/s
BM_Parse_Proto2<FileDesc, UseArena, Copy> 21011 ns 21009 ns 33922 340.812MB/s
BM_Parse_Proto2<FileDesc, InitBlock, Copy> 17976 ns 17975 ns 38808 398.337MB/s
BM_Parse_Proto2<FileDescSV, InitBlock, Alias> 17357 ns 17356 ns 40244 412.539MB/s
4 years ago
Joshua Haberman
bf8e08074c
Added a few more comments.
4 years ago
Joshua Haberman
2a574d3d01
Added a bunch of comments for readability.
4 years ago
Joshua Haberman
9938cf8f27
Put submsg_index directly in table data. Drop oneof support for now to focus.
4 years ago
Joshua Haberman
d87179501d
Another build fix.
4 years ago
Joshua Haberman
89bd8b87e1
Fixed a few more C89 compat issues.
4 years ago
Joshua Haberman
ff957b996c
Fixed C89 compat issues.
4 years ago
Joshua Haberman
537b6f42c2
A few updates to the benchamrk and minor implementation changes.
4 years ago
Joshua Haberman
526e430794
I think this may have reached the optimization limit.
...
-------------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------------
BM_ArenaOneAlloc 21 ns 21 ns 32994231
BM_ArenaInitialBlockOneAlloc 6 ns 6 ns 116318005
BM_ParseDescriptorNoHeap 3028 ns 3028 ns 231138 2.34354GB/s
BM_ParseDescriptor 3557 ns 3557 ns 196583 1.99498GB/s
BM_ParseDescriptorProto2NoArena 33228 ns 33226 ns 21196 218.688MB/s
BM_ParseDescriptorProto2WithArena 22863 ns 22861 ns 30666 317.831MB/s
BM_SerializeDescriptorProto2 5444 ns 5444 ns 127368 1.30348GB/s
BM_SerializeDescriptor 12509 ns 12508 ns 55816 580.914MB/s
$ perf stat bazel-bin/benchmark --benchmark_filter=BM_ParseDescriptorNoHeap
2020-10-08 14:07:06
Running bazel-bin/benchmark
Run on (72 X 3700 MHz CPU s)
CPU Caches:
L1 Data 32K (x36)
L1 Instruction 32K (x36)
L2 Unified 1024K (x36)
L3 Unified 25344K (x2)
----------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------
BM_ParseDescriptorNoHeap 3071 ns 3071 ns 227743 2.31094GB/s
Performance counter stats for 'bazel-bin/benchmark --benchmark_filter=BM_ParseDescriptorNoHeap':
1,050.22 msec task-clock # 0.978 CPUs utilized
4 context-switches # 0.004 K/sec
0 cpu-migrations # 0.000 K/sec
179 page-faults # 0.170 K/sec
3,875,796,334 cycles # 3.690 GHz
13,282,835,967 instructions # 3.43 insn per cycle
2,887,725,848 branches # 2749.627 M/sec
8,324,912 branch-misses # 0.29% of all branches
1.073924364 seconds time elapsed
1.042806000 seconds user
0.008021000 seconds sys
Profile:
23.96% benchmark benchmark [.] upb_prm_1bt_max192b
22.44% benchmark benchmark [.] fastdecode_dispatch
18.96% benchmark benchmark [.] upb_pss_1bt
14.20% benchmark benchmark [.] upb_psv4_1bt
8.33% benchmark benchmark [.] upb_prm_1bt_max64b
6.66% benchmark benchmark [.] upb_prm_1bt_max128b
1.29% benchmark benchmark [.] upb_psm_1bt_max64b
0.77% benchmark benchmark [.] fastdecode_generic
0.55% benchmark [kernel.kallsyms] [k] smp_call_function_single
0.42% benchmark [kernel.kallsyms] [k] _raw_spin_lock_irqsave
0.42% benchmark benchmark [.] upb_psm_1bt_max256b
0.31% benchmark benchmark [.] upb_psb1_1bt
0.21% benchmark benchmark [.] upb_plv4_5bv
0.14% benchmark benchmark [.] upb_psb1_2bt
0.12% benchmark benchmark [.] decode_longvarint64
0.08% benchmark [kernel.kallsyms] [k] vsnprintf
0.07% benchmark [kernel.kallsyms] [k] _raw_spin_lock
0.07% benchmark benchmark [.] _upb_msg_new
0.06% benchmark ld-2.31.so [.] check_match
4 years ago
Joshua Haberman
9e5c5ce089
Optimized memset() with cutoff and fixed group & unknown message bugs.
4 years ago
Joshua Haberman
88b1ec7784
Table-driven supports repeated sub-messages.
4 years ago
Joshua Haberman
f173642db4
Handle non-repeated submessages.
4 years ago
Joshua Haberman
fac992db83
Cleanup for showing.
4 years ago
Joshua Haberman
3937874a85
We have a properly structured algorithm, but perf regresses by 20%.
4 years ago
Joshua Haberman
438ecaeb5a
Give all field parsers a generic table entry.
4 years ago
Joshua Haberman
26abaa2345
WIP.
4 years ago
Joshua Haberman
763a3f6293
WIP.
4 years ago
Joshua Haberman
ba0a2fb955
Compiles, doesn't work yet.
6 years ago
Josh Haberman
32e3f394b4
A few small API tweaks.
...
- Foo_parsenew() -> Foo_parse().
- parse function takes plain (const char*, size_t) instead of
upb_strview. The latter is mainly useful for strings inside
message objects.
6 years ago
Joshua Haberman
cb26d883d1
WIP.
6 years ago
Josh Haberman
9ea6bb4678
Renamed upb_stringview -> upb_strview for C terseness.
6 years ago
Joshua Haberman
694d51f4d6
Changed C API to only define structs, a table, and a few minimal inline functions.
6 years ago
Josh Haberman
e94ac4f757
Moved upb_msg parts that depend on def to a separate msgfactory.{c,h}.
...
Also got rid of the premature "v1" business that was attempting
to create a binary compatibility story.
Also added an in-progress CMakeLists.txt file.
6 years ago
Josh Haberman
1aafd4111b
A good start on upb_encode and upb_decode.
7 years ago