Joshua Haberman
64d293894a
Fixed bug introduced by last optimization.
4 years ago
Joshua Haberman
ff957b996c
Fixed C89 compat issues.
4 years ago
Joshua Haberman
537b6f42c2
A few updates to the benchmark and minor implementation changes.
4 years ago
Joshua Haberman
0dcc5641eb
Replicated dispatch and implemented array resizing logic. Up to 2.67GB/s.
4 years ago
Joshua Haberman
526e430794
I think this may have reached the optimization limit.
-------------------------------------------------------------------------
Benchmark                              Time        CPU  Iterations
-------------------------------------------------------------------------
BM_ArenaOneAlloc                      21 ns      21 ns    32994231
BM_ArenaInitialBlockOneAlloc           6 ns       6 ns   116318005
BM_ParseDescriptorNoHeap            3028 ns    3028 ns      231138  2.34354GB/s
BM_ParseDescriptor                  3557 ns    3557 ns      196583  1.99498GB/s
BM_ParseDescriptorProto2NoArena    33228 ns   33226 ns       21196  218.688MB/s
BM_ParseDescriptorProto2WithArena  22863 ns   22861 ns       30666  317.831MB/s
BM_SerializeDescriptorProto2        5444 ns    5444 ns      127368  1.30348GB/s
BM_SerializeDescriptor             12509 ns   12508 ns       55816  580.914MB/s
$ perf stat bazel-bin/benchmark --benchmark_filter=BM_ParseDescriptorNoHeap
2020-10-08 14:07:06
Running bazel-bin/benchmark
Run on (72 X 3700 MHz CPU s)
CPU Caches:
L1 Data 32K (x36)
L1 Instruction 32K (x36)
L2 Unified 1024K (x36)
L3 Unified 25344K (x2)
----------------------------------------------------------------
Benchmark                     Time        CPU  Iterations
----------------------------------------------------------------
BM_ParseDescriptorNoHeap   3071 ns    3071 ns      227743  2.31094GB/s
Performance counter stats for 'bazel-bin/benchmark --benchmark_filter=BM_ParseDescriptorNoHeap':
1,050.22 msec task-clock # 0.978 CPUs utilized
4 context-switches # 0.004 K/sec
0 cpu-migrations # 0.000 K/sec
179 page-faults # 0.170 K/sec
3,875,796,334 cycles # 3.690 GHz
13,282,835,967 instructions # 3.43 insn per cycle
2,887,725,848 branches # 2749.627 M/sec
8,324,912 branch-misses # 0.29% of all branches
1.073924364 seconds time elapsed
1.042806000 seconds user
0.008021000 seconds sys
Profile:
23.96% benchmark benchmark [.] upb_prm_1bt_max192b
22.44% benchmark benchmark [.] fastdecode_dispatch
18.96% benchmark benchmark [.] upb_pss_1bt
14.20% benchmark benchmark [.] upb_psv4_1bt
8.33% benchmark benchmark [.] upb_prm_1bt_max64b
6.66% benchmark benchmark [.] upb_prm_1bt_max128b
1.29% benchmark benchmark [.] upb_psm_1bt_max64b
0.77% benchmark benchmark [.] fastdecode_generic
0.55% benchmark [kernel.kallsyms] [k] smp_call_function_single
0.42% benchmark [kernel.kallsyms] [k] _raw_spin_lock_irqsave
0.42% benchmark benchmark [.] upb_psm_1bt_max256b
0.31% benchmark benchmark [.] upb_psb1_1bt
0.21% benchmark benchmark [.] upb_plv4_5bv
0.14% benchmark benchmark [.] upb_psb1_2bt
0.12% benchmark benchmark [.] decode_longvarint64
0.08% benchmark [kernel.kallsyms] [k] vsnprintf
0.07% benchmark [kernel.kallsyms] [k] _raw_spin_lock
0.07% benchmark benchmark [.] _upb_msg_new
0.06% benchmark ld-2.31.so [.] check_match
4 years ago
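The derived figures in the perf stat output of commit 526e430794 (insn per cycle, GHz, branch-miss percentage) follow directly from the raw counters. The sketch below recomputes them as a sanity check. The struct and function names are hypothetical and exist only for this illustration; the numbers are copied from the log above.

```c
#include <assert.h>

/* Hypothetical container for the raw counters printed by perf stat. */
typedef struct {
  double task_clock_msec;
  double cycles;
  double instructions;
  double branches;
  double branch_misses;
} sketch_perf_counters;

/* Raw values copied from the perf stat output in the commit above. */
static sketch_perf_counters sketch_logged_counters(void) {
  sketch_perf_counters c;
  c.task_clock_msec = 1050.22;
  c.cycles = 3875796334.0;
  c.instructions = 13282835967.0;
  c.branches = 2887725848.0;
  c.branch_misses = 8324912.0;
  return c;
}

/* Instructions retired per cycle. */
static double sketch_ipc(const sketch_perf_counters *c) {
  assert(c->cycles > 0);
  return c->instructions / c->cycles;
}

/* Effective clock rate in GHz: cycles divided by on-CPU time. */
static double sketch_clock_ghz(const sketch_perf_counters *c) {
  assert(c->task_clock_msec > 0);
  return c->cycles / (c->task_clock_msec * 1e-3) / 1e9;
}

/* Mispredicted branches as a percentage of all branches. */
static double sketch_branch_miss_pct(const sketch_perf_counters *c) {
  assert(c->branches > 0);
  return 100.0 * c->branch_misses / c->branches;
}
```

With the logged counters these come out at roughly 3.43 instructions per cycle, 3.69 GHz, and 0.29% branch misses, matching the annotations perf printed.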
Joshua Haberman
4c65b25daf
Handle long varints, now 2GB/s!
4 years ago
Joshua Haberman
e39ec95ca2
Hoisted updates to limits and depth out of the loop.
4 years ago
Joshua Haberman
388b6f64eb
A small optimization: don't increment array length every iteration.
4 years ago
Joshua Haberman
9e5c5ce089
Optimized memset() with cutoff and fixed group & unknown message bugs.
4 years ago
Joshua Haberman
8dd7b5a2ca
A bunch more optimization.
4 years ago
Joshua Haberman
405e7934b1
Handle 2-byte submessage lengths.
4 years ago
Joshua Haberman
88b1ec7784
Table-driven supports repeated sub-messages.
4 years ago
Joshua Haberman
f173642db4
Handle non-repeated submessages.
4 years ago
Joshua Haberman
7ec2c52346
Donate/steal from arena to accelerate decoding.
4 years ago
Joshua Haberman
fac992db83
Cleanup for showing.
4 years ago
Joshua Haberman
3937874a85
We have a properly structured algorithm, but perf regresses by 20%.
4 years ago
Joshua Haberman
438ecaeb5a
Give all field parsers a generic table entry.
4 years ago
Joshua Haberman
383ae5293e
WIP.
4 years ago
Joshua Haberman
26abaa2345
WIP.
4 years ago
Joshua Haberman
34b98bc030
Avoid passing too many params to fallback.
4 years ago
Joshua Haberman
763a3f6293
WIP.
4 years ago
Joshua Haberman
a202ce9629
Add UPB_FORCEINLINE for varint32 decoding.
This speeds up the decoder by >20% and also reduces code size slightly!
name                       old time/op  new time/op  delta
ArenaOneAlloc              20.4ns ± 0%  20.2ns ± 0%   -1.10%  (p=0.000 n=12+11)
ArenaInitialBlockOneAlloc  5.25ns ± 0%  5.25ns ± 0%     ~     (p=0.786 n=11+12)
ParseDescriptorNoHeap      17.1µs ± 0%  13.1µs ± 0%  -23.29%  (p=0.000 n=11+12)
ParseDescriptor            17.4µs ± 1%  13.5µs ± 1%  -22.51%  (p=0.000 n=12+12)
SerializeDescriptor        10.7µs ± 0%  10.9µs ± 0%   +1.95%  (p=0.000 n=12+12)
    FILE SIZE        VM SIZE
 --------------  --------------
  +2.7%     +16  +2.7%     +16    [LOAD #2 [RX]]
  +0.5%     +16  [ = ]       0    [Unmapped]
  -1.4%     -72  -0.7%     -32    upb/decode.c
    +3.1%     +98  +3.1%     +98    decode_msg
    [DEL]    -170  [DEL]    -130    decode_varint32
  -0.0%     -40  -0.0%     -16    TOTAL
4 years ago
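For readers unfamiliar with the technique in commit a202ce9629, the following is a minimal sketch of a force-inlined 32-bit varint decoder. It is not upb's implementation: SKETCH_FORCEINLINE and sketch_decode_varint32 are hypothetical names, the macro is only an assumption about how such an attribute is typically spelled on GCC, Clang, and MSVC, and the decoder accepts at most five bytes for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a force-inline macro (assumed spellings). */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_FORCEINLINE __inline__ __attribute__((always_inline))
#elif defined(_MSC_VER)
#define SKETCH_FORCEINLINE __forceinline
#else
#define SKETCH_FORCEINLINE
#endif

/* Decodes a base-128 varint of at most 5 bytes from [p, end).
 * On success stores the value in *out and returns the position just
 * past the last byte consumed. Returns NULL if the input is truncated
 * or the varint runs longer than 5 bytes. */
static SKETCH_FORCEINLINE const uint8_t *sketch_decode_varint32(
    const uint8_t *p, const uint8_t *end, uint32_t *out) {
  uint32_t val = 0;
  int shift;
  assert(out != NULL);
  for (shift = 0; shift < 35; shift += 7) {
    uint8_t b;
    if (p == end) return NULL;             /* ran out of input */
    b = *p++;
    val |= (uint32_t)(b & 0x7f) << shift;  /* low 7 bits carry payload */
    if (!(b & 0x80)) {                     /* high bit clear: last byte */
      *out = val;
      return p;
    }
  }
  return NULL;                             /* more than 5 bytes */
}
```

Forcing the helper inline lets the compiler fold it into the caller's hot loop, which is consistent with the size report above showing decode_varint32 deleted as a standalone symbol while decode_msg grew.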
Joshua Haberman
5741eb9ad7
Expanded benchmarking script and added one size opt to the encoder.
4 years ago
Joshua Haberman
0135399e60
Fixed bug introduced in refactoring.
4 years ago
Joshua Haberman
df3438222b
Notated impossible branch as unreachable.
4 years ago
Joshua Haberman
9b31e8fe12
Merged common encode tag paths.
4 years ago
Joshua Haberman
5d7dc718cc
Minor formatting fix.
4 years ago
Joshua Haberman
80441e4eb4
Optimized binary encoder.
4 years ago
Joshua Haberman
ada28896b9
Changed encoder to use longjmp() for error recovery.
4 years ago
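As a rough illustration of the pattern named in commit ada28896b9, the sketch below shows setjmp()/longjmp() used so that inner write helpers need no error return values. It is an assumption about the general shape only, not upb's encoder; sketch_encoder, sketch_put, and sketch_encode are hypothetical names.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical encoder state: output buffer plus a recovery point. */
typedef struct {
  jmp_buf err;
  char *buf;
  size_t cap;
  size_t len;
} sketch_encoder;

/* Appends one byte. On overflow, jumps back to the recovery point
 * instead of returning an error code to every caller. */
static void sketch_put(sketch_encoder *e, char c) {
  if (e->len == e->cap) longjmp(e->err, 1);
  e->buf[e->len++] = c;
}

/* Copies n bytes of src into out. Returns the number of bytes written,
 * or -1 if the output buffer is too small. */
static long sketch_encode(const char *src, size_t n, char *out, size_t cap) {
  sketch_encoder e;
  size_t i;
  assert(out != NULL);
  e.buf = out;
  e.cap = cap;
  e.len = 0;
  if (setjmp(e.err)) return -1;  /* reached via longjmp() on error */
  for (i = 0; i < n; i++) sketch_put(&e, src[i]);
  return (long)e.len;
}
```

The single check after setjmp() replaces per-call error propagation, which keeps the hot path free of return-value tests.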
Esun Kim
4d2251c3e4
Add UPB_NORETURN for MSC
4 years ago
Joshua Haberman
efefbffc80
Fixed binary encoding and decoding for big-endian machines.
4 years ago
Joshua Haberman
55dd9d3e41
Fixed UPB_ASSUME() for non-GCC, non-MSVC platforms.
4 years ago
Joshua Haberman
8284321780
Fixed upb_fielddef_packed() to have the correct default.
4 years ago
Joshua Haberman
8e26a33bcb
Added a test for UTF-8 parse checking and added missing error reporting.
4 years ago
Joshua Haberman
2c666bc8f6
Use C-style comment instead of C++.
4 years ago
Joshua Haberman
a77ea639d5
Verify UTF-8 when parsing proto3 string fields.
4 years ago
Joshua Haberman
bfdfe5a914
Removed unused push/pop functions.
5 years ago
Joshua Haberman
8f11ec57d2
Applied changes from google3.
5 years ago
Joshua Haberman
086a68d191
Fixed memory leak that could occur after upb_arena_fuse().
Also added valgrind testing for Kokoro.
5 years ago
Joshua Haberman
35abcc248b
Added test that should trigger a memory leak.
5 years ago
Joshua Haberman
7d726c8da6
JSON parser: Bugfix for float/double in quotes.
5 years ago
Joshua Haberman
efe11c6c50
Removed excess logging statement.
5 years ago
Joshua Haberman
e179dda212
Added initialization of all members to satisfy compiler warnings.
5 years ago
Joshua Haberman
81c2aa753e
Fixes for the PHP C Extension.
5 years ago
Igor Kostenko
f7fcc0df37
Fix divide by zero vs2019 compilation error #293 (#294)
* Fix divide by zero vs2019 compilation error
* undef introduced define
5 years ago
Joshua Haberman
0dc2394da5
Changes to support import into google3 (#291)
* Fixes for google3.
* Added to failure list for new failure.
* Reused existing failure list file.
* Add a ./ to assist rewriting.
5 years ago
Joshua Haberman
363e39c171
Fix for extra compiler warnings. (#290)
5 years ago
Joshua Haberman
b717575cef
Added -Wextra and -Wshorten-64-to-32 and fixed resulting errors. (#289)
* Added -Wextra and -Wshorten-64-to-32 and fixed resulting errors.
* Disable -Wshorten-64-to-32 since Kokoro is missing Clang.
* Fixed -Wextra warnings for gcc.
* Reordered UPB_UNUSED() to come after declarations.
* Added another -pedantic fix and log CC version.
* Fix compile error and conditionally run use_bazel.sh.
* Moved set -e after use_bazel.sh.
* Fixed typo in conditional.
5 years ago
Joshua Haberman
6b808a4072
Fixed all UBSan issues and added UBSan CI checks.
5 years ago
Joshua Haberman
634d37515c
Bugfix for oneofs and added line/col info to JSON.
5 years ago