protobuf

Commit Graph

Author	SHA1	Message	Date
Joshua Haberman	2339fc779c	Updated obsolete comment.	4 years ago
Joshua Haberman	b393849bbd	Updated obsolete comment.	4 years ago
Joshua Haberman	ebe53f8590	Fixed compile error.	4 years ago
Joshua Haberman	b37f82b58b	Fixed compile error.	4 years ago
Joshua Haberman	71749b7caf	Implemented inline array allocation, and moved type->lg2 map to reflection.	4 years ago
Joshua Haberman	9557b97acc	Implemented inline array allocation, and moved type->lg2 map to reflection.	4 years ago
Joshua Haberman	b58d2a0ee6	Shrink overhead of message representation.	4 years ago
Joshua Haberman	0bf063a2ca	Shrink overhead of message representation.	4 years ago
Joshua Haberman	d87ceeacab	Shave off one more store.	4 years ago
Joshua Haberman	ddc52ab9d6	Shave off one more store.	4 years ago
Joshua Haberman	c25d895adf	Shrunk the arena state that needs to be synced.	4 years ago
Joshua Haberman	7f67f68c1c	Shrunk the arena state that needs to be synced.	4 years ago
Joshua Haberman	ff40dd6ea9	Added new internal header.	4 years ago
Joshua Haberman	85a43e5461	Added new internal header.	4 years ago
Gerben Stavenga	36662b3735	Refactor some code. Extracted common code from all message field parsers into a tail-recursive function. Replaced the varint jmp table with a simple varint parse loop, which removes the stack frames. Also took care not to lose information in the repeated message tag check: when written carefully, the checks and loads that already happen can be reused for tag dispatch if the tag is not the expected one.	4 years ago
Joshua Haberman	cbcd635917	Fixed memory leak.	4 years ago
Joshua Haberman	bcbcdadbd2	Fixed memory leak.	4 years ago
Joshua Haberman	746f64692c	Moved arena inline for decoder.	4 years ago
Joshua Haberman	7363b91ac3	Moved arena inline for decoder.	4 years ago
Joshua Haberman	b8ef1dcc57	Removed C++-style comments.	4 years ago
Joshua Haberman	575acd85bd	Re-added const for all of the pointer wrapper types.	4 years ago
Joshua Haberman	5aa5b77b41	Added simple offset-based accessors for defs, and deprecated old iterators.	4 years ago
Joshua Haberman	9938cf8f27	Put submsg_index directly in table data. Drop oneof support for now to focus.	4 years ago
Joshua Haberman	d87179501d	Another build fix.	4 years ago
Joshua Haberman	89bd8b87e1	Fixed a few more C89 compat issues.	4 years ago
Joshua Haberman	64d293894a	Fixed bug introduced by last optimization.	4 years ago
Joshua Haberman	ff957b996c	Fixed C89 compat issues.	4 years ago
Joshua Haberman	537b6f42c2	A few updates to the benchmark and minor implementation changes.	4 years ago
Joshua Haberman	0dcc5641eb	Replicated dispatch and implemented array resizing logic. Up to 2.67GB/s.	4 years ago
Joshua Haberman	526e430794	I think this may have reached the optimization limit.	4 years ago
    -------------------------------------------------------------------------------
    Benchmark                              Time        CPU   Iterations
    -------------------------------------------------------------------------------
    BM_ArenaOneAlloc                      21 ns      21 ns     32994231
    BM_ArenaInitialBlockOneAlloc           6 ns       6 ns    116318005
    BM_ParseDescriptorNoHeap            3028 ns    3028 ns       231138   2.34354GB/s
    BM_ParseDescriptor                  3557 ns    3557 ns       196583   1.99498GB/s
    BM_ParseDescriptorProto2NoArena    33228 ns   33226 ns        21196   218.688MB/s
    BM_ParseDescriptorProto2WithArena  22863 ns   22861 ns        30666   317.831MB/s
    BM_SerializeDescriptorProto2        5444 ns    5444 ns       127368   1.30348GB/s
    BM_SerializeDescriptor             12509 ns   12508 ns        55816   580.914MB/s
    $ perf stat bazel-bin/benchmark --benchmark_filter=BM_ParseDescriptorNoHeap
    2020-10-08 14:07:06
    Running bazel-bin/benchmark
    Run on (72 X 3700 MHz CPU s)
    CPU Caches:
      L1 Data 32K (x36)
      L1 Instruction 32K (x36)
      L2 Unified 1024K (x36)
      L3 Unified 25344K (x2)
    ----------------------------------------------------------------
    Benchmark                      Time        CPU   Iterations
    ----------------------------------------------------------------
    BM_ParseDescriptorNoHeap    3071 ns    3071 ns       227743   2.31094GB/s
    Performance counter stats for 'bazel-bin/benchmark --benchmark_filter=BM_ParseDescriptorNoHeap':
             1,050.22 msec task-clock        #    0.978 CPUs utilized
                    4      context-switches  #    0.004 K/sec
                    0      cpu-migrations    #    0.000 K/sec
                  179      page-faults       #    0.170 K/sec
        3,875,796,334      cycles            #    3.690 GHz
       13,282,835,967      instructions      #    3.43  insn per cycle
        2,887,725,848      branches          # 2749.627 M/sec
            8,324,912      branch-misses     #    0.29% of all branches
          1.073924364 seconds time elapsed
          1.042806000 seconds user
          0.008021000 seconds sys
    Profile:
      23.96%  benchmark  benchmark          [.] upb_prm_1bt_max192b
      22.44%  benchmark  benchmark          [.] fastdecode_dispatch
      18.96%  benchmark  benchmark          [.] upb_pss_1bt
      14.20%  benchmark  benchmark          [.] upb_psv4_1bt
       8.33%  benchmark  benchmark          [.] upb_prm_1bt_max64b
       6.66%  benchmark  benchmark          [.] upb_prm_1bt_max128b
       1.29%  benchmark  benchmark          [.] upb_psm_1bt_max64b
       0.77%  benchmark  benchmark          [.] fastdecode_generic
       0.55%  benchmark  [kernel.kallsyms]  [k] smp_call_function_single
       0.42%  benchmark  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
       0.42%  benchmark  benchmark          [.] upb_psm_1bt_max256b
       0.31%  benchmark  benchmark          [.] upb_psb1_1bt
       0.21%  benchmark  benchmark          [.] upb_plv4_5bv
       0.14%  benchmark  benchmark          [.] upb_psb1_2bt
       0.12%  benchmark  benchmark          [.] decode_longvarint64
       0.08%  benchmark  [kernel.kallsyms]  [k] vsnprintf
       0.07%  benchmark  [kernel.kallsyms]  [k] _raw_spin_lock
       0.07%  benchmark  benchmark          [.] _upb_msg_new
       0.06%  benchmark  ld-2.31.so         [.] check_match
Joshua Haberman	4c65b25daf	Handle long varints, now 2GB/s!	4 years ago
Joshua Haberman	e39ec95ca2	Hoisted updates to limits and depth out of the loop.	4 years ago
Joshua Haberman	388b6f64eb	A small optimization: don't increment array length every iteration.	4 years ago
Joshua Haberman	9e5c5ce089	Optimized memset() with cutoff and fixed group & unknown message bugs.	4 years ago
Joshua Haberman	8dd7b5a2ca	A bunch more optimization.	4 years ago
Joshua Haberman	405e7934b1	Handle 2-byte submessage lengths.	4 years ago
Joshua Haberman	88b1ec7784	Table-driven supports repeated sub-messages.	4 years ago
Joshua Haberman	f173642db4	Handle non-repeated submessages.	4 years ago
Joshua Haberman	7ec2c52346	Donate/steal from arena to accelerate decoding.	4 years ago
Joshua Haberman	fac992db83	Cleanup for showing.	4 years ago
Joshua Haberman	3937874a85	We have a properly structured algorithm, but perf regresses by 20%.	4 years ago
Joshua Haberman	438ecaeb5a	Give all field parsers a generic table entry.	4 years ago
Joshua Haberman	383ae5293e	WIP.	4 years ago
Joshua Haberman	26abaa2345	WIP.	4 years ago
Joshua Haberman	34b98bc030	Avoid passing too many params to fallback.	4 years ago
Joshua Haberman	763a3f6293	WIP.	4 years ago
Joshua Haberman	a202ce9629	Add UPB_FORCEINLINE for varint32 decoding. This speeds up the decoder by >20% and also reduces code size slightly!	4 years ago
    name                       old time/op  new time/op  delta
    ArenaOneAlloc              20.4ns ± 0%  20.2ns ± 0%   -1.10%  (p=0.000 n=12+11)
    ArenaInitialBlockOneAlloc  5.25ns ± 0%  5.25ns ± 0%     ~     (p=0.786 n=11+12)
    ParseDescriptorNoHeap      17.1µs ± 0%  13.1µs ± 0%  -23.29%  (p=0.000 n=11+12)
    ParseDescriptor            17.4µs ± 1%  13.5µs ± 1%  -22.51%  (p=0.000 n=12+12)
    SerializeDescriptor        10.7µs ± 0%  10.9µs ± 0%   +1.95%  (p=0.000 n=12+12)
        FILE SIZE        VM SIZE
     --------------  --------------
      +2.7%     +16  +2.7%     +16    [LOAD #2 [RX]]
      +0.5%     +16  [ = ]       0    [Unmapped]
      -1.4%     -72  -0.7%     -32    upb/decode.c
        +3.1%     +98  +3.1%     +98    decode_msg
        [DEL]    -170  [DEL]    -130    decode_varint32
      -0.0%     -40  -0.0%     -16    TOTAL
Joshua Haberman	5741eb9ad7	Expanded benchmarking script and added one size opt to the encoder.	4 years ago
Joshua Haberman	0135399e60	Fixed bug introduced in refactoring.	4 years ago
Joshua Haberman	df3438222b	Notated impossible branch as unreachable.	4 years ago

718 Commits (baa7fe7473314002bf746d9d10ddf46c8cb86853)