Craig Tiller
8bab3e4bf4
Buffer HPACK parsing until the end of a header boundary ( #26700 )
HTTP2 headers are sent in (potentially) many frames, but all must be
sent sequentially with no traffic intervening.
This was not clear when I wrote the HPACK parser, and was indeed still quite
contentious on the HTTP2 mailing lists.
Now that the matter is well settled (years ago!), take advantage of the fact
by delaying parsing until all bytes are available.
A future change will leverage this to avoid having to store and verify
partial parse state, completely eliminating indirect calls within the
parser.
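The buffering idea described above can be sketched roughly as follows. This is a minimal illustration, not the gRPC parser: HeaderFragment and BufferedHeaderBlock are hypothetical names, and the sketch assumes the caller has already stripped frame headers and knows whether END_HEADERS was set on each frame.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for one HEADERS or CONTINUATION frame payload.
struct HeaderFragment {
  std::string bytes;  // HPACK bytes carried by this frame
  bool end_headers;   // true when the frame has END_HEADERS set
};

// Accumulates fragments and only releases bytes once the block is complete,
// so a decoder never has to suspend in the middle of a header.
class BufferedHeaderBlock {
 public:
  // Appends one fragment. Returns true once the header block is complete.
  bool Append(const HeaderFragment& fragment) {
    assert(!complete_ && "fragment appended after END_HEADERS");
    buffer_ += fragment.bytes;
    complete_ = fragment.end_headers;
    return complete_;
  }

  // Hands the whole block to the caller and resets for the next block.
  std::string Take() {
    assert(complete_ && "Take() called before END_HEADERS");
    complete_ = false;
    return std::exchange(buffer_, std::string());
  }

 private:
  std::string buffer_;
  bool complete_ = false;
};
```

A decoder built on top of this would call Take() once Append() returns true and parse the returned bytes in a single pass.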
3 years ago
Craig Tiller
d3e5803cb2
Move HPACK parser to C++ ( #26689 )
This is a fairly low-effort migration of the current codebase into a C++ class, instead of free-standing C code.
It builds upon #26657 as a necessary first step.
I've tried to minimize any changes to semantics or logic in this change, except where required to get a minimal amount of encapsulation - which is the major aim of this change.
A future change in this series will buffer slices until all HPACK headers are in memory for a stream prior to decoding -- it's important to have an encapsulated API to the parser before doing so however (hence this CL).
The next change after that will be an almost complete rewrite of the parsing functionality -- since we'll have the total set of header bytes, we'll no longer need to support suspending decoding at arbitrary points. This will allow us to move to a simple recursive descent parser, eliminate a bunch of indirection in this code, and end up in a much more malleable place for when we start doing metadata API changes.
(We will likely also end up with some good performance wins!)
3 years ago
Esun Kim
ca945a58e9
Introduced grpc_error_handle ( #25902 )
- Define grpc_error_handle
- Replace grpc_error* with grpc_error_handle
4 years ago
Esun Kim
62ac3f075a
Added call to grpc::testing::TestEnvironment in tests
4 years ago
Esun Kim
3a519a0b64
Replaced grpc_core New & Delete with C++ new & delete
5 years ago
Vijay Pai
406b70629a
Remove unused parameter warning (17 of 20)
5 years ago
Arjun Roy
0b06676c9e
hpack encoder optimizations.
Removed some cycles and branches from hpack_enc for CH2.
Specifically:
1. Pushed certain metadata key/value length checks to
prepare_application_metadata() in src/core/lib/surface/call.cc.
This means that rather than check all key/val lengths for all metadata, we only
do so for custom added user metadata. Inside CH2, we change the length checks to
debug checks so we can catch if core/filter metadata fails to pass the check.
2. Changed various asserts to debug asserts when able.
3. Refactored some of the header emission code to remove duplicated code.
4. Un-inlined some logging methods.
This results in somewhat faster hpack_encoder performance:
BM_HpackEncoderInitDestroy
222ns ± 0% 221ns ± 0% -0.29% (p=0.000 n=34+34)
BM_HpackEncoderEncodeDeadline
[framing_bytes/iter:9 header_bytes/iter:6 ] 135ns ± 1%
124ns ± 0% -8.05% (p=0.000 n=39+38)
BM_HpackEncoderEncodeHeader<EmptyBatch>/0/16384
[framing_bytes/iter:9 header_bytes/iter:0 ] 34.2ns ± 0%
34.2ns ± 0% -0.01% (p=0.014 n=34+38)
BM_HpackEncoderEncodeHeader<EmptyBatch>/1/16384
[framing_bytes/iter:9 header_bytes/iter:0 ] 34.2ns ± 0%
34.2ns ± 0% -0.04% (p=0.004 n=34+37)
BM_HpackEncoderEncodeHeader<SingleStaticElem>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.5ns ± 0%
45.9ns ± 0% -3.28% (p=0.000 n=28+38)
BM_HpackEncoderEncodeHeader<SingleInternedKeyElem>/0/16384
[framing_bytes/iter:9 header_bytes/iter:6 ] 77.0ns ± 1%
68.3ns ± 1% -11.33% (p=0.000 n=39+40)
BM_HpackEncoderEncodeHeader<SingleInternedElem>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.7ns ± 1%
45.5ns ± 0% -4.63% (p=0.000 n=39+33)
BM_HpackEncoderEncodeHeader<SingleInternedBinaryElem<1, false>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.2ns ± 0%
45.3ns ± 0% -3.96% (p=0.000 n=33+34)
BM_HpackEncoderEncodeHeader<SingleInternedBinaryElem<3, false>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.7ns ± 0%
45.6ns ± 0% -4.54% (p=0.000 n=38+40)
BM_HpackEncoderEncodeHeader<SingleInternedBinaryElem<10, false>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.7ns ± 0%
45.5ns ± 0% -4.63% (p=0.000 n=39+32)
BM_HpackEncoderEncodeHeader<SingleInternedBinaryElem<31, false>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.8ns ± 0%
45.6ns ± 1% -4.59% (p=0.000 n=38+39)
BM_HpackEncoderEncodeHeader<SingleInternedBinaryElem<100, false>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.8ns ± 0%
45.5ns ± 0% -4.64% (p=0.000 n=39+36)
BM_HpackEncoderEncodeHeader<SingleInternedBinaryElem<1, true>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.3ns ± 0%
45.3ns ± 0% -4.09% (p=0.000 n=38+36)
BM_HpackEncoderEncodeHeader<SingleInternedBinaryElem<3, true>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.8ns ± 1%
45.6ns ± 0% -4.71% (p=0.000 n=37+40)
BM_HpackEncoderEncodeHeader<SingleInternedBinaryElem<10, true>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.7ns ± 0%
45.5ns ± 0% -4.66% (p=0.000 n=39+32)
BM_HpackEncoderEncodeHeader<SingleInternedBinaryElem<31, true>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.8ns ± 1%
45.6ns ± 1% -4.62% (p=0.000 n=37+39)
BM_HpackEncoderEncodeHeader<SingleInternedBinaryElem<100, true>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.7ns ± 0%
45.5ns ± 0% -4.67% (p=0.000 n=38+32)
BM_HpackEncoderEncodeHeader<SingleNonInternedElem>/0/16384
[framing_bytes/iter:9 header_bytes/iter:9 ] 80.5ns ± 1%
74.7ns ± 0% -7.16% (p=0.000 n=38+35)
BM_HpackEncoderEncodeHeader<SingleNonInternedBinaryElem<1, false>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:12 ] 105ns ± 1%
99ns ± 0% -5.91% (p=0.000 n=38+34)
BM_HpackEncoderEncodeHeader<SingleNonInternedBinaryElem<3, false>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:14 ] 111ns ± 1%
106ns ± 1% -4.86% (p=0.020 n=39+2)
BM_HpackEncoderEncodeHeader<SingleNonInternedBinaryElem<10, false>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:23 ] 135ns ± 0%
130ns ± 0% -3.45% (p=0.020 n=35+2)
BM_HpackEncoderEncodeHeader<SingleNonInternedBinaryElem<31, false>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:46 ] 225ns ± 1%
223ns ± 0% -0.91% (p=0.003 n=37+2)
BM_HpackEncoderEncodeHeader<SingleNonInternedBinaryElem<100, false>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:120 ] 467ns ± 0%
472ns ± 0% +1.09% (p=0.003 n=38+2)
BM_HpackEncoderEncodeHeader<SingleNonInternedBinaryElem<1, true>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:12 ] 81.6ns ± 1%
74.8ns ± 0% -8.40% (p=0.000 n=37+33)
BM_HpackEncoderEncodeHeader<SingleNonInternedBinaryElem<3, true>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:14 ] 82.0ns ± 1%
74.8ns ± 0% -8.80% (p=0.000 n=37+32)
BM_HpackEncoderEncodeHeader<SingleNonInternedBinaryElem<10, true>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:21 ] 82.1ns ± 1%
74.9ns ± 0% -8.86% (p=0.000 n=35+34)
BM_HpackEncoderEncodeHeader<SingleNonInternedBinaryElem<31, true>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:42 ] 97.6ns ± 2%
91.8ns ± 0% -5.95% (p=0.000 n=35+27)
BM_HpackEncoderEncodeHeader<SingleNonInternedBinaryElem<100, true>>/0/16384
[framing_bytes/iter:9 header_bytes/iter:111 ] 97.2ns ± 1%
91.2ns ± 2% -6.19% (p=0.000 n=37+38)
BM_HpackEncoderEncodeHeader<SingleNonInternedElem>/0/1
[framing_bytes/iter:54 header_bytes/iter:9 ] 230ns ± 0%
221ns ± 0% -3.91% (p=0.000 n=38+37)
BM_HpackEncoderEncodeHeader<MoreRepresentativeClientInitialMetadata>/0/16384
[framing_bytes/iter:9 header_bytes/iter:16 ] 206ns ± 2%
170ns ± 1% -17.51% (p=0.000 n=39+39)
BM_HpackEncoderEncodeHeader<RepresentativeServerInitialMetadata>/0/16384
[framing_bytes/iter:9 header_bytes/iter:3 ] 66.4ns ± 2%
62.5ns ± 1% -5.85% (p=0.000 n=34+39)
BM_HpackEncoderEncodeHeader<RepresentativeServerTrailingMetadata>/1/16384
[framing_bytes/iter:9 header_bytes/iter:1 ] 47.5ns ± 0%
45.9ns ± 1% -3.29% (p=0.000 n=26+38)
5 years ago
Vijay Pai
1077b3435c
Use range-based for on state rather than state.KeepRunning when possible
5 years ago
Arjun Roy
b46e3668d3
s/branch/tail_call/ for CH2 on_hdr().
on_hdr() checks if a void-return function pointer is null before jumping to it.
If it is null, it returns an error; else it executes that function and returns
success.
This change converts the void-returning function to one that returns a
grpc_error* and thus saves a branch in on_hdr() (since we're branching once by
following the function pointer anyways, we're effectively coalescing these two
branches).
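A rough sketch of the branch coalescing described above. All names here are hypothetical: Error stands in for grpc_error*, and installing an error-returning default callback in place of a null pointer is an assumption made for illustration, not something the commit message states.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for grpc_error*: nullptr means "no error".
using Error = const char*;

struct Parser;
using OnHeader = Error (*)(Parser*, const std::string&);

// Assumed default callback: takes the place of the old "pointer is null" case.
Error OnHeaderNotSet(Parser*, const std::string&) {
  return "on_header callback not set";
}

struct Parser {
  OnHeader on_header = OnHeaderNotSet;  // never null in this sketch
  int headers_seen = 0;
};

// Example callback that records the header and reports success.
Error CountHeader(Parser* parser, const std::string&) {
  ++parser->headers_seen;
  return nullptr;
}

// No null check and no separate success return: the indirect call is the
// only branch, and its result is returned directly (a tail call).
Error OnHdr(Parser* parser, const std::string& header) {
  assert(parser->on_header != nullptr);  // debug-only check
  return parser->on_header(parser, header);
}
```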
5 years ago
Hope Casey-Allen
59564ebd96
Fix warnings to unblock gcc8 support
5 years ago
Arjun Roy
b1d73a01f1
Removed duplicate static table from hpack table. Removed an or instruction for
every usage of static grpc metadata. Inlined hpack table lookups for static
metadata.
This leads to faster hpack parser creation:
BM_HpackParserInitDestroy 5.32µs ± 1% 0.06µs ± 1% -98.91% (p=0.000 n=18+19)
And slightly faster parsing:
BM_HpackParserParseHeader<RepresentativeClientInitialMetadata, OnInitialHeader>
456ns ± 1% 435ns ± 1% -4.74% (p=0.000 n=18+19)
BM_HpackParserParseHeader<MoreRepresentativeClientInitialMetadata,
OnInitialHeader>
1.06µs ± 2% 1.04µs ± 2% -1.82% (p=0.000 n=19+20)
It also yields a slight (0.5 - 1.0 microsecond) reduction in CPU time for
fullstack unary pingpong:
BM_UnaryPingPong<TCP, NoOpMutator, NoOpMutator>/0/512
[polls/iter:3.0001 ] 23.9µs ± 2%
23.0µs ± 1% -3.63% (p=0.002 n=6+6)
BM_UnaryPingPong<TCP, NoOpMutator, NoOpMutator>/0/32768
[polls/iter:3.00015 ] 35.1µs ± 1%
34.2µs ± 1% -2.57% (p=0.036 n=5+3)
BM_UnaryPingPong<MinTCP, NoOpMutator, NoOpMutator>/8/0
[polls/iter:3.00011 ] 21.7µs ± 3%
21.2µs ± 2% -2.44% (p=0.017 n=6+5)
5 years ago
Esun Kim
e18ed03c04
Made gRPC initialize after entering the main function in microbenchmarks.
6 years ago
Arjun Roy
8ce42f67b2
Shrink arena size by 40 bytes and add additional
alignment options (for cache-alignment).
We shrink by:
1) Removing an unnecessary zone pointer.
2) Replacing gpr_mu (40 bytes when using pthread_mutex_t) with
std::atomic_flag.
We also header-inline the fastpath alloc (i.e. when not doing a zone
alloc) and move the malloc() for a zone alloc outside of the mutex
critical zone, which allows us to replace the mutex with a spinlock.
We also cache-align created arenas.
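The locking change can be illustrated with a toy arena. This is a sketch under assumptions, not the gRPC arena: ToyArena, SpinLock, and the zone layout are hypothetical, and cache alignment of the arena itself is left out. It shows the two points made above: the fast path is a lock-free bump allocation, and the malloc() for a zone happens before the lock is taken, so the critical section is short enough for a spinlock built on std::atomic_flag.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Minimal spinlock on std::atomic_flag (hypothetical helper).
class SpinLock {
 public:
  void Lock() {
    while (flag_.test_and_set(std::memory_order_acquire)) {
      // Busy-wait; the guarded section below is only two pointer writes.
    }
  }
  void Unlock() { flag_.clear(std::memory_order_release); }

 private:
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

class ToyArena {
 public:
  explicit ToyArena(std::size_t initial_size)
      : initial_size_(initial_size),
        initial_(static_cast<char*>(std::malloc(initial_size))) {
    assert(initial_ != nullptr);
  }
  ~ToyArena() {
    while (zones_ != nullptr) {
      Zone* next = zones_->next;
      std::free(zones_);
      zones_ = next;
    }
    std::free(initial_);
  }
  ToyArena(const ToyArena&) = delete;
  ToyArena& operator=(const ToyArena&) = delete;

  void* Alloc(std::size_t size) {
    size = RoundUp(size);
    // Fast path: bump the offset atomically, no lock taken.
    std::size_t begin = used_.fetch_add(size, std::memory_order_relaxed);
    if (begin + size <= initial_size_) return initial_ + begin;
    return AllocZone(size);
  }

 private:
  struct Zone {
    Zone* next;
  };
  static constexpr std::size_t kAlign = alignof(std::max_align_t);
  static std::size_t RoundUp(std::size_t n) {
    return (n + kAlign - 1) & ~(kAlign - 1);
  }

  void* AllocZone(std::size_t size) {
    // Slow path: malloc() runs outside the lock; only the list link is guarded.
    std::size_t header = RoundUp(sizeof(Zone));
    Zone* zone = static_cast<Zone*>(std::malloc(header + size));
    assert(zone != nullptr);
    lock_.Lock();
    zone->next = zones_;
    zones_ = zone;
    lock_.Unlock();
    return reinterpret_cast<char*>(zone) + header;
  }

  const std::size_t initial_size_;
  char* const initial_;
  std::atomic<std::size_t> used_{0};
  Zone* zones_ = nullptr;
  SpinLock lock_;
};
```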
6 years ago
Soheil Hassas Yeganeh
48e4a81b05
Remove memset(0) from arena allocated memory.
Callers are updated to properly initialize the memory.
This behavior can be overridden using GRPC_ARENA_INIT_STRATEGY
environment variable.
6 years ago
Hope Casey-Allen
4c6e7ce15d
Destroy metadata buffer at end of benchmark loop
6 years ago
Hope Casey-Allen
d44feec92f
Reassign arena pointer instead of stomping on memory
6 years ago
Hope Casey-Allen
4b721fbde0
Destroy arena at end of benchmark to not leak memory
6 years ago
Hope Casey-Allen
29d9489ea9
Increase initial arena size to be more representative of a real workload scenario, and increase frequency of recreating the arena to avoid OOM
6 years ago
Hope Casey-Allen
91727bd015
Move arena create outside of benchmark, format, and typo fix
6 years ago
Hope Casey-Allen
967bbcd5d3
Fixing benchmark name and adding a new one
6 years ago
Noah Eisen
58e0cbf9fb
Enable the performance-* clang-tidy checks
7 years ago
ncteisen
40ec89ff67
Support microbenchmarks internally
7 years ago
Vijay Pai
2f4161c210
Use stack frame size limits for consistency with internal builds
7 years ago
Noah Eisen
4d20a66685
Run clang fmt
7 years ago
Noah Eisen
be82e64b3d
Autofix c casts to c++ casts
7 years ago
Muxi Yan
38fcd0c6c3
clang-format
7 years ago
Yash Tibrewal
8cf1470a51
Revert "Revert "All instances of exec_ctx being passed around in src/core removed""
7 years ago
Yash Tibrewal
ad4d2dde00
Revert "All instances of exec_ctx being passed around in src/core removed"
7 years ago
Yash Tibrewal
c354269ba7
Remove _ prefixed variable names
7 years ago
Yash Tibrewal
6c26b16fe0
Move ExecCtx to grpc_core namespace. Make exec_ctx a private static in ExecCtx and some minor changes
7 years ago
Yash Tibrewal
75122c2357
Address some PR comments
7 years ago
Craig Tiller
4ac2b8e585
Enable clang-tidy as a sanity check, fix up all known failures
7 years ago
Yash Tibrewal
0032548674
Correction to closure.cc,bm_chttp2_hpack and few more
7 years ago
Yash Tibrewal
3150744c71
Removing more exec_ctx instances
7 years ago
Craig Tiller
baa14a975e
Update clang-format to 5.0
7 years ago
Yash Tibrewal
39aed1ae8b
Remove unnecessary extern Cs
7 years ago
ncteisen
c296e82e11
clang fmt
7 years ago
ncteisen
9e3eedb6af
Remove old header benchmark
7 years ago
ncteisen
6bf4bcef04
Fix bm_diff
7 years ago
yang-g
83085aa74f
Add a microbm, seeing 195ns with current impl and 162ns with new impl
7 years ago
Yash Tibrewal
0ee7574732
Removing instances of exec_ctx being passed around in functions in
src/core. exec_ctx is now a thread_local pointer of type ExecCtx instead of
grpc_exec_ctx which is initialized whenever ExecCtx is instantiated. ExecCtx
also keeps track of the previous exec_ctx so that nesting of exec_ctx is
allowed. This means that there is only one exec_ctx being used at any
time. Also, grpc_exec_ctx_finish is called in the destructor of the
object, and the previous exec_ctx is restored to avoid breaking current
functionality. The code still explicitly calls grpc_exec_ctx_finish
because removing all such instances causes the code to break.
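The nesting behaviour described above can be sketched as follows. ToyExecCtx is a hypothetical stand-in, not the real grpc_core::ExecCtx, and Flush() only counts calls where the real code would run grpc_exec_ctx_finish.

```cpp
#include <cassert>

class ToyExecCtx {
 public:
  // Instantiating a context makes it the current one for this thread and
  // remembers whichever context was current before.
  ToyExecCtx() : previous_(current_) { current_ = this; }

  // The destructor flushes, then restores the previous context.
  ~ToyExecCtx() {
    assert(current_ == this && "contexts must be destroyed in LIFO order");
    Flush();
    current_ = previous_;
  }

  ToyExecCtx(const ToyExecCtx&) = delete;
  ToyExecCtx& operator=(const ToyExecCtx&) = delete;

  // Returns the innermost live context on this thread, or nullptr.
  static ToyExecCtx* Get() { return current_; }

  // Stand-in for grpc_exec_ctx_finish; may also be called explicitly.
  void Flush() { ++flush_count_; }
  int flush_count() const { return flush_count_; }

 private:
  static thread_local ToyExecCtx* current_;
  ToyExecCtx* previous_;
  int flush_count_ = 0;
};

thread_local ToyExecCtx* ToyExecCtx::current_ = nullptr;
```

Because the pointer is thread_local and each context restores its predecessor, only one context is in use on a thread at any time, and callers no longer need to pass it as a parameter.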
7 years ago
yang-g
c010d1d18a
Update benchmark according to new encoding method
7 years ago
yang-g
c94c7cc5b5
restore existing fixtures
7 years ago
yang-g
377636f4d2
Make hpack micro bm more representative
7 years ago
Mark D. Roth
bd3b93b4b5
Add support for Trailers-Only responses.
- When receiving a Trailers-Only response, return the metadata as
trailing metadata instead of initial metadata.
- Send Trailers-Only response when we have no non-default initial metadata,
no message to send, and trailing metadata to send.
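The send-side condition above amounts to a three-way predicate. A minimal sketch, with hypothetical names (ResponseState and ShouldSendTrailersOnly are not from the gRPC source):

```cpp
// Hypothetical summary of what a server has queued for one response.
struct ResponseState {
  bool has_non_default_initial_metadata;
  bool has_message;
  bool has_trailing_metadata;
};

// True when the response can be sent as Trailers-Only: nothing but
// default initial metadata, no message, and trailing metadata present.
bool ShouldSendTrailersOnly(const ResponseState& state) {
  return !state.has_non_default_initial_metadata && !state.has_message &&
         state.has_trailing_metadata;
}
```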
8 years ago
Jan Tattermusch
7897ae9308
auto-fix most of licenses
8 years ago
Craig Tiller
85a747ed06
better representative output
8 years ago
Craig Tiller
e76a0ecc03
Update hpack benchmarks for true binary
8 years ago
Craig Tiller
eb0e34f736
Convert everything to new encode API
8 years ago
Craig Tiller
83643bd5cb
Fix build on mac
8 years ago