* Buffer HPACK parsing until the end of a header boundary
HTTP2 headers are sent in (potentially) many frames, but all must be
sent sequentially with no traffic intervening.
This was not clear when I wrote the HPACK parser, and still indeed quite
contentious on the HTTP2 mailing lists.
Now that matter is well settled (years ago!) take advantage of the fact
by delaying parsing until all bytes are available.
A future change will leverage this to avoid having to store and verify
partial parse state, completely eliminating indirect calls within the
parser.
* maybe fixes
* xx
* fix boundary detection
* clang-format
* Revert "xx"
This reverts commit 258d712ed3.
* fix tests
* add missed check
* fixes
* fix
* update tests
* fix benchmark
* properly unref
* optimize final slice refcounting
* cleanup bm_chttp2_hpack
* start
* new parser progress
* refinement
* get it compiling
* bug-fix
* build files
* clang-tidy
* fixes
* fixes
* fixes
* fix-leaks
* clang-tidy
* comments
* fix merge error
* Revert "Buffer HPACK parsing until the end of a header boundary (#26700)"
This reverts commit 8bab3e4bf4.
* streaming hpack parser start
* streaming parser
* clang-format
* Rework HPackTable into C++
* clang-tidy
* fix merge
* actually set the size of the entries array
* better
HTTP2 headers are sent in (potentially) many frames, but all must be
sent sequentially with no traffic intervening.
This was not clear when I wrote the HPACK parser, and still indeed quite
contentious on the HTTP2 mailing lists.
Now that matter is well settled (years ago!) take advantage of the fact
by delaying parsing until all bytes are available.
A future change will leverage this to avoid having to store and verify
partial parse state, completely eliminating indirect calls within the
parser.
This is a fairly low effort migration of the current codebase into a C++ class, instead of free standing C code.
It builds upon #26657 as a necessary first step.
I've tried to minimize any changes to semantics or logic in this change, except where required to get a minimal amount of encapsulation - which is the major aim of this change.
A future change in this series will buffer slices until all HPACK headers are in memory for a stream prior to decoding -- it's important to have an encapsulated API to the parser before doing so however (hence this CL).
The next change after that will be an almost complete rewrite of the parsing functionality -- since we'll have the total set of header bytes, we'll no longer need to support suspending decoding at arbitrary points. This will allow us to move to a simple recursive descent parser, eliminate a bunch of indirection in this code, and end up in a much more malleable place for when we start doing metadata API changes.
(we likely also end up with some good performance wins!)
on_hdr() checks if a void-return function pointer is null before jumping to it.
If it is null, it returns an error; else it executes that function and returns
success.
This change converts the void-returning function to one that returns a
grpc_error* and thus saves a branch in on_hdr() (since we're branching once by
following the function pointer anyways, we're effectively coalescing these two
branches).
alignment options (for cache-alignment).
We shrink by:
1) Removing an unnecessary zone pointer.
2) Replacing gpr_mu (40 bytes when using pthread_mutex_t) with
std::atomic_flag.
We also header-inline the fastpath alloc (ie. when not doing a zone
alloc) and move the malloc() for a zone alloc outside of the mutex
critical zone, which allows us to replace the mutex with a spinlock.
We also cache-align created arenas.
src/core. exec_ctx is now a thread_local pointer of type ExecCtx instead of
grpc_exec_ctx which is initialized whenever ExecCtx is instantiated. ExecCtx
also keeps track of the previous exec_ctx so that nesting of exec_ctx is
allowed. This means that there is only one exec_ctx being used at any
time. Also, grpc_exec_ctx_finish is called in the destructor of the
object, and the previous exec_ctx is restored to avoid breaking current
functionality. The code still explicitly calls grpc_exec_ctx_finish
because removing all such instances causes the code to break.