By popular demand, we'll now be offering separate py_grpc_library and
py_proto_library targets sharing the same interface as within google3.
This change necessitated some modifications to how we pull in our own
Python-level dependencies and how we make those available to those
pulling in our project via Bazel.
There is now a grpc_python_deps() Bazel workspace rule that pulls in the
appropriate dependencies, which should be called from the client
project's WORKSPACE file. A test has been added to the bazel/test/
directory to verify that this behavior works as intended.
It's worth noting that the protobuf repository's usage of Starlark
bind() caused a great deal of trouble in ensuring that we could also
pull in six.
This change also required a change in the way generated proto code is
imported in the channelz and health-check modules, as well as in their
associated tests. We were importing them two different ways, each
relative. This resulted in two different module objects being imported
into the process, which were incompatible. I am not sure exactly what
caused this behavior to begin, as this should have been possible before
this PR. As a workaround, I am simply trying two different absolute
imports and using the one that works. This should function both inside
and outside of Bazel environments.
1) Statically pre-compute static slice indices to save some ALU ops when linking
batched metadata.
2) Change some asserts to debug_asserts since they can provably not be triggered
with the current implementation.
3) Save some slice comparison cycles inside CH2 parsing.
Several asserts in grpc_chttp2_stream_map_find() can be converted to debug
asserts. This PR also templatizes the internal find() method to have it be
strict in the delete case (which saves some branches).
In several cases, we create grpc mdelem structures using known-static
metadata inputs. Furthermore, in several cases we create a slice on
the heap (e.g. grpc_slice_from_copied_buffer) where we know we are
transferring refcount ownership. In several cases, then, we can:
1) Avoid unnecessary ref/unref operations that are no-ops (for static
slices) or superfluous (if we're transferring ownership).
2) Avoid unnecessarily comprehensive calls to grpc_slice_eq (since
they'd only be called with static or interned slice arguments,
which by construction would have equal refcounts if they were
in fact equal.
3) Avoid unnecessary checks to see if a slice is interned (when we
know that they are).
To avoid polluting the internal API, we introduce the notion of
strongly-typed grpc_slice objects. We draw a distinction between
Internal (interned and static-storage) slices and Extern (inline and
non-statically allocated). We introduce overloads to
grpc_mdelem_create() and grpc_mdelem_from_slices() for the fastpath
cases identified above based on these slice types.
From the programmer's point of view, though, nothing changes - they
need only use grpc_mdelem_create() and grpc_mdelem_from_slices() as
before, and the appropriate fastpath will be picked based on type
inference. If no special knowledge exists for the slice type (i.e. we
pass in generic grpc_slice objects), the slowpath method will still
always return correct behaviour.
This is good for:
- Roughly 1-3% reduction in CPU time for several unary/streaming
ping pong fullstack microbenchmarks.
- Reduction of about 15-20% in CPU time for some hpack parser
microbenchmarks.
- 10-12% reduction of CPU time for metadata microbenchmarks involving
interned slice comparisons.
Add a lazy initialization path, so that we do not have to
initialize/access the state that's only used when we compress.
Move the frequently access fields to the front so we are more
cache friendly.