grpc_byte_buffer_reader_next() copies and references the slice. This
is not always necessary since the caller will not use the slice
after destroying the byte buffer.
A prominent example is the protobuf parser, which
calls grpc_byte_buffer_reader_next() and immediately unrefs the slice
after the call. This ref() and unref() calls can be very expensive
in the hot path.
This commit introduces grpc_byte_buffer_reader_peek() which
essentialy return a pointer to the slice in the buffer, i.e.,
no copies, and no refs.
QPS of 1MiB 1 Channel callback benchmark increases by 5%.
More importantly insructions per cycle is increased by 10%.
Also add tests and benchmarks for byte_buffer_reader_peek()
This commit reaplies 509e77a5a3