From 54a3b642466faf6f71578c44615d31af861b92ef Mon Sep 17 00:00:00 2001 From: Arjun Roy Date: Wed, 10 Jul 2019 15:49:37 -0700 Subject: [PATCH] Pull() eats slice from sbuf wrapped byte buffer. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit At present, a commonly used byte buffer wraps an underlying slice buffer. This is used by grpc_call; it passes a byte buffer for send stream ops through the filter stack and down to the transport. Pull() is a one-way operation; given a slice-buffer backed byte-buffer, it starts with a cursor at index=0 and increments it per pull. There is no way for a cursor to be rewinded. Originally Pull() would ref the slice and return a copy; the slice remains in the underlying buffer and is eventually cleaned up (unref'd). Thus, assuming a slice buffer of size N and assuming we pulled all N slices then we have a total of 2N ref/unref pairs. Since Pull() is essentially acting as a consume operation here, and since it is a one-way operation, we simply transfer the ref rather than create a new ref. This cuts down atomic ops. from 2N to N. This reduces streaming latency slightly for streaming fullstack microbenchmarks: BM_StreamingPingPong/0/1 [polls/iter:0 ] 8.87µs ± 1% 8.91µs ± 1% +0.45% (p=0.002 n=20+20) BM_StreamingPingPong/0/2 [polls/iter:0 ] 11.7µs ± 1% 11.7µs ± 1% +0.34% (p=0.002 n=19+19) BM_StreamingPingPong/8/1 [polls/iter:0 ] 9.11µs ± 1% 9.14µs ± 1% +0.29% (p=0.016 n=19+20) BM_StreamingPingPong/64/1 [polls/iter:0 ] 9.29µs ± 1% 9.27µs ± 1% -0.23% (p=0.029 n=18+19) BM_StreamingPingPong/512/2 [polls/iter:0 ] 13.0µs ± 1% 12.9µs ± 1% -0.52% (p=0.000 n=20+19) BM_StreamingPingPong/4096/1 [polls/iter:0 ] 11.8µs ± 1% 11.8µs ± 1% -0.41% (p=0.007 n=20+20) BM_StreamingPingPong/4096/2 [polls/iter:0 ] 17.4µs ± 1% 17.2µs ± 1% -0.73% (p=0.000 n=19+19) BM_StreamingPingPongMsgs/1 [polls/iter:0 ] 2.87µs ± 1% 2.84µs ± 1% -1.18% (p=0.000 n=19+18) BM_StreamingPingPongMsgs/8 [polls/iter:0 ] 2.85µs ± 1% 2.84µs ± 0% -0.38% (p=0.022 n=20+17) BM_StreamingPingPongMsgs/64 [polls/iter:0 ] 3.00µs ± 1% 2.95µs ± 1% -1.64% (p=0.000 n=19+19) BM_StreamingPingPongMsgs/512 [polls/iter:0 ] 3.32µs ± 1% 3.23µs ± 1% -2.62% (p=0.000 n=20+20) BM_StreamingPingPongMsgs/4096 [polls/iter:0 ] 5.38µs ± 2% 5.30µs ± 1% -1.42% (p=0.000 n=19+20) BM_StreamingPingPongMsgs/32768 [polls/iter:0 ] 24.1µs ± 1% 24.0µs ± 0% -0.47% (p=0.000 n=19+18) BM_StreamingPingPong/0/0 [polls/iter:0 ] 5.71µs ± 1% 5.69µs ± 1% -0.36% (p=0.010 n=19+19) BM_StreamingPingPong/0/1 [polls/iter:0 ] 8.64µs ± 1% 8.68µs ± 1% +0.38% (p=0.004 n=20+18) BM_StreamingPingPong/1/2 [polls/iter:0 ] 11.9µs ± 1% 11.8µs ± 1% -0.36% (p=0.003 n=20+18) BM_StreamingPingPong/64/1 [polls/iter:0 ] 9.09µs ± 1% 9.06µs ± 1% -0.32% (p=0.046 n=20+20) BM_StreamingPingPong/64/2 [polls/iter:0 ] 12.2µs ± 0% 12.1µs ± 1% -0.35% (p=0.001 n=19+20) BM_StreamingPingPong/512/2 [polls/iter:0 ] 12.8µs ± 1% 12.7µs ± 1% -0.46% (p=0.000 n=19+19) BM_StreamingPingPong/4096/1 [polls/iter:0 ] 11.7µs ± 2% 11.6µs ± 1% -0.58% (p=0.004 n=20+20) BM_StreamingPingPong/4096/2 [polls/iter:0 ] 17.1µs ± 1% 17.0µs ± 1% -0.55% (p=0.010 n=20+20) BM_StreamingPingPong/32768/1 [polls/iter:0 ] 30.5µs ± 1% 30.6µs ± 0% +0.27% (p=0.003 n=20+19) BM_StreamingPingPong/262144/2 [polls/iter:0 ] 363µs ± 1% 362µs ± 1% -0.22% (p=0.049 n=20+20) BM_StreamingPingPongMsgs/134217728 [polls/iter:4 writes/iter:8.5 ] 266ms ± 0% 263ms ± 1% -1.23% (p=0.029 n=4+4) BM_StreamingPingPongMsgs/64 [polls/iter:0 ] 2.95µs ± 1% 2.93µs ± 1% -0.97% (p=0.000 n=19+20) BM_StreamingPingPongMsgs/512 [polls/iter:0 ] 3.27µs ± 1% 3.19µs ± 1% -2.32% (p=0.000 n=20+20) BM_StreamingPingPongMsgs/4096 [polls/iter:0 ] 5.33µs ± 2% 5.26µs ± 1% -1.30% (p=0.000 n=20+20) BM_StreamingPingPongMsgs/32768 [polls/iter:0 ] 24.1µs ± 1% 24.0µs ± 0% -0.35% (p=0.001 n=19+20) BM_StreamingPingPongWithCoalescingApi/0/2/1 [polls/iter:0 ] 10.6µs ± 1% 10.6µs ± 1% +0.45% (p=0.014 n=20+20) BM_StreamingPingPongWithCoalescingApi/1/2/0 [polls/iter:0 ] 11.3µs ± 1% 11.3µs ± 1% -0.39% (p=0.046 n=20+20) BM_StreamingPingPongWithCoalescingApi/1/1/1 [polls/iter:0 ] 7.92µs ± 1% 7.88µs ± 1% -0.46% (p=0.012 n=19+18) BM_StreamingPingPongWithCoalescingApi/1/2/1 [polls/iter:0 ] 11.0µs ± 1% 10.9µs ± 1% -0.48% (p=0.001 n=18+20) BM_StreamingPingPongWithCoalescingApi/8/1/0 [polls/iter:0 ] 8.33µs ± 1% 8.30µs ± 1% -0.29% (p=0.025 n=19+19) BM_StreamingPingPongWithCoalescingApi/8/2/0 [polls/iter:0 ] 11.3µs ± 1% 11.3µs ± 1% -0.34% (p=0.011 n=19+19) BM_StreamingPingPongWithCoalescingApi/64/1/0 [polls/iter:0 ] 8.49µs ± 1% 8.41µs ± 1% -1.01% (p=0.000 n=20+20) BM_StreamingPingPongWithCoalescingApi/64/2/0 [polls/iter:0 ] 11.6µs ± 1% 11.5µs ± 1% -0.80% (p=0.000 n=19+19) BM_StreamingPingPongWithCoalescingApi/64/1/1 [polls/iter:0 ] 8.06µs ± 1% 8.02µs ± 1% -0.51% (p=0.016 n=20+19) BM_StreamingPingPongWithCoalescingApi/64/2/1 [polls/iter:0 ] 11.2µs ± 1% 11.1µs ± 1% -0.69% (p=0.000 n=19+18) BM_StreamingPingPongWithCoalescingApi/512/1/0 [polls/iter:0 ] 8.83µs ± 1% 8.71µs ± 1% -1.34% (p=0.000 n=20+19) BM_StreamingPingPongWithCoalescingApi/512/2/0 [polls/iter:0 ] 12.2µs ± 1% 12.1µs ± 1% -1.35% (p=0.000 n=19+17) BM_StreamingPingPongWithCoalescingApi/512/1/1 [polls/iter:0 ] 8.40µs ± 1% 8.32µs ± 1% -0.92% (p=0.000 n=20+20) BM_StreamingPingPongWithCoalescingApi/512/2/1 [polls/iter:0 ] 11.9µs ± 1% 11.7µs ± 1% -1.04% (p=0.000 n=19+20) BM_StreamingPingPongWithCoalescingApi/4096/1/0 [polls/iter:0 ] 11.1µs ± 1% 11.0µs ± 1% -0.53% (p=0.002 n=20+19) BM_StreamingPingPongWithCoalescingApi/4096/2/0 [polls/iter:0 ] 16.6µs ± 1% 16.5µs ± 2% -0.47% (p=0.023 n=20+20) BM_StreamingPingPongWithCoalescingApi/32768/1/1 [polls/iter:0 ] 29.6µs ± 1% 29.7µs ± 1% +0.36% (p=0.000 n=20+20) BM_StreamingPingPongWithCoalescingApi/2097152/1/0 [polls/iter:0 ] 1.59ms ± 9% 1.48ms ± 1% -7.02% (p=0.001 n=20+16) BM_StreamingPingPongWithCoalescingApi/2097152/2/0 [polls/iter:0 ] 3.17ms ± 9% 2.95ms ± 0% -6.86% (p=0.015 n=20+16) BM_StreamingPingPongWithCoalescingApi/2097152/1/1 [polls/iter:0 ] 1.59ms ± 9% 1.48ms ± 0% -7.01% (p=0.000 n=20+16) BM_StreamingPingPongWithCoalescingApi/2097152/2/1 [polls/iter:0 ] 3.17ms ± 9% 2.95ms ± 0% -6.91% (p=0.004 n=20+16) BM_StreamingPingPongWithCoalescingApi/134217728/2/1 [polls/iter:0 ] 512ms ± 2% 509ms ± 2% -0.74% (p=0.040 n=20+20) BM_StreamingPingPongWithCoalescingApi/0/1/0 [polls/iter:0 ] 8.07µs ± 1% 8.02µs ± 1% -0.62% (p=0.001 n=20+20) BM_StreamingPingPongWithCoalescingApi/0/2/0 [polls/iter:0 ] 10.9µs ± 1% 10.9µs ± 1% +0.30% (p=0.014 n=20+20) BM_StreamingPingPongWithCoalescingApi/1/1/0 [polls/iter:0 ] 8.27µs ± 1% 8.24µs ± 1% -0.40% (p=0.040 n=20+20) BM_StreamingPingPongWithCoalescingApi/1/1/1 [polls/iter:0 ] 7.86µs ± 1% 7.83µs ± 1% -0.36% (p=0.041 n=20+19) BM_StreamingPingPongWithCoalescingApi/1/2/1 [polls/iter:0 ] 11.0µs ± 1% 10.9µs ± 1% -0.50% (p=0.000 n=18+19) BM_StreamingPingPongWithCoalescingApi/8/2/1 [polls/iter:0 ] 10.9µs ± 1% 10.9µs ± 1% +0.28% (p=0.005 n=17+19) BM_StreamingPingPongWithCoalescingApi/64/1/0 [polls/iter:0 ] 8.43µs ± 1% 8.40µs ± 1% -0.40% (p=0.012 n=20+18) BM_StreamingPingPongWithCoalescingApi/64/2/1 [polls/iter:0 ] 11.2µs ± 1% 11.1µs ± 1% -0.42% (p=0.019 n=20+18) BM_StreamingPingPongWithCoalescingApi/512/1/0 [polls/iter:0 ] 8.77µs ± 1% 8.70µs ± 1% -0.91% (p=0.000 n=20+20) BM_StreamingPingPongWithCoalescingApi/512/2/0 [polls/iter:0 ] 12.2µs ± 1% 12.1µs ± 1% -0.71% (p=0.000 n=20+20) BM_StreamingPingPongWithCoalescingApi/512/2/1 [polls/iter:0 ] 11.8µs ± 1% 11.7µs ± 1% -0.78% (p=0.000 n=20+20) BM_StreamingPingPongWithCoalescingApi/4096/1/0 [polls/iter:0 ] 11.0µs ± 1% 11.0µs ± 1% -0.42% (p=0.030 n=20+20) BM_StreamingPingPongWithCoalescingApi/32768/1/1 [polls/iter:0 ] 29.4µs ± 1% 29.5µs ± 1% +0.35% (p=0.003 n=20+20) BM_StreamingPingPongWithCoalescingApi/2097152/1/0 [polls/iter:0 ] 1.60ms ± 8% 1.48ms ± 1% -7.63% (p=0.002 n=20+16) BM_StreamingPingPongWithCoalescingApi/2097152/2/0 [polls/iter:0 ] 3.20ms ± 8% 2.96ms ± 1% -7.49% (p=0.020 n=20+16) BM_StreamingPingPongWithCoalescingApi/2097152/1/1 [polls/iter:0 ] 1.60ms ± 8% 1.48ms ± 1% -7.68% (p=0.000 n=20+16) BM_StreamingPingPongWithCoalescingApi/2097152/2/1 [polls/iter:0 ] 3.20ms ± 9% 2.95ms ± 1% -7.76% (p=0.001 n=20+16) BM_StreamingPingPongWithCoalescingApi/134217728/1/0 [polls/iter:5.8 writes/iter:9.8 ] 255ms ± 2% 248ms ± 1% -2.59% (p=0.017 n=3+7) BM_StreamingPingPongWithCoalescingApi/134217728/2/1 [polls/iter:9.5 writes/iter:18.5 ] 523ms ± 1% 512ms ± 2% -2.09% (p=0.016 n=5+5) --- src/core/lib/transport/byte_stream.cc | 8 +++----- src/core/lib/transport/byte_stream.h | 3 +-- 2 files changed, 4 insertions(+), 7 deletions(-) diff --git a/src/core/lib/transport/byte_stream.cc b/src/core/lib/transport/byte_stream.cc index 16b85ca0db0..1dd234c3761 100644 --- a/src/core/lib/transport/byte_stream.cc +++ b/src/core/lib/transport/byte_stream.cc @@ -55,17 +55,15 @@ void SliceBufferByteStream::Orphan() { bool SliceBufferByteStream::Next(size_t max_size_hint, grpc_closure* on_complete) { - GPR_ASSERT(cursor_ < backing_buffer_.count); + GPR_DEBUG_ASSERT(backing_buffer_.count > 0); return true; } grpc_error* SliceBufferByteStream::Pull(grpc_slice* slice) { - if (shutdown_error_ != GRPC_ERROR_NONE) { + if (GPR_UNLIKELY(shutdown_error_ != GRPC_ERROR_NONE)) { return GRPC_ERROR_REF(shutdown_error_); } - GPR_ASSERT(cursor_ < backing_buffer_.count); - *slice = grpc_slice_ref_internal(backing_buffer_.slices[cursor_]); - ++cursor_; + *slice = grpc_slice_buffer_take_first(&backing_buffer_); return GRPC_ERROR_NONE; } diff --git a/src/core/lib/transport/byte_stream.h b/src/core/lib/transport/byte_stream.h index eff832515da..06503015d2c 100644 --- a/src/core/lib/transport/byte_stream.h +++ b/src/core/lib/transport/byte_stream.h @@ -99,9 +99,8 @@ class SliceBufferByteStream : public ByteStream { void Shutdown(grpc_error* error) override; private: - grpc_slice_buffer backing_buffer_; - size_t cursor_ = 0; grpc_error* shutdown_error_ = GRPC_ERROR_NONE; + grpc_slice_buffer backing_buffer_; }; //