[chttp2] Fix ref-counting bug in Chttp2ServerListener around shutdown (#37225)

Noticed on a Core End2End test failure https://btx.cloud.google.com/invocations/dc3bf84d-e6ed-4b32-a24c-12489f981e46/targets/%2F%2Ftest%2Fcore%2Fend2end:cancel_with_status_test@poller%3Depoll1;config=56f5b09615e325097b100b58c41171656571290519a83c5d89a6067ef0283d46/log

```
F0000 00:00:1721017820.001684      87 tcp_server_posix.cc:354] Check failed: !s->shutdown
*** Check failure stack trace: ***
    @     0x7f32578da0e4  absl::lts_20240116::log_internal::LogMessage::SendToLog()
    @     0x7f32578d9a94  absl::lts_20240116::log_internal::LogMessage::Flush()
    @     0x7f32578da589  absl::lts_20240116::log_internal::LogMessageFatal::~LogMessageFatal()
    @     0x7f3257e340a1  tcp_server_unref()
    @     0x7f3258fcba8e  grpc_core::Chttp2ServerListener::ActiveConnection::~ActiveConnection()
    @     0x7f3258fd19e7  grpc_event_engine::experimental::MemoryAllocator::New<>()::Wrapper::~Wrapper()
    @     0x7f3258fcc998  grpc_core::Chttp2ServerListener::OnAccept()
    @     0x7f3257e34962  absl::lts_20240116::internal_any_invocable::LocalInvoker<>()
    @     0x7f3257da6475  grpc_event_engine::experimental::PosixEngineListenerImpl::AsyncConnectionAcceptor::NotifyOnAccept()::$_1::operator()()
    @     0x7f3257da4437  grpc_event_engine::experimental::PosixEngineListenerImpl::AsyncConnectionAcceptor::NotifyOnAccept()
    @     0x7f3257da5fef  absl::lts_20240116::base_internal::Callable::Invoke<>()
    @     0x7f3257dca50a  grpc_event_engine::experimental::PosixEngineClosure::Run()
    @     0x7f3257c9013e  grpc_event_engine::experimental::WorkStealingThreadPool::ThreadState::Step()
    @     0x7f3257c8fe48  grpc_event_engine::experimental::WorkStealingThreadPool::ThreadState::ThreadBody()
    @     0x7f3257c906df  grpc_event_engine::experimental::WorkStealingThreadPool::WorkStealingThreadPoolImpl::StartThread()::$_0::__invoke()
    @     0x7f32579a106c  grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix()::{lambda()#1}::__invoke()
    @     0x7f3257358609  start_thread
```

https://github.com/grpc/grpc/pull/36563 changed the refcounting mechanism incorrectly and we ended up taking a ref on the tcp server outside the critical region, resulting in a time-of-check-to-time-of-use bug, where we could end up reffing the tcp server when it is already 0, i.e., when the listener has already been shutdown. This results in an attempt to destroy the tcp server twice and an eventual crash.

Closes #37225

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37225 from yashykt:FixChttp2Bug bc1e8dfd34
PiperOrigin-RevId: 654850991
pull/37281/head
Yash Tibrewal 4 months ago committed by Copybara-Service
parent b1896a2136
commit 7c8402cafb
  1. 19
      src/core/ext/transport/chttp2/server/chttp2_server.cc

@ -616,9 +616,6 @@ void Chttp2ServerListener::ActiveConnection::Start(
RefCountedPtr<Chttp2ServerListener> listener,
OrphanablePtr<grpc_endpoint> endpoint, const ChannelArgs& args) {
listener_ = std::move(listener);
if (listener_->tcp_server_ != nullptr) {
grpc_tcp_server_ref(listener_->tcp_server_);
}
RefCountedPtr<HandshakingState> handshaking_state_ref;
{
MutexLock lock(&mu_);
@ -869,16 +866,20 @@ void Chttp2ServerListener::OnAccept(void* arg, grpc_endpoint* tcp,
// connection manager has changed.
if (!self->shutdown_ && self->is_serving_ &&
connection_manager == self->connection_manager_) {
// This ref needs to be taken in the critical region after having made
// sure that the listener has not been Orphaned, so as to avoid
// heap-use-after-free issues where `Ref()` is invoked when the ref of
// tcp_server_ has already reached 0. (Ref() implementation of
// Chttp2ServerListener is grpc_tcp_server_ref().)
// The ref for both the listener and tcp_server need to be taken in the
// critical region after having made sure that the listener has not been
// Orphaned, so as to avoid heap-use-after-free issues where `Ref()` is
// invoked when the listener is already shutdown. Note that the listener
// holds a ref to the tcp_server but this ref is given away when the
// listener is orphaned (shutdown).
if (self->tcp_server_ != nullptr) {
grpc_tcp_server_ref(self->tcp_server_);
}
listener_ref = self->RefAsSubclass<Chttp2ServerListener>();
self->connections_.emplace(connection.get(), std::move(connection));
}
}
if (connection == nullptr) {
if (connection == nullptr && listener_ref != nullptr) {
connection_ref->Start(std::move(listener_ref), std::move(endpoint), args);
}
}

Loading…
Cancel
Save