mirror of https://github.com/grpc/grpc.git
[chttp2] Fix ref-counting bug in Chttp2ServerListener around shutdown (#37225)
Noticed on a Core End2End test failure https://btx.cloud.google.com/invocations/dc3bf84d-e6ed-4b32-a24c-12489f981e46/targets/%2F%2Ftest%2Fcore%2Fend2end:cancel_with_status_test@poller%3Depoll1;config=56f5b09615e325097b100b58c41171656571290519a83c5d89a6067ef0283d46/log
```
F0000 00:00:1721017820.001684 87 tcp_server_posix.cc:354] Check failed: !s->shutdown
*** Check failure stack trace: ***
@ 0x7f32578da0e4 absl::lts_20240116::log_internal::LogMessage::SendToLog()
@ 0x7f32578d9a94 absl::lts_20240116::log_internal::LogMessage::Flush()
@ 0x7f32578da589 absl::lts_20240116::log_internal::LogMessageFatal::~LogMessageFatal()
@ 0x7f3257e340a1 tcp_server_unref()
@ 0x7f3258fcba8e grpc_core::Chttp2ServerListener::ActiveConnection::~ActiveConnection()
@ 0x7f3258fd19e7 grpc_event_engine::experimental::MemoryAllocator::New<>()::Wrapper::~Wrapper()
@ 0x7f3258fcc998 grpc_core::Chttp2ServerListener::OnAccept()
@ 0x7f3257e34962 absl::lts_20240116::internal_any_invocable::LocalInvoker<>()
@ 0x7f3257da6475 grpc_event_engine::experimental::PosixEngineListenerImpl::AsyncConnectionAcceptor::NotifyOnAccept()::$_1::operator()()
@ 0x7f3257da4437 grpc_event_engine::experimental::PosixEngineListenerImpl::AsyncConnectionAcceptor::NotifyOnAccept()
@ 0x7f3257da5fef absl::lts_20240116::base_internal::Callable::Invoke<>()
@ 0x7f3257dca50a grpc_event_engine::experimental::PosixEngineClosure::Run()
@ 0x7f3257c9013e grpc_event_engine::experimental::WorkStealingThreadPool::ThreadState::Step()
@ 0x7f3257c8fe48 grpc_event_engine::experimental::WorkStealingThreadPool::ThreadState::ThreadBody()
@ 0x7f3257c906df grpc_event_engine::experimental::WorkStealingThreadPool::WorkStealingThreadPoolImpl::StartThread()::$_0::__invoke()
@ 0x7f32579a106c grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix()::{lambda()#1}::__invoke()
@ 0x7f3257358609 start_thread
```
https://github.com/grpc/grpc/pull/36563 changed the refcounting mechanism incorrectly and we ended up taking a ref on the tcp server outside the critical region, resulting in a time-of-check-to-time-of-use bug, where we could end up reffing the tcp server when it is already 0, i.e., when the listener has already been shutdown. This results in an attempt to destroy the tcp server twice and an eventual crash.
Closes #37225
COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/37225 from yashykt:FixChttp2Bug bc1e8dfd34
PiperOrigin-RevId: 654850991
pull/37281/head
parent
b1896a2136
commit
7c8402cafb
1 changed files with 10 additions and 9 deletions
Loading…
Reference in new issue