[XdsClient] fix ubsan failure (#38726)

This fixes a ubsan failure introduced in #38698.  In xDS fallback, when a higher-priority server comes back online, we [remove the lower-priority channels from the authority state](686fc9dbeb/src/core/xds/xds_client/xds_client.cc (L535)), but unreffing the channels triggers a call to `MaybeRemoveUnsubscribedCacheEntriesForTypeLocked()`, which [attempts to access the list of channels](686fc9dbeb/src/core/xds/xds_client/xds_client.cc (L1788)) while we're in the process of modifying it.

Example failure:

https://btx.cloud.google.com/invocations/69469281-f334-4a1f-91b4-3eb8905b63f4/targets/%2F%2Ftest%2Fcpp%2Fend2end%2Fxds:xds_fallback_end2end_test@experiment%3Dno_server_listener;config=79d74d08dc8cd749c211bbf112a92eee46adbf3f6203bc27099b142ea4e7aac9/log

Closes #38726

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/38726 from markdroth:xds_client_unsubscribe_resubscribe_race_fix 207c37c31f
PiperOrigin-RevId: 726322804
Mark D. Roth 1 week ago committed by Copybara-Service
parent 81e51de2e4
commit 822f9b1519
1 changed file, 7 lines changed:
  src/core/xds/xds_client/xds_client.cc

@@ -532,6 +532,13 @@ void XdsClient::XdsChannel::SetHealthyLocked() {
           << "[xds_client " << xds_client_.get() << "] authority " << authority
           << ": Falling forward to " << server_.server_uri();
       // Lower priority channels are no longer needed, connection is back!
+      // Note that we move the lower priority channels out of the vector
+      // before we unref them, or else
+      // MaybeRemoveUnsubscribedCacheEntriesForTypeLocked() will try to
+      // access the vector while we are modifying it.
+      std::vector<RefCountedPtr<XdsChannel>> channels_to_unref(
+          std::make_move_iterator(channel_it + 1),
+          std::make_move_iterator(channels.end()));
       channels.erase(channel_it + 1, channels.end());
     }
   }
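
The move-then-erase pattern in the hunk above can be sketched in isolation. This is a minimal illustration, not gRPC code: `Channel`, `Registry`, and `DropLowerPriority` are hypothetical names, and `std::shared_ptr` stands in for `RefCountedPtr`. The `Channel` destructor reads the owning vector, playing the role that `MaybeRemoveUnsubscribedCacheEntriesForTypeLocked()` plays in the real code.

```cpp
#include <cstddef>
#include <iterator>
#include <memory>
#include <utility>
#include <vector>

struct Registry;

// Hypothetical stand-in for a ref-counted channel whose destruction
// calls back into the container that owns it.
struct Channel {
  explicit Channel(Registry* r) : registry(r) {}
  ~Channel();
  Registry* registry;
};

struct Registry {
  ~Registry() {
    // Same pattern on teardown: empty the member vector first, then let
    // the local copy release the channels while both members are alive.
    std::vector<std::shared_ptr<Channel>> doomed = std::move(channels);
    channels.clear();
  }
  std::vector<std::size_t> sizes_seen_in_dtor;
  std::vector<std::shared_ptr<Channel>> channels;
};

Channel::~Channel() {
  // Re-entrant read of the owning vector. This is only safe if the
  // vector is not in the middle of erase() when the destructor runs.
  registry->sizes_seen_in_dtor.push_back(registry->channels.size());
}

// Drops every channel after index `keep` (assumed to be a valid index).
void DropLowerPriority(Registry& reg, std::size_t keep) {
  auto first = reg.channels.begin() + static_cast<std::ptrdiff_t>(keep) + 1;
  // Move the doomed references into a local vector. The slots left
  // behind in reg.channels hold null pointers.
  std::vector<std::shared_ptr<Channel>> to_unref(
      std::make_move_iterator(first),
      std::make_move_iterator(reg.channels.end()));
  // erase() now only destroys null pointers, so no Channel destructor
  // runs while the vector is being modified.
  reg.channels.erase(first, reg.channels.end());
  // to_unref goes out of scope here. The Channel destructors run now
  // and observe a fully updated vector.
}
```

Without the local `to_unref` vector, `erase()` itself would release the last references, and each destructor would read `reg.channels` while it was being modified, which is the kind of access the commit message describes.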
