mirror of https://github.com/grpc/grpc.git
[PSM Interop] Increase k8s startup probe total time (#32875)
Previously, we didn't configure `failureThreshold`, so it used its default value. The final `startupProbe` looked like this:

```json
{
  "startupProbe": {
    "failureThreshold": 3,
    "periodSeconds": 3,
    "successThreshold": 1,
    "tcpSocket": {
      "port": 8081
    },
    "timeoutSeconds": 1
  }
}
```

Because of this, the total time before k8s killed the container was 3 attempts (`failureThreshold`) * 3 seconds between probes (`periodSeconds`) = 9 seconds total (±3 seconds waiting for the probe response).

This greatly affected the PSM Security test server, some implementations of which wait for the ADS stream to be configured before they start listening on the maintenance port. As a result, the server container was killed ~7 times before a successful startup:

```
15:55:08.875586 "Killing container with a grace period"
15:53:38.875812 "Killing container with a grace period"
15:52:47.875752 "Killing container with a grace period"
15:52:38.874696 "Killing container with a grace period"
15:52:14.874491 "Killing container with a grace period"
15:52:05.875400 "Killing container with a grace period"
15:51:56.876138 "Killing container with a grace period"
```

These extra delays led to PSM security tests timing out.

ref b/277336725
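The startup window arithmetic above can be sketched as a small calculation. This is only an illustration: the helper names are hypothetical, the 60-second target is an assumed figure rather than the value chosen in this change, and the estimate ignores `timeoutSeconds` and any initial delay.

```python
def startup_budget_seconds(failure_threshold: int, period_seconds: int) -> int:
    """Approximate time a container has to pass its startup probe.

    Simplification: ignores timeoutSeconds and initialDelaySeconds.
    """
    return failure_threshold * period_seconds


def min_failure_threshold(required_seconds: int, period_seconds: int) -> int:
    """Smallest failureThreshold that gives at least required_seconds."""
    # Ceiling division without importing math.
    return -(-required_seconds // period_seconds)


# Values from the probe shown above: 3 failures * 3 seconds = 9 seconds.
default_budget = startup_budget_seconds(failure_threshold=3, period_seconds=3)
print(default_budget)

# Hypothetical target (an assumption, not taken from this commit): 60 seconds.
print(min_failure_threshold(required_seconds=60, period_seconds=3))
```

With the defaults, a server that needs longer than roughly 9 seconds to open its maintenance port gets restarted, which matches the repeated kills in the log.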
parent
a2c89d0b24
commit
f2a7f6d51b
4 changed files with 13 additions and 1 deletions