-
Notifications
You must be signed in to change notification settings - Fork 254
CLDSRV-836 Fix deep healthcheck fail status #6055
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: development/9.2
Are you sure you want to change the base?
CLDSRV-836 Fix deep healthcheck fail status #6055
Conversation
Now fail if ALL backends of only one client fails. Previously, the deep healthcheck would fail if ALL backends/locations failed globally across all clients (data, metadata, vault, kms). This change modifies the logic to fail if ANY client has ALL its backends/locations failing. This ensures: 1. For data backend with multiple sproxyd location constraints: - Returns HTTP 200 if at least ONE location is healthy - Returns HTTP 500 only if ALL locations fail 2. Each client (data, metadata, vault, kms) is evaluated independently - If ALL locations of the data client fail, overall check fails - If ALL locations of metadata fail, overall check fails - etc. The new logic uses: - `results.some()` to check across clients - `keys.every()` within each client to check all its locations
Hello bourgoismickael,My role is to assist you with the merge of this Available options
Available commands
Status report is not available. |
Request integration branchesWaiting for integration branch creation to be requested by the user. To request integration branches, please comment on this pull request with the following command: Alternatively, the |
Codecov Report❌ Patch coverage is
Additional details and impacted files
... and 2 files with indirect coverage changes @@ Coverage Diff @@
## development/9.2 #6055 +/- ##
===================================================
- Coverage 84.41% 84.38% -0.03%
===================================================
Files 206 206
Lines 13016 13019 +3
===================================================
- Hits 10987 10986 -1
- Misses 2029 2033 +4
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes the deep healthcheck logic to fail when ANY client has ALL its backends/locations failing, rather than only failing when ALL backends across ALL clients fail. This ensures better detection of client-specific failures, particularly for multi-location data backends.
Changes:
- Modified the failure detection logic from checking all backends globally to checking each client independently
- Added empty client handling to skip clients with no backends
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const fail = results.some(clientResult => { | ||
| const keys = Object.keys(clientResult); | ||
| // If this client has no keys, skip it | ||
| if (keys.length === 0) { | ||
| return false; | ||
| } | ||
| // Check if ALL backends/locations of this client have errors | ||
| return keys.every(k => | ||
| // if there is an error from an external backend, | ||
| // only return a 500 if it is on startup | ||
| // (flightCheckOnStartUp set to true) | ||
| clientResult[k].error && (flightCheckOnStartUp || !clientResult[k].external) | ||
| ); | ||
| }); |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR introduces a significant change in the healthcheck failure logic - now failing when ANY client has ALL its backends failing (instead of only when ALL backends across ALL clients fail). This new behavior should be covered by tests to ensure it works as expected. Consider adding test cases for scenarios such as:
- All backends of the data client failing while metadata client is healthy (should fail)
- One data backend location failing while another is healthy and metadata is healthy (should succeed)
- All backends of multiple clients failing (should fail)
Now fail if ALL backends of only one client fails.
Previously, the deep healthcheck would fail if ALL backends/locations
failed globally across all clients (data, metadata, vault, kms).
This change modifies the logic to fail if ANY client has ALL its
backends/locations failing. This ensures:
For data backend with multiple sproxyd location constraints:
Each client (data, metadata, vault, kms) is evaluated independently
The new logic uses:
results.some()to check across clientskeys.every()within each client to check all its locations