HDDS-14498. Zero Downtime Upgrade Design (ZDU)#9664
Open
sodonnel wants to merge 14 commits into apache:master
3 tasks
jojochuang
reviewed
Jan 28, 2026
| @@ -0,0 +1,523 @@ | |||
| --- | |||
| jira: HDDS-3331 | |||
Contributor
My local Hugo build failed to start until I added the `date` field to the front matter.
Suggested change
| jira: HDDS-3331 | |
| jira: HDDS-3331 | |
| date: 2026-01-23 |
It looks like it also needs these two fields:
Suggested change
| jira: HDDS-3331 | |
| title: Zero Downtime Upgrade (ZDU) | |
| summary: New and improved framework to allow rolling upgrade without cluster downtime. |
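The two suggestions above amount to a front matter block that carries `jira`, `date`, `title`, and `summary`. A minimal sketch of a pre-commit style check for those keys follows. The required-key list is an assumption taken from these review comments, not from the Hugo or Ozone site configuration, and the function names are hypothetical. The parser only handles flat `key: value` lines, which is all the quoted front matter uses.

```python
# Assumed required keys, inferred from the review comments above
# (not taken from any Hugo or Ozone site configuration).
REQUIRED_KEYS = ("jira", "date", "title", "summary")


def front_matter_keys(text: str) -> set:
    """Return the top-level keys found between the leading '---' delimiters."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()  # no front matter block at the top of the file
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter reached
        key, sep, _ = line.partition(":")
        # Skip indented continuation lines and comments.
        if sep and key.strip() and not line.startswith((" ", "\t", "#")):
            keys.add(key.strip())
    return keys


def missing_keys(text: str) -> list:
    """List the assumed-required keys that the front matter lacks."""
    present = front_matter_keys(text)
    return [k for k in REQUIRED_KEYS if k not in present]
```

Run against the front matter as originally submitted (only `jira` present), `missing_keys` would report `date`, `title`, and `summary`.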
ptlrs
reviewed
Jan 29, 2026
Comment on lines +171 to +175
| 6. The finalize command is sent to SCM by the admin - this is what is used to switch the cluster to act as the new version. Upon receipt of the finalize command: | ||
| 7. SCM will finalize itself over Ratis, saving the new finalized version. | ||
| 8. It will notify datanodes over the heartbeat to finalize. | ||
| 9. After all healthy datanodes have been finalized, OM can be finalized. To do this, OM will have been polling SCM periodically to see if it should finalize. Only after SCM and all datanodes have been finalized will OM get a “ready to finalize” response from the poll. The OM leader will then send a finalize command over Ratis to all OMs. | ||
| 10. As OM is the entry point to the cluster for external clients, finalizing OM unlocks any new features in the upgraded version. |
Contributor
These don't render correctly
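Independent of the rendering issue, the ordering in the quoted steps 6 to 10 can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class and method names (`Scm`, `Datanode`, `Om`, `poll_scm`, and so on) are hypothetical and are not Ozone APIs, plain method calls stand in for Ratis replication and for heartbeats, and a single `Om` object stands in for the OM leader and its followers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Datanode:
    name: str
    healthy: bool = True
    finalized: bool = False

    def on_heartbeat_response(self, finalize: bool) -> None:
        # Step 8: a datanode finalizes when SCM tells it to via the heartbeat.
        # An unhealthy node is modeled as never receiving the response.
        if finalize and self.healthy:
            self.finalized = True


@dataclass
class Scm:
    datanodes: List[Datanode]
    finalized: bool = False

    def finalize(self) -> None:
        # Steps 6-7: the admin's finalize command; this assignment stands in
        # for SCM finalizing itself over Ratis.
        self.finalized = True

    def process_heartbeats(self) -> None:
        # Step 8: notify datanodes once SCM itself is finalized.
        for dn in self.datanodes:
            dn.on_heartbeat_response(finalize=self.finalized)

    def ready_to_finalize_om(self) -> bool:
        # Step 9: only SCM plus all *healthy* datanodes gate OM finalization.
        healthy = [dn for dn in self.datanodes if dn.healthy]
        return self.finalized and all(dn.finalized for dn in healthy)


@dataclass
class Om:
    scm: Scm
    finalized: bool = False

    def poll_scm(self) -> None:
        # Steps 9-10: OM polls SCM and finalizes only after a "ready" answer.
        if not self.finalized and self.scm.ready_to_finalize_om():
            self.finalized = True
```

In this model, polling before `scm.finalize()` or before the datanodes have processed a heartbeat leaves OM unfinalized, and an unhealthy datanode does not block OM, which matches the "all healthy datanodes" wording in step 9.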


What changes were proposed in this pull request?
This is a design document for Ozone Zero Downtime Upgrade (ZDU).
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-14498
How was this patch tested?
N/A