Add resyncPeriod option for drift detection #658

eshulman2 · 2026-01-28T14:40:42Z

This change introduces a new resyncPeriod field in ManagedOptions that enables periodic reconciliation of resources to detect and correct drift from the desired state in OpenStack.

Motivation: Resources managed by ORC can drift from their desired state due to external modifications in OpenStack. Without drift detection, these changes go unnoticed until the next spec change triggers reconciliation. The resyncPeriod option allows users to configure periodic checks to detect and remediate such drift automatically.

Implementation:

- Add resyncPeriod field to ManagedOptions API
- Add GetResyncPeriod() helper method for safe access with nil handling
- Modify shouldReconcile() in the generic controller to check if enough time has passed since the last successful reconciliation
- Schedule next resync when periodic resync is enabled and no other reschedule is pending
- Update all CRDs, OpenAPI schema, and documentation

Usage:

spec:
  managedOptions:
    resyncPeriod: 60m  # Reconcile every hour to detect drift

If not specified, the default resync period is 10h. If set to 0, periodic resync is disabled.

Closes: #655

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

mandre

That's great; it looks relatively straightforward to implement. Some points to consider:

- we'll need docs, and highly visible warnings about the side effects of enabling drift detection (higher loads on the cloud).
- we'll want tests as well
- we probably want the controller to accept a CLI flag, to set a global resync period
- for configuring a per-Kind resync period (say I want to force a resync of my networks every hour, but resync floating IPs more frequently), we'll have to think about a solution. Perhaps this solution is KRO.
- we should have a strategy for when the managed resource was updated out-of-band, but ORC does not have the ability to revert it to the expected state (the field is immutable in ORC for example)
- what happens when a managed resource is deleted in OpenStack?
- many more questions

Ideally, we'll discuss all these points in a design document.

This change introduces a new `resyncPeriod` field in `ManagedOptions` that enables periodic reconciliation of resources to detect and correct drift from the desired state in OpenStack.

Motivation: Resources managed by ORC can drift from their desired state due to external modifications in OpenStack. Without drift detection, these changes go unnoticed until the next spec change triggers reconciliation. The resyncPeriod option allows users to configure periodic checks to detect and remediate such drift automatically.

Implementation:

- Add `resyncPeriod` field (metav1.Duration) to ManagedOptions API
- Add GetResyncPeriod() helper method for safe access with nil handling
- Modify shouldReconcile() in the generic controller to check if enough time has passed since the last successful reconciliation
- Schedule next resync when periodic resync is enabled and no other reschedule is pending
- Update all CRDs, OpenAPI schema, and documentation

Usage:

```yaml
spec:
  managedOptions:
    resyncPeriod: 1h  # Reconcile every hour to detect drift
```

If not specified or set to 0, periodic resync is disabled (default behavior unchanged).

Closes: k-orc#655

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

eshulman2 · 2026-01-28T15:32:27Z

I'll try to reply to some of these right now:

we'll need docs, and highly visible warnings about the side effects of enabling drift detection (higher loads on the cloud).

Added a warning in docs

we'll want tests as well

We need to think about how we want to test it, but I definitely agree we should figure out a way to gate it. For now, as a draft, I have tested it locally on my computer.

we should have a strategy for when the managed resource was updated out-of-band, but ORC does not have the ability to revert it to the expected state (the field is immutable in ORC for example)

I think in this case it just won't fix the drift, but I believe it also won't trigger another update, because when updating we only compare the fields that we can change.
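
To illustrate the behaviour described here, the sketch below splits drifted fields into those an update can fix and those it cannot. It is a simplified model with string-valued fields and a hypothetical `classifyDrift` helper, not ORC's actual update logic.

```go
package main

import (
	"fmt"
	"sort"
)

// classifyDrift compares desired and observed field values and splits the
// drifted field names into those an update can fix (mutable fields) and
// those it cannot (immutable fields). Hypothetical helper for illustration.
func classifyDrift(desired, observed map[string]string, mutable map[string]bool) (fixable, unfixable []string) {
	for field, want := range desired {
		if observed[field] == want {
			continue // no drift on this field
		}
		if mutable[field] {
			fixable = append(fixable, field)
		} else {
			unfixable = append(unfixable, field)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(fixable)
	sort.Strings(unfixable)
	return fixable, unfixable
}

func main() {
	desired := map[string]string{"name": "net-a", "cidr": "10.0.0.0/24"}
	observed := map[string]string{"name": "renamed", "cidr": "10.0.1.0/24"}
	mutable := map[string]bool{"name": true}

	fixable, unfixable := classifyDrift(desired, observed, mutable)
	fmt.Println("fixable:", fixable)
	fmt.Println("unfixable:", unfixable)
}
```

In this model an update is only triggered when `fixable` is non-empty, so drift confined to immutable fields goes uncorrected without causing repeated update attempts. Surfacing the `unfixable` set to the user is left open here, as in the discussion.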

what happens when a managed resource is deleted in OpenStack?

This case would behave similarly to reconciling before the resource exists, and ORC will try to create it.
EDIT: it seems this is problematic as well.

dlaw4608

This is nice, I played around with it for a while locally, works great!!

dlaw4608 · 2026-01-28T17:18:04Z

internal/controllers/generic/reconciler/controller.go


 	if osResource == nil {
 		// Programming error: if we don't have a resource we should either have an error or be waiting on something
 		return reconcileStatus.WithError(fmt.Errorf("oResource is not set, but no wait events or error"))


Noticed a small typo, not from this PR: it should be "osResource is not set ....". Up to you, @eshulman2, if you want to correct it.

Recreate the resource in case it was deleted by something external to ORC. This solves the external deletion issue but does raise concerns: I am afraid that in case of split brain or another edge case, the resource will be created over and over, causing a catastrophic failure.

eshulman2 · 2026-01-28T17:47:02Z

@mandre I added a second commit to re-create the resource on external delete. I must say I'm a bit uncomfortable with the possible risk profile it might add in edge cases, but I'll leave it here for now for reference and discussion.
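
The decision this second commit adds can be modelled roughly as below. The names are hypothetical and the real controller logic is more involved. The sketch is only meant to show where the risk sits: the decision relies entirely on the "exists in OpenStack" answer, so a wrong or stale answer leads straight to another create.

```go
package main

import "fmt"

// action is the step a reconciliation pass would take. Hypothetical type.
type action string

const (
	actionCreate   action = "create"
	actionRecreate action = "recreate"
	actionUpdate   action = "update"
)

// decide models the choice made during reconciliation.
//   - no recorded ID: the resource was never created, so create it
//   - recorded ID but missing in OpenStack: treated as an external deletion
//   - otherwise: the resource exists, so check it for drift and update
func decide(recordedID string, existsInCloud bool) action {
	switch {
	case recordedID == "":
		return actionCreate
	case !existsInCloud:
		return actionRecreate
	default:
		return actionUpdate
	}
}

func main() {
	fmt.Println(decide("", false))
	fmt.Println(decide("abc123", false))
	fmt.Println(decide("abc123", true))
}
```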

github-actions bot added the semver:minor Backwards-compatible change label Jan 28, 2026

mandre reviewed Jan 28, 2026

mandre marked this pull request as draft January 28, 2026 15:08

eshulman2 force-pushed the drift_detect branch from 1e62aa9 to 1db80df Compare January 28, 2026 15:13

eshulman2 mentioned this pull request Jan 28, 2026

Drift detection #655

Open

eshulman2 force-pushed the drift_detect branch from 1db80df to d38ac01 Compare January 28, 2026 15:27

dlaw4608 reviewed Jan 28, 2026

3 participants
