@labbott labbott commented Jan 27, 2026

No description provided.

@labbott labbott requested a review from andrewjstone January 27, 2026 17:23
@labbott labbott force-pushed the labbott/measurement_diagnose branch from 6c9afd8 to b73423d Compare January 27, 2026 17:27
@labbott labbott force-pushed the labbott/measurement_diagnose branch from b73423d to d5b04c8 Compare January 27, 2026 21:28
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at https://mozilla.org/MPL/2.0/.

//! Diagnose problems with reference measurements on a specific sled
Contributor

Are you expecting to copy this over to the global zone to run? Presumably this is a separate executable because it requires using IPCC.

I think since this is helpful for both test and production systems, it would be most useful to integrate it into omdb and not have to worry about loading it at support time. Then it could also be used as part of support bundle gathering. If I were doing that I'd provide a sled-agent lockstep API and then allow calling that from omdb for any sled-agent.

Contributor Author

I initially started writing this as part of omdb. It looked to me like omdb is designed to be run from the switch zone to diagnose problems with specific sleds by calling API functions. The current sled-agent API also returns only file paths, which are useful only on the sled they came from. The full diagnosis also requires IPCC, which in turn requires running on a specific sled.

I agree it could be useful to have this information in a support bundle, but if I'm understanding what you're proposing, we'd need an API to call some of this functionality. I'm wary of exposing some of this directly in sled-agent, unless I misunderstood what you were suggesting. Or can I just put this in omdb with the expectation that it will get run from a sled's global zone?

Contributor

Yeah, omdb runs from the switch zone. I was indeed saying that we should provide a sled-agent API on the lockstep (unversioned) server for providing this information. I don't think there's anything confidential here, so if we expect this to be a long-lived tool, I'd consider trying something like that.

However, now that I write this, I don't think this can be a long-lived tool as currently written. It works now because we start the sled-agent regardless of whether attestation succeeds. Once we gate sprockets connections on attestation, we won't be able to start sled-agent, and won't be able to access its servers on the underlay network, lockstep or versioned. We'd instead need some sort of bootstrap agent API for getting inventory. If we wanted to build this to be used by omdb, we'd need yet another bootstrap API. This seems somewhat wrong to me, but it is possible since the switch zone can access the bootstrap network.

With all that said, it may be that this is a short-lived tool until we start gating sprockets connections on attestation. If that is the case, then yeah, it's not worth the extra effort to move this into omdb. If it's a longer-lived tool to debug problems with sprockets, we may also want to keep it out of omdb, but it will also likely have to be changed to look at the filesystem or something other than the sled-agent API.
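
To make the reachability argument above a bit more concrete, here is a rough sketch of how I'm thinking about it. Everything in it is hypothetical: `SledState`, `Endpoint`, and the method names are made up for illustration and are not omicron types, and the bootstrap agent inventory API it mentions does not exist today. It only models the claim that, once connections are gated on attestation, a sled that fails attestation has no sled-agent servers to talk to, while the bootstrap network stays reachable from the switch zone.

```rust
// Illustrative model only. None of these types exist in omicron; they are
// assumptions made up to restate the argument in this thread.

/// Places a diagnostic tool in the switch zone could try to talk to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Endpoint {
    /// sled-agent servers on the underlay network (lockstep or versioned).
    SledAgentUnderlay,
    /// A bootstrap agent API on the bootstrap network. Assumed: an API
    /// exposing this information would still have to be written.
    BootstrapAgent,
}

/// Hypothetical summary of one sled's state.
#[derive(Debug, Clone, Copy)]
struct SledState {
    /// Whether attestation succeeded for this sled.
    attestation_ok: bool,
    /// Whether sprockets connections are gated on attestation.
    gate_on_attestation: bool,
}

impl SledState {
    /// Today sled-agent starts regardless of attestation. With gating
    /// enabled, it only starts when attestation succeeds.
    fn sled_agent_running(&self) -> bool {
        !self.gate_on_attestation || self.attestation_ok
    }

    /// Endpoints reachable from the switch zone. The bootstrap network is
    /// always listed because the switch zone can access it.
    fn reachable_from_switch_zone(&self) -> Vec<Endpoint> {
        let mut endpoints = vec![Endpoint::BootstrapAgent];
        if self.sled_agent_running() {
            endpoints.push(Endpoint::SledAgentUnderlay);
        }
        endpoints
    }
}

fn main() {
    // Current behavior: no gating, so a sled with failed attestation still
    // has a sled-agent that can be queried.
    let today = SledState { attestation_ok: false, gate_on_attestation: false };
    println!("today: {:?}", today.reachable_from_switch_zone());

    // Future behavior: gating enabled, attestation failed, no sled-agent.
    let gated = SledState { attestation_ok: false, gate_on_attestation: true };
    println!("gated: {:?}", gated.reachable_from_switch_zone());
}
```

The case the tool exists for (attestation failing) is exactly the case where the sled-agent path disappears under gating, which is why a long-lived version would have to read from the filesystem or a bootstrap-level API instead.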

4 participants