feat: add measured pods support to pod-scaler #4912
base: main
Conversation
Pipeline controller notification: For optional jobs, comment. This repository is configured in automatic mode.
Walkthrough: Adds deterministic "measured" pod sampling, labeling, CPU-request increase (capped per-workload via a node allocatable cache), and anti-affinity for measured pods; CLI flags to control measurement; metadata/reporting include measured and workload_class; the Prometheus metric is extended with those labels.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.
🔧 golangci-lint (2.5.0): /bin/bash: line 1: golangci-lint: command not found
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@cmd/pod-scaler/admission.go`:
- Around line 546-549: The conversion from CPU Quantity to cores is wrong:
  cpu.Value() already returns cores, so remove the division by 1000; update the
  cpuCores calculation in the block that reads cpu :=
  node.Status.Allocatable[corev1.ResourceCPU] and sets cpuCores and availableCores
  to either use cpuCores := cpu.Value() (for whole cores) or, if sub-core
  precision is needed, cpuCores := float64(cpu.MilliValue())/1000.0, then
  compute availableCores := cpuCores - 2 accordingly (a conversion sketch follows
  after this list).
- Around line 462-491: The function addMeasuredPodAntiaffinity currently appends
a hard constraint to
pod.Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution
which can make measured pods unschedulable; change this to add a
PreferredDuringSchedulingIgnoredDuringExecution rule instead: create a
corev1.WeightedPodAffinityTerm using the existing antiAffinityTerm and append it
to
pod.Spec.Affinity.PodAntiAffinity.PreferredDuringSchedulingIgnoredDuringExecution
(or make this configurable), ensuring you reference addMeasuredPodAntiaffinity,
antiAffinityTerm, RequiredDuringSchedulingIgnoredDuringExecution and
PreferredDuringSchedulingIgnoredDuringExecution when updating the code and
adjust the logger message accordingly (a soft-preference sketch follows after this list).
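For the Quantity conversion, a minimal self-contained sketch of the two options, using only the k8s.io/apimachinery resource package (the 3500m figure is just an example value, not taken from the PR):

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Example allocatable CPU of 3500m (3.5 cores).
	cpu := resource.MustParse("3500m")

	// Value() already returns cores, rounded up to a whole number: 4 here.
	wholeCores := cpu.Value()

	// MilliValue()/1000 keeps sub-core precision: 3.5 here.
	preciseCores := float64(cpu.MilliValue()) / 1000.0

	// Reserve two cores for system overhead, as in the reviewed block.
	availableCores := preciseCores - 2

	fmt.Println(wholeCores, preciseCores, availableCores) // 4 3.5 1.5
}
```

And for the anti-affinity suggestion, a hedged sketch of a soft-preference variant; the function name is hypothetical, the weight is illustrative, and antiAffinityTerm is assumed to be the term the existing code already builds:

```go
import corev1 "k8s.io/api/core/v1"

// addPreferredMeasuredPodAntiAffinity is a hypothetical soft-preference
// variant of addMeasuredPodAntiaffinity: measured pods prefer nodes without
// non-measured pods but remain schedulable when no such node exists.
func addPreferredMeasuredPodAntiAffinity(pod *corev1.Pod, antiAffinityTerm corev1.PodAffinityTerm) {
	if pod.Spec.Affinity == nil {
		pod.Spec.Affinity = &corev1.Affinity{}
	}
	if pod.Spec.Affinity.PodAntiAffinity == nil {
		pod.Spec.Affinity.PodAntiAffinity = &corev1.PodAntiAffinity{}
	}
	weighted := corev1.WeightedPodAffinityTerm{
		Weight:          100, // illustrative; the API accepts 1-100
		PodAffinityTerm: antiAffinityTerm,
	}
	pod.Spec.Affinity.PodAntiAffinity.PreferredDuringSchedulingIgnoredDuringExecution = append(
		pod.Spec.Affinity.PodAntiAffinity.PreferredDuringSchedulingIgnoredDuringExecution,
		weighted,
	)
}
```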
🧹 Nitpick comments (1)
cmd/pod-scaler/admission.go (1)
509-515: Consider non-blocking initial cache load. The initial cache.refresh() call blocks the admission server startup. If the node list API is slow or fails, this delays server readiness. Consider loading asynchronously and using the default cap until the first successful refresh.
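A hedged sketch of what the asynchronous variant could look like; the refresher interface, interval, and logging are assumptions for illustration, not the PR's actual cache type:

```go
import (
	"context"
	"time"

	"github.com/sirupsen/logrus"
)

type refresher interface {
	refresh() error
}

// startBackgroundRefresh loads the node allocatable cache in the background so
// admission server startup is not blocked; callers keep using the default cap
// until the first refresh succeeds.
func startBackgroundRefresh(ctx context.Context, cache refresher, interval time.Duration, logger *logrus.Entry) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			if err := cache.refresh(); err != nil {
				logger.WithError(err).Warn("node allocatable cache refresh failed; keeping previous or default values")
			}
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
			}
		}
	}()
}
```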
c8f13cc to b377146 (Compare)
Add --percentage-measured flag, CPU increase, anti-affinity rules, and database schema updates.
b377146 to 276bbee (Compare)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cmd/pod-scaler/admission.go`:
- Around line 328-337: The cache lookup fails because meta.Measured and
meta.WorkloadType are set before calling server.recommendedRequestFor(meta),
causing FullMetadata map-key mismatches; move the assignments to meta.Measured
and meta.WorkloadType so they occur after the call to
server.recommendedRequestFor(meta) (i.e., set Measured and WorkloadType only
when recommendationExists=true and just before calling useOursIfLarger),
ensuring server.recommendedRequestFor sees the original meta key values used in
the cache while later logic (determineWorkloadType, determineWorkloadName,
workloadClass, useOursIfLarger, containers[i].Resources) still has the needed
fields populated.
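The underlying pitfall is independent of this codebase: mutating a struct that is used as a map key before the lookup changes which entry you read. A self-contained toy illustrating it (the key type here is invented for the example, not the real FullMetadata):

```go
package main

import "fmt"

type cacheKey struct {
	Workload string
	Measured bool
}

func main() {
	recommendations := map[cacheKey]string{
		{Workload: "e2e-test", Measured: false}: "4 cores",
	}

	// Mutating the key before the lookup misses the historical entry.
	meta := cacheKey{Workload: "e2e-test"}
	meta.Measured = true
	_, found := recommendations[meta]
	fmt.Println("lookup after mutation:", found) // false

	// Looking up first, then tagging the metadata for reporting, finds it.
	meta = cacheKey{Workload: "e2e-test"}
	rec, found := recommendations[meta]
	meta.Measured = true // safe: the cache lookup already happened
	fmt.Println("lookup before mutation:", found, rec) // true 4 cores
}
```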
🧹 Nitpick comments (3)
cmd/pod-scaler/admission_test.go (1)
557-557: Test coverage for measured pods is missing. The existing tests are properly updated to pass `false, nil` for the new parameters, preserving existing behavior validation. However, there are no new tests covering the measured pod path (`isMeasured = true`), including:
- `shouldBeMeasured` deterministic behavior
- `markPodAsMeasured` label application
- `increaseCPUForMeasuredPod` CPU adjustment logic
- `addMeasuredPodAntiaffinity` affinity rules
Would you like me to generate test cases for the measured pod functionality? A starting sketch follows below.
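As a starting point, a hedged sketch of a test for the label-idempotency and zero-percentage paths, assuming `podMutator` can be constructed in-package with only `percentageMeasured` set and that corev1/metav1 are already imported in the test file:

```go
func TestShouldBeMeasuredHonorsExistingLabelAndZeroPercentage(t *testing.T) {
	m := &podMutator{percentageMeasured: 50}
	for _, tc := range []struct {
		name     string
		labels   map[string]string
		expected bool
	}{
		{name: "existing true label is honored", labels: map[string]string{measuredPodLabel: "true"}, expected: true},
		{name: "existing false label is honored", labels: map[string]string{measuredPodLabel: "false"}, expected: false},
	} {
		t.Run(tc.name, func(t *testing.T) {
			pod := &corev1.Pod{ObjectMeta: metav1.ObjectMeta{Labels: tc.labels}}
			if got := m.shouldBeMeasured(pod); got != tc.expected {
				t.Errorf("shouldBeMeasured() = %v, want %v", got, tc.expected)
			}
		})
	}

	// With a zero percentage, no pod should ever be marked as measured.
	zero := &podMutator{percentageMeasured: 0}
	if zero.shouldBeMeasured(&corev1.Pod{}) {
		t.Error("expected no measurement when percentageMeasured is 0")
	}
}
```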
pkg/results/report_test.go (1)
271-271: LGTM! The test correctly passes the new parameters. Consider adding a test case that validates the JSON payload when `measured=true` and `workloadClass` is non-empty to ensure the new fields serialize correctly.
pkg/results/report.go (1)
100-108: Inconsistent JSON tags on struct fields. The existing fields (`WorkloadName`, `WorkloadType`, etc.) lack explicit JSON tags while the new fields have them. This works due to Go's default JSON marshaling behavior, but explicit tags would improve consistency and clarity.

♻️ Optional: Add explicit JSON tags for consistency

```diff
 type PodScalerRequest struct {
-	WorkloadName     string
-	WorkloadType     string
-	ConfiguredAmount string
-	DeterminedAmount string
-	ResourceType     string
+	WorkloadName     string `json:"WorkloadName"`
+	WorkloadType     string `json:"WorkloadType"`
+	ConfiguredAmount string `json:"ConfiguredAmount"`
+	DeterminedAmount string `json:"DeterminedAmount"`
+	ResourceType     string `json:"ResourceType"`
 	Measured      bool   `json:"measured,omitempty"`
 	WorkloadClass string `json:"workload_class,omitempty"`
 }
```
/test images

Scheduling required tests:
f5ff14e to 6cd3822 (Compare)
…cpu-increase flag
6cd3822 to 31b5f00 (Compare)
Scheduling required tests:
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cmd/pod-scaler/admission.go`:
- Around line 445-465: The sampling uses math/rand without seeding so randomness
repeats across restarts; seed the RNG once at startup (e.g., in main or package
init) with a time-based seed instead of seeding inside shouldBeMeasured. Ensure
rand.Seed(time.Now().UnixNano()) is called a single time during process
initialization (not inside podMutator.shouldBeMeasured or per-invocation),
leaving measuredPodLabel, podMutator, percentageMeasured and the existing
rand.Float64() call unchanged.
```go
const (
	measuredPodLabel = "pod-scaler.openshift.io/measured"
	ciWorkloadLabel  = "ci-workload"
)

// shouldBeMeasured determines if a pod should be marked as measured based on the configured percentage
// Uses true randomness so all pods have a chance to be measured
func (m *podMutator) shouldBeMeasured(pod *corev1.Pod) bool {
	if m.percentageMeasured <= 0 {
		return false
	}
	// Check if label is already set (for webhook idempotency)
	if pod.Labels != nil {
		if value, exists := pod.Labels[measuredPodLabel]; exists {
			return value == "true"
		}
	}
	// Use true randomness - each pod gets a chance to be measured
	chance := rand.Float64() * 100
	return chance < m.percentageMeasured
}
```
Measured sampling is deterministic without RNG seeding.
math/rand defaults to a fixed seed, so pod sampling repeats across restarts and isn’t truly random. Seed once at startup if randomness is required.
✅ Seed once at startup
```diff
 func admit(port, healthPort int, certDir string, client buildclientv1.BuildV1Interface, loaders map[string][]*cacheReloader, mutateResourceLimits bool, cpuCap int64, memoryCap string, cpuPriorityScheduling int64, percentageMeasured float64, measuredPodCPUIncrease float64, reporter results.PodScalerReporter) {
 	logger := logrus.WithField("component", "pod-scaler admission")
+	rand.Seed(time.Now().UnixNano())
 	logger.Infof("Initializing admission webhook server with %d loaders.", len(loaders))
```
🤖 Prompt for AI Agents
In `@cmd/pod-scaler/admission.go` around lines 445 - 465, The sampling uses
math/rand without seeding so randomness repeats across restarts; seed the RNG
once at startup (e.g., in main or package init) with a time-based seed instead
of seeding inside shouldBeMeasured. Ensure rand.Seed(time.Now().UnixNano()) is
called a single time during process initialization (not inside
podMutator.shouldBeMeasured or per-invocation), leaving measuredPodLabel,
podMutator, percentageMeasured and the existing rand.Float64() call unchanged.
34d5c64 to 85b13a2 (Compare)
85b13a2 to 9ca0218 (Compare)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cmd/pod-scaler/admission.go`:
- Around line 569-576: Add unit tests in admission_test.go that cover the
measured-pod anti-affinity behavior: create a measured pod input and assert that
admission.Admit (or the function handling measured pods in admission.go) appends
antiAffinityTerm to
pod.Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution;
simulate node(s) with non-measured pods and verify the measured pod would be
unschedulable on those nodes (e.g., via predicate/scheduling helper or by
inspecting the anti-affinity term semantics), and assert the antiAffinityTerm's
LabelSelector matches the expected labels/selectors used in admission.go so the
selector logic is correct; include positive and negative cases to ensure the
rule is only applied to "measured" pods and contains the correct selector
fields.
🧹 Nitpick comments (3)
cmd/pod-scaler/admission.go (3)
339-374: Double cache lookup per container may impact performance. The code queries both measured and unmeasured recommendations for every container. While this ensures recommendations are found regardless of how historical data was stored, it doubles the lookup overhead. If this is intentional to handle the transition period, consider adding a comment explaining when this can be simplified to a single lookup.
627-649: Method name `getMaxCPUForWorkload` is misleading — it returns the minimum allocatable CPU. The code correctly finds the minimum allocatable CPU across nodes of each workload type (to ensure pods fit on the smallest node), but the method name suggests it returns the maximum. Consider renaming to `getMinAllocatableCPUForWorkload` or updating the comment to clarify that the "max CPU for a pod" is derived from the minimum allocatable across nodes.
641-644: Hard-coded 10-core cap may not suit all environments. The 10-core maximum is applied regardless of actual node capacity. If larger nodes are available, this artificially limits measured pod CPU. Consider making this configurable via a flag or deriving it from the actual minimum allocatable (see the sketch below).
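To make the derivation and the cap explicit, a hedged sketch that computes the per-workload value from the minimum allocatable CPU across nodes, with the ceiling passed in instead of hard-coded (function and parameter names are illustrative, not the PR's implementation):

```go
import corev1 "k8s.io/api/core/v1"

// minAllocatableCPUCores returns the smallest allocatable CPU, in cores,
// across the given nodes, minus a fixed system reserve and bounded by maxCores.
func minAllocatableCPUCores(nodes []corev1.Node, systemReserve, maxCores float64) float64 {
	min := maxCores
	for _, node := range nodes {
		cpu := node.Status.Allocatable[corev1.ResourceCPU]
		cores := float64(cpu.MilliValue())/1000.0 - systemReserve
		if cores < min {
			min = cores
		}
	}
	if min < 0 {
		return 0
	}
	return min
}
```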
```go
// Use required anti-affinity to force node scaling if no viable node exists
pod.Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution = append(
	pod.Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution,
	antiAffinityTerm,
)

logger.Debugf("Added anti-affinity for measured pod against non-measured pods")
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Check if there's documentation or tests explaining the anti-affinity behavior
rg -n "RequiredDuringScheduling|force.*node.*scaling|measured.*anti-affinity" --type go cmd/pod-scaler/ -B 2 -A 2
```

Repository: openshift/ci-tools
Length of output: 1700

🏁 Script executed:

```bash
#!/bin/bash
# Look for test files related to admission or anti-affinity
fd "admission" --type f cmd/pod-scaler/ | head -20
```

Repository: openshift/ci-tools
Length of output: 122

🏁 Script executed:

```bash
#!/bin/bash
# Check the broader context of the addMeasuredPodAntiaffinity function
cat -n cmd/pod-scaler/admission.go | sed -n '545,580p'
```

Repository: openshift/ci-tools
Length of output: 1615

🏁 Script executed:

```bash
#!/bin/bash
# Check test coverage for anti-affinity behavior
rg -n "addMeasuredPodAntiaffinity|RequiredDuringScheduling|anti.*affinity" cmd/pod-scaler/admission_test.go -B 2 -A 5
```

Repository: openshift/ci-tools
Length of output: 44

🏁 Script executed:

```bash
#!/bin/bash
# Check for any documentation files that might explain this behavior
fd -e md -e txt | xargs rg -l "pod-scaler|measured.*pod|anti-affinity" 2>/dev/null | head -10
```

Repository: openshift/ci-tools
Length of output: 120

🏁 Script executed:

```bash
#!/bin/bash
# Check README files for documentation about anti-affinity and scheduling impact
cat cmd/pod-scaler/README.md
```

Repository: openshift/ci-tools
Length of output: 4160

🏁 Script executed:

```bash
#!/bin/bash
# Broader search for test file contents to confirm no anti-affinity test coverage
wc -l cmd/pod-scaler/admission_test.go
head -100 cmd/pod-scaler/admission_test.go
```

Repository: openshift/ci-tools
Length of output: 2749

🏁 Script executed:

```bash
#!/bin/bash
# Check if addMeasuredPodAntiaffinity is tested at all
rg -n "addMeasuredPodAntiaffinity" --type go
```

Repository: openshift/ci-tools
Length of output: 363
Add test coverage for measured pod anti-affinity behavior.
The hard anti-affinity constraint (RequiredDuringSchedulingIgnoredDuringExecution) is intentional per the code comment and will force cluster autoscaler to provision new nodes when no viable nodes exist for measured pods. However, this critical scheduling behavior lacks test coverage in admission_test.go. Add tests to verify:
- Anti-affinity terms are correctly applied to measured pods
- Pods cannot be scheduled on nodes with non-measured pods
- The constraint includes the proper label selector logic
Consider documenting this trade-off in the README or as a design note, since it significantly impacts scheduling behavior and cluster scaling.
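A hedged sketch of one such test; the signature of `addMeasuredPodAntiaffinity` and the exact selector contents are assumptions that should be adjusted against admission.go:

```go
func TestAddMeasuredPodAntiaffinityAppendsRequiredTerm(t *testing.T) {
	logger := logrus.NewEntry(logrus.New())
	pod := &corev1.Pod{ObjectMeta: metav1.ObjectMeta{Labels: map[string]string{measuredPodLabel: "true"}}}

	// Signature assumed for illustration; adjust to the real function.
	addMeasuredPodAntiaffinity(pod, logger)

	terms := pod.Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution
	if len(terms) != 1 {
		t.Fatalf("expected exactly one required anti-affinity term, got %d", len(terms))
	}
	// The selector should target non-measured pods; mirror the exact
	// expressions built in admission.go here rather than this placeholder check.
	if terms[0].LabelSelector == nil {
		t.Fatal("expected a label selector on the required anti-affinity term")
	}
}
```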
🤖 Prompt for AI Agents
In `@cmd/pod-scaler/admission.go` around lines 569 - 576, Add unit tests in
admission_test.go that cover the measured-pod anti-affinity behavior: create a
measured pod input and assert that admission.Admit (or the function handling
measured pods in admission.go) appends antiAffinityTerm to
pod.Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution;
simulate node(s) with non-measured pods and verify the measured pod would be
unschedulable on those nodes (e.g., via predicate/scheduling helper or by
inspecting the anti-affinity term semantics), and assert the antiAffinityTerm's
LabelSelector matches the expected labels/selectors used in admission.go so the
selector logic is correct; include positive and negative cases to ensure the
rule is only applied to "measured" pods and contains the correct selector
fields.
/test images

Scheduling required tests:
@deepsm007: The following test failed, say
Full PR test history. Your PR dashboard.
Details: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/hold for final review

/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: deepsm007, jupierce
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Details: Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/cc @jupierce @openshift/test-platform