Skip to content

Allow dynamic quota creation and removal#287

Open
QuanMPhm wants to merge 2 commits intonerc-project:mainfrom
QuanMPhm:ops_1391/final
Open

Allow dynamic quota creation and removal#287
QuanMPhm wants to merge 2 commits intonerc-project:mainfrom
QuanMPhm:ops_1391/final

Conversation

@QuanMPhm
Copy link
Contributor

Closes nerc-project/operations#1391. This is how I would suggest to review this PR.

Two CLI commands have been added, add_quota_to_resource.py and remove_quota_from_resource.py. I would suggest understanding those two commands first. These commands allow us to dynamically add/remove quotas instead of having them hard-coded as they are currently done. These commands don't impact the quota objects in the clusters, nor the quota attributes in allocations. Their full impact is illustrated when used within the typical user workflow, or in tandem with validate_allocations.py. I would now suggest checking the changes to functional/openshift/test_allocations.py to see the full implications of this PR. The other functional test cases only contain minor changes.

Afterwards, tasks.py, validate_allocations.py, and the allocator base and subclasses should be reviewed. They are the main consumers of quota information. All other changes relatively minor.

This is a draft for now since I have some questions, and the tests are failing. I just wanted people to know my general direction with this feature.

I will wait for people's feedback before continuing work on this PR, since I assume substantial feedback will be given.

defaults={"value": json.dumps(new_quota_dict)},
)

# TODO (Quan): Dict update allows migration of existing quotas. This is fine?
Copy link
Contributor Author

@QuanMPhm QuanMPhm Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knikolla @jtriley This is a pre-existing feature, so I assume the answer is yes. Just to make sure.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I fully understand this comment. Can you elaborate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently allow migrating the quota's cluster label (i.e "limits.cpu" for Openshift CPUs) by changing the hardcoded values in the QUOTA_KEY_MAPPING of the appropriate allocator. This migration feature is demonstrated in the functional test that I linked.

Below my TODO comment:

if not created:
            available_quotas_dict = json.loads(available_quotas_attr.value)
            available_quotas_dict.update(new_quota_dict)
            QuotaSpecs.model_validate(available_quotas_dict)  # Validate uniqueness
            available_quotas_attr.value = json.dumps(available_quotas_dict)
            available_quotas_attr.save()

I wanted to show that this migration feature will still be available, because if you decide to add the same quota to the same resource, available_quotas_dict.update(new_quota_dict) means you can update/migrate everything about the quota, including its cluster label (with the exception of the display name, which you've mentioned and I responded here).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood.

"OpenStack Storage",
openstack_nese_storage_rate,
)
# TODO (Quan): An illustration of how billing could be simplified. Shuold I follow with this?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knikolla I couldn't do the same refactoring for the Openshift allocations because different storages have their own rates. I could have refactored the code further to circumvent that issue, but I didn't want the PR to be too long.

},
)

# TODO (Quan): What happens when a quota is removed? Should the attribute be removed from Coldfront?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knikolla @jtriley @joachimweyl This also has implications for billing storage. This test case is failing here since I would like people's consensus on desired behavior.

Copy link

@joachimweyl joachimweyl Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hunch is no, but I want to wait for @knikolla input

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now just have the quota be removed from the Resource Attribute but untouched in the allocations.

@QuanMPhm QuanMPhm marked this pull request as draft January 21, 2026 18:05
While not entirely clear, it seems the recent Pandas relase (3.0.0)
changed how casting to decimal types works, causing some invoicing
code to throw errors, specifically calls to read_csv()

Seperating loading of the CSV and casting seems to fix this
@QuanMPhm QuanMPhm force-pushed the ops_1391/final branch 6 times, most recently from b3c58d8 to 35273aa Compare January 29, 2026 17:08
@QuanMPhm
Copy link
Contributor Author

@knikolla I addressed all your suggestions on Slack except one:

To migrate the display name of an attribute
Before, since the attributes were stored in code, the migrations were also stored in code
Now, since the adding of new quota is a command, migrating the display name of an attribute should also be a command.

May I ask that I implement this feature in a subsequent PR, to prevent this PR from bloating even more? If not, I will implement this after I receive answers for my questions above.

@joachimweyl
Copy link

What is the impact of this omission?

@QuanMPhm
Copy link
Contributor Author

QuanMPhm commented Jan 29, 2026

@joachimweyl The impact will be that to change the display names of attributes (the names that users will see in the Coldfront UI, i.e OpenShift Limit on CPU Quota) will be a bit inconvenient. An admin will have to do some manual renaming in Coldfront. Still doable, but not in a way that's quick and programmatic. We ideally want a CLI command that makes renaming easier, but I didn't want this PR to take too long to review because of the February maintenance.

@joachimweyl
Copy link

joachimweyl commented Jan 30, 2026

Makes sense to me.

@QuanMPhm QuanMPhm marked this pull request as ready for review February 4, 2026 18:24
"""
return self.static_quota + self.multiplier * int(quantity)

def formatted_quota(self, quota_value: int) -> int | str:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really dislike that this can return two different types.

"""
Fields:
- quota_label: human readable label for the quota (must be unique across the dict)
- default_quota: default quota value (int, >= 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what the purpose of this is? The default is equal to (quantity * multiplier) + static_quota, no? It seems unused anywhere else besides the command line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now see I have been hallucinating and though I needed it :(

def _get_network_quota(self, quotas, project_id):
network_quota = self.network.show_quota(project_id)["quota"]
for k in self.QUOTA_KEY_MAPPING["network"]["keys"].values():
for cf_k in self.SERVICE_QUOTA_MAPPING["network"]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could have used the resource_type field of the QuotaSpec here. This will result in an error if not all quotaspecs are defined for OpenStack resources.

dest="quota_label",
type=str,
required=True,
help="Human-readable quota_label for this quota (must be unique).",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this description is right. This maps to the key on the cluster sides.

defaults={"value": json.dumps(new_quota_dict)},
)

# TODO (Quan): Dict update allows migration of existing quotas. This is fine?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I fully understand this comment. Can you elaborate?

return self

@cached_property
def storage_quotas(self) -> dict[str, QuotaSpec]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more generic get_quotas_by_type would be more flexible and allow you to query other kinds of resource types like compute, network, etc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add_openstack_resource and add_openshift_resource now don't provide ALL the required QuotaSpecs with the EXACT same multiplier and static values as they are now, otherwise you are changing the behavior.

YOU NEED TO provide a separate python command or shell file that registers ALL the values as they are now.

Otherwise, as it is you haven't provided a smooth transition from the current system to the new Dynamic Quota system and an admin would need to type out a lot of error prone commands manually to make this upgrade work.

Copy link
Contributor

@naved001 naved001 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @QuanMPhm! Some basic questions in there as I try to refresh my memory of coldfront. Will do another pass.

class Command(BaseCommand):
def add_arguments(self, parser):
parser.add_argument(
"--display_name",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have an underscore instead of a dash.

--display_name -> --display-name

help="The default quota value for the storage attribute. In GB",
)
parser.add_argument(
"--resource_name",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--resource_name -> --resource-name

type=str,
default="",
help="Name of quota as it appears on invoice. Required if --is-storage-type is set.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how come you didn't specify dest= for some of these arguments?

Copy link
Contributor Author

@QuanMPhm QuanMPhm Feb 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I normally wouldn't include dest=, and didn't review closely enough what Copilot generated this code for me. I've removed the dest=. Apologies

def handle(self, *args, **options):
if options["resource_type"] == "storage" and not options["invoice_name"]:
logger.error(
"--invoice-name must be provided when storage type is `storage`."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"when resource type is storage."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea is any quota that is relevant for storage billing should have the resource type storage, such as:

QUOTA_REQUESTS_IBM_STORAGE = "OpenShift Request on IBM Storage Quota (GiB)"
QUOTA_REQUESTS_NESE_STORAGE = "OpenShift Request on NESE Storage Quota (GiB)"

"--invoice-name",
type=str,
default="",
help="Name of quota as it appears on invoice. Required if --is-storage-type is set.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where's --is-storage-type? Did you mean --resource-type is set to storage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes. My bad

else options["name"],
)

# Add common Openshift resources (cpu, memory, etc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remind how were these resources created before this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the information for these quotas are spread in multiple places in the repo. The display names are in attributes.py, the multiplier and static quantities are in tasks.py, other info in other places. The allocation attributes for these quotas were loaded by register_cloud_attributes.py, which consumes the attributes defined in attributes.py.

A by-product of this PR is that now all that info is created and stored in one place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update Storage types in ColdFront

4 participants