Skip to content

Conversation

@casparvl
Copy link

This was requested in #2820 , although that was eventually closed with a suggestion for a workaround.

However, I hit it again when trying to use reframe.utils.find_modules(substr, ...) in combination with wanting to specify additional features. Example use cases (and there's many more one can come up with):

  • I may have a software X for which want to not run CPU tests on a GPU node. I normally achieve that by assigning the CPU feature to my CPU partitions (but not to my GPU partitions) and then requesting the CPU feature from my CPU-only tests.
  • I may have two variants of a bandwidth test (which needs module Y): one sending small packets, one sending larger packets. I only want to run larger one only on partitions that have the infiniband feature.

In both cases, we want to test all modules with X or Y respectively that are available on our system, hence we want to use the find_modules function. However, the first element of the tuple this returns is the system:partition string for which it found the module matching the requested substring. This is clearly meant to be used in valid_systems directly (as is confirmed by the example in the docs), but for the examples above, you may want to only run on a subset of these system:partition combinations (based on the required features. It would be very natural to then do:

module_info = parameter(find_modules('X'))
...
self.valid_systems - [module_info[0] + ' +feat1']

To make sure that a partition is only valid if it provides the right module and the right feature.

It turned out that the code changes needed are very limited (it seems like a lot because the indentation changed, but the key part is:

  • not splitting the system:partition case from the feature/extra specification case
  • Handling the system:partition case with an extra elif on the subspec:
            elif ':' in subspec and not subspec.startswith(('+', '-', '%')):
                # If there is a system:partition specified, make sure it
                # matches one of the items in valid_matches
                syspart_match = True if subspec in valid_matches else False

And adding that syspart_match to the return logic:

        # If the partition has all the plus features, none of the minus
        # all of the properties and the system:partition spec (if any)
        # matched, this partition is valid
        if (
            have_plus_feats and not have_minus_feats and have_props
            and syspart_match
        ):
            return True

Note that this logic does not assume the system:partition to strictly come before or after the features/extras, so I've adapted the valid syntax specification to allow both:

_VALID_SYS_SYNTAX = rf'^(({_FKV}\s+)*{_S}(\s+{_FKV})*|{_FKV}(\s+{_FKV})*)$'

If anything, I feel the code has become cleaner by not handling system:partition as a special, separate case, but as 'just another item' that may occur as a subspec.

@codecov
Copy link

codecov bot commented Jan 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.37%. Comparing base (da2fbb3) to head (89259dd).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #3613   +/-   ##
========================================
  Coverage    91.37%   91.37%           
========================================
  Files           62       62           
  Lines        13484    13484           
========================================
  Hits         12321    12321           
  Misses        1163     1163           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@casparvl
Copy link
Author

casparvl commented Jan 21, 2026

I've done some concrete testing to prove this works. Note that I have 4 partitions (rome, genoa, gpu_A100, gpu_H100), of which the first two have the cpu feature assigned. Each partition has two environments: EESSI-2023.06 and EESSI-2025.06.

**Case 1: traditional `system:partition` assignment**
module_info = parameter(find_modules('SciPy-bundle'))
...
        s, e, m = self.module_info
        self.valid_systems = [s]

Output:

[       OK ] (1/8) EESSI_ModuleFinder2 %module_info=('snellius:rome', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') /586a9dfe @snellius:rome+EESSI-2023.06
...
[       OK ] (2/8) EESSI_ModuleFinder2 %module_info=('snellius:rome', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') /fe4be974 @snellius:rome+EESSI-2025.06
...
[       OK ] (3/8) EESSI_ModuleFinder2 %module_info=('snellius:genoa', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') /93bc1dc5 @snellius:genoa+EESSI-2023.06
...
[       OK ] (4/8) EESSI_ModuleFinder2 %module_info=('snellius:genoa', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') /3034e72c @snellius:genoa+EESSI-2025.06
...
[       OK ] (5/8) EESSI_ModuleFinder2 %module_info=('snellius:gpu_A100', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') /80754277 @snellius:gpu_A100+EESSI-2023.06
...
[       OK ] (6/8) EESSI_ModuleFinder2 %module_info=('snellius:gpu_A100', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') /8f6a7e95 @snellius:gpu_A100+EESSI-2025.06
...
[       OK ] (7/8) EESSI_ModuleFinder2 %module_info=('snellius:gpu_H100', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') /e6d136b9 @snellius:gpu_H100+EESSI-2023.06
...
[       OK ] (8/8) EESSI_ModuleFinder2 %module_info=('snellius:gpu_H100', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') /bf9cd015 @snellius:gpu_H100+EESSI-2025.06
...
[----------] all spawned checks have finished
**Case 2: Combining system:partition with +feat1**
module_info = parameter(find_modules('SciPy-bundle'))
...
        s, e, m = self.module_info
        self.valid_systems = [s + ' +cpu']

Output:

[       OK ] (1/4) EESSI_ModuleFinder2 %module_info=('snellius:rome', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /586a9dfe @snellius:rome+EESSI-2023.06
...
[       OK ] (2/4) EESSI_ModuleFinder2 %module_info=('snellius:rome', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /fe4be974 @snellius:rome+EESSI-2025.06
...
[       OK ] (3/4) EESSI_ModuleFinder2 %module_info=('snellius:genoa', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /93bc1dc5 @snellius:genoa+EESSI-2023.06
...
[       OK ] (4/4) EESSI_ModuleFinder2 %module_info=('snellius:genoa', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /3034e72c @snellius:genoa+EESSI-2025.06
...
[----------] all spawned checks have finished
**Case 3: Combining system:partition with -feat1**
module_info = parameter(find_modules('SciPy-bundle'))
...
        s, e, m = self.module_info
        self.valid_systems = [s + ' +cpu']

Output:

[       OK ] (1/4) EESSI_ModuleFinder2 %module_info=('snellius:gpu_A100', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /80754277 @snellius:gpu_A100+EESSI-2023.06
...
[       OK ] (2/4) EESSI_ModuleFinder2 %module_info=('snellius:gpu_A100', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /8f6a7e95 @snellius:gpu_A100+EESSI-2025.06
...
[       OK ] (3/4) EESSI_ModuleFinder2 %module_info=('snellius:gpu_H100', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /e6d136b9 @snellius:gpu_H100+EESSI-2023.06
...
[       OK ] (4/4) EESSI_ModuleFinder2 %module_info=('snellius:gpu_H100', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /bf9cd015 @snellius:gpu_H100+EESSI-2025.06
...
[----------] all spawned checks have finished
**Case 4: Using only features**

This case clearly doesn't make any sense, because it'll create tests where the first element of module_info is X and then the partition that will be used is Y, but just to prove that a valid_systems with only features still works:

module_info = parameter(find_modules('SciPy-bundle'))
...
        s, e, m = self.module_info
        self.valid_systems = ['+cpu']

Output:

[       OK ] ( 1/16) EESSI_ModuleFinder2 %module_info=('snellius:rome', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /586a9dfe @snellius:rome+EESSI-2023.06
...
[       OK ] ( 2/16) EESSI_ModuleFinder2 %module_info=('snellius:rome', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /586a9dfe @snellius:genoa+EESSI-2023.06
...
[       OK ] ( 3/16) EESSI_ModuleFinder2 %module_info=('snellius:rome', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /fe4be974 @snellius:rome+EESSI-2025.06
...
[       OK ] ( 4/16) EESSI_ModuleFinder2 %module_info=('snellius:rome', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /fe4be974 @snellius:genoa+EESSI-2025.06
...
[       OK ] ( 5/16) EESSI_ModuleFinder2 %module_info=('snellius:genoa', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /93bc1dc5 @snellius:rome+EESSI-2023.06
...
[       OK ] ( 6/16) EESSI_ModuleFinder2 %module_info=('snellius:genoa', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /93bc1dc5 @snellius:genoa+EESSI-2023.06
...
[       OK ] ( 7/16) EESSI_ModuleFinder2 %module_info=('snellius:genoa', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /3034e72c @snellius:rome+EESSI-2025.06
...
[       OK ] ( 8/16) EESSI_ModuleFinder2 %module_info=('snellius:genoa', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /3034e72c @snellius:genoa+EESSI-2025.06
...
[       OK ] ( 9/16) EESSI_ModuleFinder2 %module_info=('snellius:gpu_A100', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /80754277 @snellius:rome+EESSI-2023.06
...
[       OK ] (10/16) EESSI_ModuleFinder2 %module_info=('snellius:gpu_A100', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /80754277 @snellius:genoa+EESSI-2023.06
...
[       OK ] (11/16) EESSI_ModuleFinder2 %module_info=('snellius:gpu_A100', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /8f6a7e95 @snellius:rome+EESSI-2025.06
...
[       OK ] (12/16) EESSI_ModuleFinder2 %module_info=('snellius:gpu_A100', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /8f6a7e95 @snellius:genoa+EESSI-2025.06
...
[       OK ] (13/16) EESSI_ModuleFinder2 %module_info=('snellius:gpu_H100', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /e6d136b9 @snellius:rome+EESSI-2023.06
...
[       OK ] (14/16) EESSI_ModuleFinder2 %module_info=('snellius:gpu_H100', 'EESSI-2023.06', 'SciPy-bundle/2023.11-gfbf-2023b') %scale=1_core /e6d136b9 @snellius:genoa+EESSI-2023.06
...
[       OK ] (15/16) EESSI_ModuleFinder2 %module_info=('snellius:gpu_H100', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /bf9cd015 @snellius:rome+EESSI-2025.06
...
[       OK ] (16/16) EESSI_ModuleFinder2 %module_info=('snellius:gpu_H100', 'EESSI-2025.06', 'SciPy-bundle/2025.06-gfbf-2025a') %scale=1_core /bf9cd015 @snellius:genoa+EESSI-2025.06
...
[----------] all spawned checks have finished

@casparvl
Copy link
Author

Hm, I have to recheck how the current if...elif...elif handles sysname * and *:partname cases. My bet is, it handles them incorrectly.


_S = rf'({_NW}(:{_NW})?)' # system/partition
_VALID_SYS_SYNTAX = rf'^({_S}|{_FKV}(\s+{_FKV})*)$'
_VALID_SYS_SYNTAX = rf'^(({_FKV}\s+)*{_S}(\s+{_FKV})*|{_FKV}(\s+{_FKV})*)$'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR would benefit from a rewrite of this line.
The system definition should not rely on spaces.
All of these definitions should be valid

1. *
2. daint:gpu
3. daint:mc
4. daint:mc +gpu
5. daint:mc +gpu +uenv
6. daint:mc+gpu
7. daint:mc+gpu +uenv
8. daint:mc +gpu+uenv
9. daint:mc +gpu +uenv+modules
10. daint:mc +gpu+uenv +modules

In the current form only the first four cases are valid.
The fifth case is miss interpreted, where one cannot see the +gpu entry.
And the last five case are not valid systems.

This implies that:

  1. Tests won't run because of a space, without any hint. Users may lose a lot of time because of a missing space.
  2. Potential introduction of silent bugs. The addition of a new feature makes the previous invalid.
  3. Hinders the ability to write tests with partitions that have multiple features.

We should make this feature more robust to the end user and to the future.

What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

3 participants