@tony-davis tony-davis commented Jan 29, 2026

Summary

This PR adds compatibility for OpenMP 2.0 implementations (e.g., MSVC, older GCC) that lack functions introduced in OpenMP 3.0. BLIS currently calls omp_get_active_level() and omp_get_max_active_levels(), neither of which exists in OpenMP 2.0, so it fails to compile on Windows with MSVC and in other OpenMP 2.0 environments.

Changes

frame/thread/bli_thread.c

Added OpenMP 2.0 compatibility shims that detect OpenMP version using the standard _OPENMP macro:

// OpenMP 2.0 compatibility: Provide fallbacks for OpenMP 3.0 functions
// Some compilers (e.g., MSVC, older GCC) only support OpenMP 2.0 which lacks:
//   - omp_get_active_level() (OpenMP 3.0)
//   - omp_get_max_active_levels() (OpenMP 3.0)
// The _OPENMP macro is set to the release date (yyyymm) of the supported specification:
//   200203 = OpenMP 2.0, 200505 = OpenMP 2.5, 200805 = OpenMP 3.0, 201107 = OpenMP 3.1, etc.
#if defined(BLIS_ENABLE_OPENMP) && defined(_OPENMP) && _OPENMP < 200805
static inline int omp_get_active_level(void) {
    return 0;  // Always assume top-level (nesting depth cannot be queried)
}
static inline int omp_get_max_active_levels(void) {
    return 1;  // OpenMP 2.0 has no API for this limit; report a single active level
}
#endif

Why _OPENMP < 200805 instead of _MSC_VER?

  • Portable: Works with any OpenMP 2.x implementation, not just MSVC
  • Future-proof: Won't break when MSVC adds OpenMP 3.0 support
  • Accurate: Detects actual OpenMP version rather than compiler vendor
  • Standard: _OPENMP is defined by every OpenMP-compliant compiler when OpenMP is enabled

(Credit: GitHub Copilot suggested this improvement over the original _MSC_VER check)
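
As a rough illustration of the version check, the C sketch below maps an _OPENMP value to the specification it denotes. The helper names openmp_spec_from_macro and openmp_spec_of_this_build are hypothetical and not part of BLIS. The thresholds are the publication months of each OpenMP specification (OpenMP 3.0 was published in May 2008, hence 200805).

```c
/* Map an _OPENMP date value (yyyymm) to the OpenMP specification it denotes.
   Hypothetical helper for illustration; not a BLIS function. */
static const char *openmp_spec_from_macro(long date)
{
    if (date >= 201811L) return "5.0 or later";
    if (date >= 201511L) return "4.5";
    if (date >= 201307L) return "4.0";
    if (date >= 201107L) return "3.1";
    if (date >= 200805L) return "3.0";
    if (date >= 200505L) return "2.5";
    if (date >= 200203L) return "2.0";
    return "pre-2.0";
}

/* Report which specification this translation unit was compiled against.
   _OPENMP is only defined when the compiler's OpenMP mode is enabled. */
static const char *openmp_spec_of_this_build(void)
{
#ifdef _OPENMP
    return openmp_spec_from_macro((long)_OPENMP);
#else
    return "disabled";
#endif
}
```

Called with 200203, the value MSVC defines under /openmp, the first helper returns "2.0", which is the case the shims target. Any value from 200805 upward maps to 3.0 or newer, where the native functions are available.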

Testing

Environment:

  • Windows 11
  • Visual Studio 2022 (MSVC 19.50.35721.0)
  • OpenMP 2.0 (via MSVC's /openmp flag)

Results:

  • ✅ BLIS builds successfully with OpenMP threading enabled
  • ✅ ILP64 support functional
  • ✅ Zen4-optimized kernels compile and link correctly
  • ✅ No performance overhead on OpenMP 3.0+ (shims are compiled only for OpenMP 2.x)

Compatibility

Backward Compatibility: ✅ Fully maintained

  • OpenMP 3.0+ systems: No changes (native functions used)
  • OpenMP 2.0 systems: Shims provide safe fallbacks
  • No OpenMP: No changes (shims not compiled)

Impact:

  • Nested parallelism disabled on OpenMP 2.0 systems (rare in BLAS workloads)
  • Safe defaults: the shims always report active level 0 and a maximum of 1 active level
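
To make the impact concrete, here is a minimal sketch of the kind of nesting check that such values typically feed. The helper may_open_parallel_region is hypothetical and is not taken from BLIS; it only assumes that a caller compares the current active level against the reported limit.

```c
/* Hypothetical helper (not a BLIS function): decide whether a routine may
   open another parallel region, given the values reported by the OpenMP
   runtime or, on OpenMP 2.0, by the compatibility shims. */
static int may_open_parallel_region(int active_level, int max_active_levels)
{
    /* A new region is allowed only while the nesting limit is not reached. */
    return active_level < max_active_levels;
}
```

With the shim values (0 and 1) this check always passes, because the shims return fixed answers and cannot tell whether the caller is already inside a parallel region. With native OpenMP 3.0+ functions, a caller at active level 1 under a limit of 1 would be refused.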

Checklist

  • Code follows BLIS coding style
  • Changes are minimal and focused
  • Backward compatibility maintained
  • Tested on target platform (Windows/MSVC)
  • No performance regression on existing platforms
  • Documentation inline via comments

This patch enables BLIS to build on Windows using Clang compilers with
GNU-style command-line interface (e.g., LLVM/Clang, not Clang-CL) and
adds compatibility for systems with OpenMP 2.0 implementations.

Changes:

1. CMakeLists.txt: Detect Clang with GNU frontend on Windows
   - Check CMAKE_C_COMPILER_FRONTEND_VARIANT to distinguish GNU-style
     Clang from MSVC-style Clang-CL
   - Use -fopenmp-simd for GNU-style Clang on Windows
   - Maintains backward compatibility with MSVC toolchain

2. frame/thread/bli_thread.c: Add OpenMP 2.0 compatibility shims
   - Provide fallback implementations for omp_get_active_level() and
     omp_get_max_active_levels() (OpenMP 3.0 functions)
   - Windows SDK and some legacy OpenMP implementations only support
     OpenMP 2.0
   - Shims safely disable nested parallelism (rare in BLAS workloads)
   - Zero performance overhead on systems with OpenMP 3.0+

Testing:
- Validated on Windows with TheRock's LLVM/Clang toolchain
- Builds successfully with OpenMP threading enabled
- ILP64 support functional
- Zen4-optimized kernels compile and link correctly

Rationale:
This enables BLIS to be built with consistent toolchains across Linux
and Windows platforms, supporting projects that use custom LLVM builds
rather than platform-default compilers.

Signed-off-by: Tony Davis <tony.davis@amd.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Copilot AI review requested due to automatic review settings January 29, 2026 19:31
Copilot AI left a comment

Pull request overview

This PR enables BLIS to build on Windows using GNU-style Clang compilers (e.g., LLVM/Clang built from source) and adds compatibility for OpenMP 2.0 implementations. The changes support AMD's ROCm testing infrastructure while maintaining full backward compatibility with existing MSVC/Clang-CL toolchains.

Changes:

  • Added compiler frontend detection in CMake to distinguish GNU-style Clang from MSVC/Clang-CL on Windows
  • Added OpenMP 2.0 compatibility shims for functions introduced in OpenMP 3.0

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
CMakeLists.txt | Adds detection of Clang with GNU-style frontend on Windows to use -fopenmp-simd instead of /openmp:experimental
frame/thread/bli_thread.c | Provides fallback implementations of omp_get_active_level() and omp_get_max_active_levels() for OpenMP 2.0 environments

- Remove Clang-specific CMake changes (not needed for MSVC use case)
- Use _OPENMP version check instead of _MSC_VER for better portability
- _OPENMP < 200805 detects any pre-3.0 OpenMP implementation (200203 = 2.0, 200805 = 3.0)
- More future-proof: won't break when MSVC adds OpenMP 3.0 support
- Thanks to GitHub Copilot for suggesting this improvement

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@tony-davis tony-davis changed the title from "Add Windows support for GNU-style Clang and OpenMP 2.0 compatibility" to "Add Windows OpenMP 2.0 compatibility" Jan 29, 2026
@tony-davis tony-davis marked this pull request as draft January 29, 2026 20:27