@LegNeato
Collaborator

Fixes #516.

Comment on lines +169 to +175
// Check for usage of `num_traits` intrinsics (like Float::powi) that we can optimize
if self.tcx.crate_name(def_id.krate) == self.sym.num_traits && !def_id.is_local() {
    let item_name = self.tcx.item_name(def_id);
    if let Some(&intrinsic) = self.sym.libm_intrinsics.get(&item_name) {
        self.libm_intrinsics.borrow_mut().insert(def_id, intrinsic);
    }
}
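
For readers unfamiliar with the pattern in the hunk above, here is a minimal standalone sketch of the same idea: look an item name up in a table of known intrinsics and remember which definition it came from. Everything in it (`Collector`, `Intrinsic`, the `u32` `DefId` alias, the table contents) is a hypothetical stand-in and not the actual rustc or backend API.

```rust
use std::collections::HashMap;

/// Hypothetical set of operations the backend knows how to lower natively.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Intrinsic {
    Powi,
    Sqrt,
}

/// Hypothetical stand-in for a compiler definition id.
type DefId = u32;

struct Collector {
    /// Item names that have a fast native lowering (assumed contents).
    known: HashMap<&'static str, Intrinsic>,
    /// Definitions seen so far that should be replaced by an intrinsic.
    recorded: HashMap<DefId, Intrinsic>,
}

impl Collector {
    fn new() -> Self {
        let known = HashMap::from([("powi", Intrinsic::Powi), ("sqrt", Intrinsic::Sqrt)]);
        Collector { known, recorded: HashMap::new() }
    }

    /// Mirrors the shape of the check in the diff: only non-local items
    /// from the `num_traits` crate are considered, and only if their name
    /// appears in the table of known intrinsics.
    fn visit(&mut self, crate_name: &str, is_local: bool, item_name: &str, def_id: DefId) {
        if crate_name == "num_traits" && !is_local {
            if let Some(&intrinsic) = self.known.get(item_name) {
                self.recorded.insert(def_id, intrinsic);
            }
        }
    }
}
```

Calling `visit("num_traits", false, "powi", 1)` on a fresh `Collector` records definition `1` as `Intrinsic::Powi`; items from other crates or with unknown names are ignored.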
Contributor

@nazar-pc nazar-pc Jan 21, 2026

I have no idea how things work more generally, but why are things like this even necessary? I'd expect the generic machinery of the compiler to optimize it to some canonical form that the target would like the most.

So it shouldn't make any difference whether one writes x.powi(2) or x * x, or uses a few layers of zero-cost abstractions in the process; it should all result in identical SPIR-V, just like it does with LLVM. What am I missing?

Collaborator Author

@LegNeato LegNeato Jan 22, 2026

The problem is that we use num_traits to get the APIs we need off primitives, and num_traits knows nothing about the intrinsics available on the platform. Its implementation is a loop: https://github.com/rust-num/num-traits/blob/master/src/pow.rs#L179. Pretty much no compiler would be able to tell that this code can be replaced with a cast and a powi, and rustc doesn't. Instead, something needs to map the call (its semantics) to the underlying fast impl, which is what the backend is supposed to do. It already does that for many ops; we just missed a spot (I guess? This all predates me).
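
To make the contrast concrete, here is a rough sketch of the two shapes of code being discussed. `powi_loop` is a hypothetical, simplified square-and-multiply loop written for this illustration (it is not copied from num_traits), and `powi_native` uses the standard library's `f32::powi` as a CPU stand-in for whatever native operation a backend would prefer to emit.

```rust
/// Hypothetical loop-based integer power (exponentiation by squaring).
/// A portable library that cannot assume a platform intrinsic has to
/// do something of this shape.
fn powi_loop(mut base: f32, exp: i32) -> f32 {
    // Negative exponents: invert the base and use the magnitude.
    let mut e = exp.unsigned_abs();
    if exp < 0 {
        base = 1.0 / base;
    }
    let mut acc = 1.0_f32;
    while e > 0 {
        // Multiply in the current power of `base` when its bit is set.
        if e & 1 == 1 {
            acc *= base;
        }
        e >>= 1;
        // Square only if another iteration follows.
        if e > 0 {
            base *= base;
        }
    }
    acc
}

/// The form a backend can map directly to a native operation:
/// a single call with known semantics instead of a loop.
fn powi_native(base: f32, exp: i32) -> f32 {
    base.powi(exp)
}
```

Both functions compute the same mathematical result, but recognizing that the loop is "just powi" requires understanding its semantics, which is why the call is matched by name before optimization instead of being rediscovered from the loop body.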

One can argue that we shouldn't use num_traits and should instead just hang our own impls off the types; I have no clue why it was chosen over that route. Probably a "make it work first, then make it fast" tradeoff (the num_traits impl does work on SPIR-V).

I have a new optimizer that gives us much more control than relying on spirv-opt does, but this replacement happens before the optimizer runs.

Contributor

Ouch, that definitely looks like a problem. Maybe it's worth creating an issue to look into getting rid of num_traits? Doing things like looping instead of native exponentiation is definitely going to hurt, and there might be more examples like it.

Development

Successfully merging this pull request may close these issues.

powi is slow
