-
Notifications
You must be signed in to change notification settings - Fork 572
Document ppc inline asm support #2056
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Failed to set assignee to
|
da2b26a to
127ffae
Compare
127ffae to
0f538c9
Compare
Mark status and sticky bits of vscr, fpscr and spefscr as being preserved when using the `preserves_flags` option. These are not enforced by codegen today, but might be in future LLVM releases.
0f538c9 to
3e8ee09
Compare
| r[asm.rules.stack-above-sp] | ||
| - Unless the `nostack` option is set, assembly code is allowed to modify the caller's stack frame in specific cases. | ||
| - The target ABI requires storing certain values in the caller's frame (e.g saving the `lr` on PowerPC64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this meaning to say:
When the
nostackoption is not set, assembly code is allowed to modify the caller's stack frame if and only if the target ABI requires storing values in the caller's frame (e.g., saving thelron PowerPC64).
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intended it to be a sublist. I am not sure there aren't other cases which may need mentioning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right; I'm asking here about content rather than factoring -- I'm checking with you my reading is correct.
Are there any cases other than when the target ABI requires values to be stored in the caller's frame?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter save space, if any, of the caller can be used to spill arguments on ppc64 in some cases. That's usually for C variadic functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know writing into the caller's stack frame is something that only happens on PowerPC.
| | PowerPC/PowerPC64 | `vreg` | `altivec` | `i8x16`, `i16x8`, `i32x4`, `f32x4` | | ||
| | PowerPC/PowerPC64 | `vreg` | `vsx` | `f32`, `f64`, `i64x2`, `f64x2` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The source has the following FIXME linking to rust-lang/rust#131551 (comment), maybe that can be resolved?
// FIXME: vsx also supports integers?: https://github.com/rust-lang/rust/pull/131551#discussion_r1862535963
Self::vreg => types! {
altivec: VecI8(16), VecI16(8), VecI32(4), VecF32(4);
vsx: F32, F64, VecI64(2), VecF64(2);
},cc @taiki-e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The integer support for vsx is quite limited. It's only 128b integers (power8) with very restricted functionality prior to power10 (notably, no divide support). Maybe that could be supported via VecI128(1)? I suspect the llvm issues remain. Is that a blocker for stabilization?
F128 is another type which should be allowed, but I think that support remains unstable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The integer support for vsx is quite limited. It's only 128b integers (power8) with very restricted functionality prior to power10 (notably, no divide support). Maybe that could be supported via VecI128(1)? I suspect the llvm issues remain. Is that a blocker for stabilization?
Not a blocker I think, this just seemed like a good moment to go over the implementation while the specifics are "in ram". If you believe the rust implementation is complete given what powerpc is able to provide we're all good here.
F128 is another type which should be allowed, but I think that support remains unstable?
LLVM currently does not allow putting f128 into vreg or vsreg.
Do you think that is an oversight in LLVM?
(the f128 type is still unstable but we'd like to add support for it in the compiler where possible, see rust-lang/rust#125398)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think that is an oversight in LLVM?
I think so. Gcc allows it, and I believe ELFv2 calling conventions require they be passed via vector registers. LLVM also crashed when I tried wrapping it into a vector.
As for I128. It behaves strangely using LLVM. If you don't wrap it in a vector, it gets treated like a register pair. If you wrap it in a vector and fail to set the minimum cpu to pwr8 or newer (on ppc64), it quietly generates bad code to move between gpr and vr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe our target maintainers can confirm the expected behavior here?
cc @daltenty @gilamn5tr @amy-kwan
I'd be happy to submit a patch to LLVM if we can agree on how this should work.
Stabilize ppc inline assembly This stabilizes inline assembly for PowerPC and PowerPC64. Corresponding reference PR: rust-lang/reference#2056 --- From the requirements of stabilization mentioned in #93335 > Each architecture needs to be reviewed before stabilization: > * It must have clobber_abi. Done in #146949. > * It must be possible to clobber every register that is normally clobbered by a function call. Done in #131341 Similarly, `preserves_flags` is also implemented by this PR. Likewise, there is a non-code change to `preserve_flags` expectations that floating point and vector status and sticky bits are preserved. The reference manual update has more details. > * Generally review that the exposed register classes make sense. The followings can be used as input/output: * reg (`r0`, `r[3-12]`, `r[14-r28]`): Any usable general-purpose register * reg_nonzero (`r[3-12]`, `r[14-r28]`): General-purpose registers, but excludes `r0`. This is needed for instructions which define `r0` to be the value 0, such as register + immediate memory operations. * reg/reg_nonzero `r29` on PowerPC64 targets. * freg (`f[0-31]`): 64 bit floating pointer registers The following are clobber-only: * `ctr`, `lr`, `xer`: commonly clobbered special-purpose registers used in inline asm * `cr` (`cr[0-7]`, `cr`): the condition register fields, or the entire condition register. * `vreg` (`v[0-31]`): altivec/vmx register * `vsreg` (`vs[0-63]`): vector-scalar register * `spe_acc`: SPE accumulator, only available for PowerPC SPE targets. The vreg and vsreg registers technically accept `#[repr(simd)]` types, but require the experimental `altivec` or `vsx` target features to be enabled. That work seems to be tracked here, #42743. The following cannot be used as operands for inline asm: * `r2`: the TOC pointer, required for most PIC code. * `r13`: the TLS pointer * `r[29]`: Reserved for internal usage by LLVM on PowerPC * `r[30]`: Reserved for internal usage by LLVM on PowerPC and PowerPC64 * `r31`: the frame pointer * `vrsave`: this is effectively an unused special-purpose register. The `preserves_flags` behavior is updated with the following behavior (Note, this is not enforceable today due to LLVM restrictions): * All status and sticky bits of `fpscr`, `spefscr`, and `vscr` are preserved. The following registers are unavailable: * `mma[0-7]`: These are new "registers" available on Power10, they are 512b registers which overlay 4x vsx registers. If needed, users can mark such clobbers as vsN*4, vsN*4+1,...,vsN*4+3. * `ap`: This is actually a pseudo-register in gcc/llvm. * `mq`: This register is only available on Power1 and Power2, and is not supported by llvm. --- cc @taiki-e r? @Amanieu @rustbot label +A-inline-assembly
… r=Amanieu Stabilize ppc inline assembly This stabilizes inline assembly for PowerPC and PowerPC64. Corresponding reference PR: rust-lang/reference#2056 --- From the requirements of stabilization mentioned in rust-lang#93335 > Each architecture needs to be reviewed before stabilization: > * It must have clobber_abi. Done in rust-lang#146949. > * It must be possible to clobber every register that is normally clobbered by a function call. Done in rust-lang#131341 Similarly, `preserves_flags` is also implemented by this PR. Likewise, there is a non-code change to `preserve_flags` expectations that floating point and vector status and sticky bits are preserved. The reference manual update has more details. > * Generally review that the exposed register classes make sense. The followings can be used as input/output: * reg (`r0`, `r[3-12]`, `r[14-r28]`): Any usable general-purpose register * reg_nonzero (`r[3-12]`, `r[14-r28]`): General-purpose registers, but excludes `r0`. This is needed for instructions which define `r0` to be the value 0, such as register + immediate memory operations. * reg/reg_nonzero `r29` on PowerPC64 targets. * freg (`f[0-31]`): 64 bit floating pointer registers The following are clobber-only: * `ctr`, `lr`, `xer`: commonly clobbered special-purpose registers used in inline asm * `cr` (`cr[0-7]`, `cr`): the condition register fields, or the entire condition register. * `vreg` (`v[0-31]`): altivec/vmx register * `vsreg` (`vs[0-63]`): vector-scalar register * `spe_acc`: SPE accumulator, only available for PowerPC SPE targets. The vreg and vsreg registers technically accept `#[repr(simd)]` types, but require the experimental `altivec` or `vsx` target features to be enabled. That work seems to be tracked here, rust-lang#42743. The following cannot be used as operands for inline asm: * `r2`: the TOC pointer, required for most PIC code. * `r13`: the TLS pointer * `r[29]`: Reserved for internal usage by LLVM on PowerPC * `r[30]`: Reserved for internal usage by LLVM on PowerPC and PowerPC64 * `r31`: the frame pointer * `vrsave`: this is effectively an unused special-purpose register. The `preserves_flags` behavior is updated with the following behavior (Note, this is not enforceable today due to LLVM restrictions): * All status and sticky bits of `fpscr`, `spefscr`, and `vscr` are preserved. The following registers are unavailable: * `mma[0-7]`: These are new "registers" available on Power10, they are 512b registers which overlay 4x vsx registers. If needed, users can mark such clobbers as vsN*4, vsN*4+1,...,vsN*4+3. * `ap`: This is actually a pseudo-register in gcc/llvm. * `mq`: This register is only available on Power1 and Power2, and is not supported by llvm. --- cc @taiki-e r? @Amanieu @rustbot label +A-inline-assembly
Rollup merge of #147996 - pmur:murp/stabilize-ppc-inlineasm, r=Amanieu Stabilize ppc inline assembly This stabilizes inline assembly for PowerPC and PowerPC64. Corresponding reference PR: rust-lang/reference#2056 --- From the requirements of stabilization mentioned in #93335 > Each architecture needs to be reviewed before stabilization: > * It must have clobber_abi. Done in #146949. > * It must be possible to clobber every register that is normally clobbered by a function call. Done in #131341 Similarly, `preserves_flags` is also implemented by this PR. Likewise, there is a non-code change to `preserve_flags` expectations that floating point and vector status and sticky bits are preserved. The reference manual update has more details. > * Generally review that the exposed register classes make sense. The followings can be used as input/output: * reg (`r0`, `r[3-12]`, `r[14-r28]`): Any usable general-purpose register * reg_nonzero (`r[3-12]`, `r[14-r28]`): General-purpose registers, but excludes `r0`. This is needed for instructions which define `r0` to be the value 0, such as register + immediate memory operations. * reg/reg_nonzero `r29` on PowerPC64 targets. * freg (`f[0-31]`): 64 bit floating pointer registers The following are clobber-only: * `ctr`, `lr`, `xer`: commonly clobbered special-purpose registers used in inline asm * `cr` (`cr[0-7]`, `cr`): the condition register fields, or the entire condition register. * `vreg` (`v[0-31]`): altivec/vmx register * `vsreg` (`vs[0-63]`): vector-scalar register * `spe_acc`: SPE accumulator, only available for PowerPC SPE targets. The vreg and vsreg registers technically accept `#[repr(simd)]` types, but require the experimental `altivec` or `vsx` target features to be enabled. That work seems to be tracked here, #42743. The following cannot be used as operands for inline asm: * `r2`: the TOC pointer, required for most PIC code. * `r13`: the TLS pointer * `r[29]`: Reserved for internal usage by LLVM on PowerPC * `r[30]`: Reserved for internal usage by LLVM on PowerPC and PowerPC64 * `r31`: the frame pointer * `vrsave`: this is effectively an unused special-purpose register. The `preserves_flags` behavior is updated with the following behavior (Note, this is not enforceable today due to LLVM restrictions): * All status and sticky bits of `fpscr`, `spefscr`, and `vscr` are preserved. The following registers are unavailable: * `mma[0-7]`: These are new "registers" available on Power10, they are 512b registers which overlay 4x vsx registers. If needed, users can mark such clobbers as vsN*4, vsN*4+1,...,vsN*4+3. * `ap`: This is actually a pseudo-register in gcc/llvm. * `mq`: This register is only available on Power1 and Power2, and is not supported by llvm. --- cc @taiki-e r? @Amanieu @rustbot label +A-inline-assembly
This PR is a sub-part of rust-lang/rust#147996.
r? @Amanieu