|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 1/2] x86/entry: Introduce POP_GPRS
On 13.03.2024 15:26, Andrew Cooper wrote:
> The macro named RESTORE_ALL has several problems. It adjusts the stack
> pointer despite this not being clear to the caller. It also goes against
> recommendations in the optimisation guides because of trying to do too many
> things at once. (i.e. there's a reason why compilers don't emit code looking
> like this.)
Not anymore; I'm sure they used to over a certain period of time, which is
why 4d246723a85a ("x86: use MOV instead of PUSH/POP when saving/restoring
register state") was created in the first place (and which you now say was
a mistake, or at least has become a mistake in the over 10 years since then).
> Introduce a new POP_GPRS macro which only POPs GPRs. Use it for the HVM paths
> which are already using POPs.
>
> Also use it for restore_all_{xen,guest}(). This saves most of a cacheline
> worth of code from two fastpaths:
>
> add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-99 (-99)
> Function old new delta
> restore_all_guest 378 330 -48
> restore_all_xen 165 114 -51
>
> but it also avoids having an explicit modification to the stack pointer
> between %rsp-relative accesses, which avoids stalls in the stack-engine
> optimisations in some microarchitectures.
Is there such a rule? All I was able to find (and even that only with
quite a bit of effort, because the section it's in wouldn't have had
me think of a stack pointer rule being there) is
"Assembly/Compiler Coding Rule 22. (M impact, M generality) Avoid
putting explicit references to ESP in a sequence of stack operations
(POP, PUSH, CALL, RET)."
I actually wonder whether %rsp really is special in the way you
indicate - other registers, when used for memory accesses and being
updated ought to have a similar stall issue?
> --- a/xen/arch/x86/hvm/svm/entry.S
> +++ b/xen/arch/x86/hvm/svm/entry.S
> @@ -74,22 +74,9 @@ __UNLIKELY_END(nsvm_hap)
> ALTERNATIVE "", svm_vmentry_spec_ctrl, X86_FEATURE_SC_MSR_HVM
> ALTERNATIVE "", DO_SPEC_CTRL_DIV, X86_FEATURE_SC_DIV
>
> - pop %r15
> - pop %r14
> - pop %r13
> - pop %r12
> - pop %rbp
> mov VCPU_svm_vmcb_pa(%rbx),%rax
> - pop %rbx
> - pop %r11
> - pop %r10
> - pop %r9
> - pop %r8
> - pop %rcx /* Skip %rax: restored by VMRUN. */
> - pop %rcx
> - pop %rdx
> - pop %rsi
> - pop %rdi
> +
> + POP_GPRS rax=%rcx /* Skip %rax. It's restored by VMRUN. */
In light of you having asked my to try and decouple ABI and internal stack
frame layout, I'm wary of encoding a dependency on the ordering of registers
in the frame at a use site like this one. Imo the argument ought to merely
indicate "skip %rax", with the macro taking care of how that skipping is
actually carried out.
> @@ -696,20 +697,19 @@ UNLIKELY_END(exit_cr3)
> /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */
> SPEC_CTRL_EXIT_TO_XEN /* Req: %r12=ist_exit %r14=end %rsp=regs,
> Clob: abcd */
>
> - RESTORE_ALL adj=8
> + POP_GPRS
>
> /*
> * When the CPU pushed this exception frame, it zero-extended eflags.
> * For an IST exit, SPEC_CTRL_EXIT_TO_XEN stashed shadow copies of
> * spec_ctrl_flags and ver_sel above eflags, as we can't use any
> GPRs,
> * and we're at a random place on the stack, not in a CPUFINFO block.
> - *
> - * Account for ev/ec having already been popped off the stack.
> */
> SPEC_CTRL_COND_VERW \
> - scf=STK_REL(EFRAME_shadow_scf, EFRAME_rip), \
> - sel=STK_REL(EFRAME_shadow_sel, EFRAME_rip)
> + scf=STK_REL(EFRAME_shadow_scf, EFRAME_error_code), \
> + sel=STK_REL(EFRAME_shadow_sel, EFRAME_error_code)
>
> + add $8, %rsp
> iretq
How is this ADD different from the RESTORE_ALL one, in particular in light
of the ORM rule quoted above (which surely extends to IRET as well)? It
ought to be possible to avoid, by having POP_GPRS (optionally) move the
%r15 value into the error code slot first thing (i.e. before %rsp starts
being updated), and then having "pop %r15" last.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |