Re: [PATCH v3 03/22] x86/xstate: re-size save area when CPUID policy changes
On 22/04/2021 15:44, Jan Beulich wrote:
> vCPU-s get maximum size areas allocated initially. Hidden (and in
> particular default-off) features may allow for a smaller size area to
> suffice.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> v2: Use 1ul instead of 1ull. Re-base.
> ---
> This could be further shrunk if we used XSAVEC / if we really used
> XSAVES, as then we don't need to also cover the holes. But since we
> currently use neither of the two in reality, this would require more
> work than just adding the alternative size calculation here.
>
> Seeing that both vcpu_init_fpu() and cpuid_policy_updated() get called
> from arch_vcpu_create(), I'm not sure we really need this two-stage
> approach - the slightly longer period of time during which
> v->arch.xsave_area would remain NULL doesn't look all that problematic.
> But since xstate_alloc_save_area() gets called for idle vCPU-s, it has
> to stay anyway in some form, so the extra code churn may not be worth
> it.
>
> Instead of cpuid_policy_xcr0_max(), cpuid_policy_xstates() may be the
> interface to use here. But it remains to be determined whether the
> xcr0_accum field is meant to be inclusive of XSS (in which case it would
> better be renamed) or exclusive. Right now there's no difference as we
> don't support any XSS-controlled features.

I've been figuring out what we need to do for supervisor states. The
current code is not in good shape, but I also think some of the changes
in this series take us in an unhelpful direction.

I've got a cleanup series which I will post shortly. It interacts
textually, although not fundamentally, with this series, and fixes
several issues.

For supervisor states, we need to use XSAVES unilaterally, even for PV.
This is because XSS_CET_S needs to be the HVM kernel's context, or
Xen's in PV context (specifically, MSR_PL0_SSP, which is the shstk
equivalent of TSS.RSP0). A consequence is that Xen's data handling
shall use the compressed format, and include supervisor states. (While
in principle we could manage CET_S, CET_U, and potentially PT when
vmtrace gets expanded, each WRMSR there is of a similar order of
magnitude to an XSAVES/XRSTORS instruction.)

I'm planning a host xss setting, similar to mmu_cr4_features, which
shall be the setting in context for everything other than HVM vcpus
(which need the guest setting in context, and/or the VT-x bodge to
support host-only states). Amongst other things, all context switch
paths, including from-HVM, need to step XSS up to the host setting to
let XSAVES function correctly.

However, a consequence of this is that the size of the xsave area needs
deriving from the host as well as the guest-max states. I.e. even if
some VMs aren't using CET, we still need space in the xsave areas to
function correctly when a single VM is using CET.

Another consequence is that we need to rethink our hypercall behaviour.
There is no such thing as supervisor states in an uncompressed XSAVE
image, which means we can't continue with that being the ABI.

I've also found some substantial issues with how we handle
xcr0/xcr0_accum and plan to address these. There is no such thing as
xcr0 without the bottom bit set, ever, and xcr0_accum needs to default
to X87|SSE, seeing as that's how we use it anyway. However, on the
context switch path, I expect we'll still be using
xcr0_accum | host_xss.
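A rough sketch of the save half of such a path, to make the intent
concrete (host_xss, set_msr_xss(), set_xcr0() and the xsaves() wrapper
below are stand-in names for illustration, not existing interfaces):

/*
 * Illustrative sketch only.  host_xss, set_msr_xss(), set_xcr0() and
 * xsaves() are placeholder names, not the real interfaces.
 */
static void ctxt_switch_save_xstate(struct vcpu *v)
{
    /*
     * Step XSS up to the host setting so XSAVES can see every
     * supervisor component we may have dirtied (e.g. CET_S).
     */
    set_msr_xss(host_xss);

    /*
     * XCR0 needs to cover every user component the vCPU has ever
     * enabled, i.e. xcr0_accum, not just its current xcr0.
     */
    set_xcr0(v->arch.xcr0_accum);

    /* Requested-feature bitmap: accumulated user states plus host XSS. */
    xsaves(v->arch.xsave_area, v->arch.xcr0_accum | host_xss);
}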
In terms of actual context switching, we want to be using
XSAVES/XRSTORS whenever they are available, even if we're not using
supervisor states. XSAVES has both the in-use and modified
optimisations, without the broken consequences of XSAVEOPT (which is
firmly in the "don't ever use this" bucket now).

There's no point ever using XSAVEC. There is no hardware where it
exists in the absence of XSAVES, and there can't be, even in
theoretical circumstances, due to the (perhaps unintentional) linkage
of the CPUID data. XSAVEC also doesn't use the modified optimisation,
and is therefore strictly worse than XSAVES, even when MSR_XSS is 0.

Therefore, our choice of instruction wants to be XSAVES, or XSAVE, or
FXSAVE, depending on hardware capability - something along the lines of
the sketch below.

~Andrew
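(A minimal sketch of that selection, assuming cpu_has_xsaves /
cpu_has_xsave predicates in the usual cpu_has_* style, and with
fpu_xsaves()/fpu_xsave()/fpu_fxsave() as invented wrappers around the
raw instructions:)

static void save_fpu_state(struct vcpu *v)
{
    if ( cpu_has_xsaves )
        /* Compressed format; in-use and modified optimisations. */
        fpu_xsaves(v);
    else if ( cpu_has_xsave )
        /* Uncompressed format; no XSAVEC, no XSAVEOPT. */
        fpu_xsave(v);
    else
        /* Legacy x87/SSE state only. */
        fpu_fxsave(v);
}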