Re: [PATCH v2 1/2][4.15] x86/PV: conditionally avoid raising #GP for early guest MSR reads
On Fri, Mar 05, 2021 at 10:50:34AM +0100, Jan Beulich wrote:
> Prior to 4.15, Linux, when running in PV mode, did not install a #GP
> handler early enough to cover for example the rdmsrl_safe() of
> MSR_K8_TSEG_ADDR in bsp_init_amd() (not to speak of the unguarded read
> of MSR_K7_HWCR later in the same function). The respective change
> (42b3a4cb5609 "x86/xen: Support early interrupts in xen pv guests") was
> backported to 4.14, but no further - presumably since it wasn't really
> easy because of other dependencies.
>
> Therefore, to prevent our change in the handling of guest MSR accesses
> from rendering PV Linux 4.13 and older unusable on at least AMD
> systems, make the raising of #GP on these paths conditional upon the
> guest having installed a handler, provided of course the MSR can be
> read in the first place (we would have raised #GP in that case even
> before). Producing zero for reads isn't necessarily correct and may
> trip code trying to detect presence of MSRs early, but since such
> detection logic won't work without a #GP handler anyway, this ought to
> be a fair workaround.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> v2: Probe MSR read. Exclude hypervisor range. Avoid issuing two log
>     messages (in debug builds). Don't alter WRMSR behavior.
> ---
> I'm not convinced we can get away without also making the WRMSR path
> somewhat more permissive again, e.g. tolerating attempts to set bits
> which are already set. But of course this would require keeping in
> sync for which MSRs we "fake" reads, as then a kernel attempt to set a
> bit may also appear as an attempt to clear others (because of the zero
> value that we gave it for the read).

The above approach seems dangerous, as it could allow a guest to figure
out the value of the underlying MSR by probing which written values
trigger a #GP.

I think we want to do something similar to what we do on HVM in 4.14
and previous versions: ignore writes as long as the rdmsr of the target
MSR succeeds, regardless of the value being written.

>
> --- a/xen/arch/x86/pv/emul-priv-op.c
> +++ b/xen/arch/x86/pv/emul-priv-op.c
> @@ -874,7 +874,7 @@ static int read_msr(unsigned int reg, ui
>      struct vcpu *curr = current;
>      const struct domain *currd = curr->domain;
>      const struct cpuid_policy *cp = currd->arch.cpuid;
> -    bool vpmu_msr = false;
> +    bool vpmu_msr = false, warn = false;
>      int ret;
>
>      if ( (ret = guest_rdmsr(curr, reg, val)) != X86EMUL_UNHANDLEABLE )
> @@ -882,7 +882,7 @@ static int read_msr(unsigned int reg, ui
>          if ( ret == X86EMUL_EXCEPTION )
>              x86_emul_hw_exception(TRAP_gp_fault, 0, ctxt);
>
> -        return ret;
> +        goto done;
>      }
>
>      switch ( reg )
> @@ -986,7 +986,7 @@ static int read_msr(unsigned int reg, ui
>          }
>          /* fall through */
>      default:
> -        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", reg);
> +        warn = true;
>          break;
>
>      normal:
> @@ -995,7 +995,19 @@ static int read_msr(unsigned int reg, ui
>          return X86EMUL_OKAY;
>      }
>
> -    return X86EMUL_UNHANDLEABLE;
> + done:

Won't this handling be better placed in the 'default' switch case
above?

> +    if ( ret != X86EMUL_OKAY && !curr->arch.pv.trap_ctxt[X86_EXC_GP].address &&
> +         (reg >> 16) != 0x4000 && !rdmsr_safe(reg, *val) )

We didn't use to care about explicitly blocking the reserved MSR range;
do we really want to do it now? I'm not sure I see an issue with that,
but given that we are trying to bring back something similar to the
previous behavior, I would avoid adding new conditions.

Thanks, Roger.
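
[Editorial note] For concreteness, the 4.14-HVM-like fallback Roger
suggests could look something along the lines of the sketch below,
sitting at the end of write_msr() in xen/arch/x86/pv/emul-priv-op.c.
This is only an illustration of the idea, not part of the posted patch:
the exact placement, the reuse of the read_msr() condition, the "temp"
scratch variable, and the log message are all assumptions.

    /*
     * Sketch only (not from the posted patch): an HVM-4.14-like
     * fallback for the WRMSR path, mirroring the read_msr() change
     * quoted above.  "temp" is a scratch uint64_t assumed to be
     * declared in write_msr().  If the guest has not installed a #GP
     * handler and the MSR is readable on the host, drop the write
     * instead of raising #GP, regardless of the value being written.
     */
    if ( ret != X86EMUL_OKAY && !curr->arch.pv.trap_ctxt[X86_EXC_GP].address &&
         !rdmsr_safe(reg, temp) )
    {
        gdprintk(XENLOG_WARNING,
                 "faking WRMSR 0x%08x success (no #GP handler installed)\n",
                 reg);
        ret = X86EMUL_OKAY;
    }

    return ret;

Note that the written value is deliberately discarded rather than
applied or reflected back, which avoids the information leak Roger
describes for the "tolerate setting already-set bits" approach.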