[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 1/2][4.15] x86/PV: conditionally avoid raising #GP for early guest MSR reads
On 08.03.2021 09:56, Roger Pau Monné wrote: > On Fri, Mar 05, 2021 at 10:50:34AM +0100, Jan Beulich wrote: >> Prior to 4.15 Linux, when running in PV mode, did not install a #GP >> handler early enough to cover for example the rdmsrl_safe() of >> MSR_K8_TSEG_ADDR in bsp_init_amd() (not to speak of the unguarded read >> of MSR_K7_HWCR later in the same function). The respective change >> (42b3a4cb5609 "x86/xen: Support early interrupts in xen pv guests") was >> backported to 4.14, but no further - presumably since it wasn't really >> easy because of other dependencies. >> >> Therefore, to prevent our change in the handling of guest MSR accesses >> to render PV Linux 4.13 and older unusable on at least AMD systems, make >> the raising of #GP on this paths conditional upon the guest having >> installed a handler, provided of course the MSR can be read in the first >> place (we would have raised #GP in that case even before). Producing >> zero for reads isn't necessarily correct and may trip code trying to >> detect presence of MSRs early, but since such detection logic won't work >> without a #GP handler anyway, this ought to be a fair workaround. >> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> >> --- >> v2: Probe MSR read. Exclude hypervisor range. Avoid issuing two log >> messages (in debug builds). Don't alter WRMSR behavior. >> --- >> I'm not convinced we can get away without also making the WRMSR path >> somewhat more permissive again, e.g. tolerating attempts to set bits >> which are already set. But of course this would require keeping in sync >> for which MSRs we "fake" reads, as then a kernel attempt to set a bit >> may also appear as an attempt to clear others (because of the zero value >> that we gave it for the read). > > The above approach seems dangerous, as it could allow a guest to > figure out the value of the underlying MSR by probing whether values > trigger a #GP? Perhaps, yes. But what do you do? There's potentially a huge value range to probe ... > I think we want to do something similar to what we do on HVM in 4.14 > and previous versions: ignore writes as long as the rdmsr to the > target MSR succeeds, regardless of the value. Which, as said elsewhere, has its own downsides - writable MSRs don't need to also be readable. See e.g. AMD's proposed PARTIAL_{FS,GS}_LOAD MSRs. >> --- a/xen/arch/x86/pv/emul-priv-op.c >> +++ b/xen/arch/x86/pv/emul-priv-op.c >> @@ -874,7 +874,7 @@ static int read_msr(unsigned int reg, ui >> struct vcpu *curr = current; >> const struct domain *currd = curr->domain; >> const struct cpuid_policy *cp = currd->arch.cpuid; >> - bool vpmu_msr = false; >> + bool vpmu_msr = false, warn = false; >> int ret; >> >> if ( (ret = guest_rdmsr(curr, reg, val)) != X86EMUL_UNHANDLEABLE ) >> @@ -882,7 +882,7 @@ static int read_msr(unsigned int reg, ui >> if ( ret == X86EMUL_EXCEPTION ) >> x86_emul_hw_exception(TRAP_gp_fault, 0, ctxt); >> >> - return ret; >> + goto done; >> } >> >> switch ( reg ) >> @@ -986,7 +986,7 @@ static int read_msr(unsigned int reg, ui >> } >> /* fall through */ >> default: >> - gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", reg); >> + warn = true; >> break; >> >> normal: >> @@ -995,7 +995,19 @@ static int read_msr(unsigned int reg, ui >> return X86EMUL_OKAY; >> } >> >> - return X86EMUL_UNHANDLEABLE; >> + done: > > Won't this handling be better placed in the 'default' switch case > above? No - see the "goto done" added near the top of the function. >> + if ( ret != X86EMUL_OKAY && >> !curr->arch.pv.trap_ctxt[X86_EXC_GP].address && >> + (reg >> 16) != 0x4000 && !rdmsr_safe(reg, *val) ) > > We didn't used to care about explicitly blocking the reserved MSR > range, do we really wan to do it now? > > I'm not sure I see an issue with that, but given that we are trying to > bring back something similar to the previous behavior, I would avoid > adding new conditions. What I'm particularly trying to avoid here is to allow information from an underlying hypervisor to "shine through", even if it's only information as to whether a certain MSR is readable. It should be solely our own selection which MSRs in this range a guest is able to (appear to) read. Plus of course by avoiding the rdmsr_safe() in this case we also avoid the unnecessary (debug only) log message associated with such attempted reads. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |