[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3 for-4.15] x86/msr: introduce an option for compatible MSR behavior selection
On 09/03/2021 11:36, Jan Beulich wrote: > On 09.03.2021 11:56, Roger Pau Monne wrote: >> Introduce an option to allow selecting a behavior similar to the pre >> Xen 4.15 one for accesses to MSRs not explicitly handled. Since commit >> 84e848fd7a162f669 and 322ec7c89f6640e accesses to MSRs not explicitly >> handled by Xen result in the injection of a #GP to the guest. This >> is a behavior change since previously a #GP was only injected if >> accessing the MSR on the real hardware would also trigger a #GP, or if >> the attempted to be set bits wouldn't match the hardware values (for >> PV). >> >> This seems to be problematic for some guests, so introduce an option >> to fallback to this kind of legacy behavior without leaking the >> underlying MSR values to the guest. >> >> When the option is set, for both PV and HVM don't inject a #GP to the >> guest on MSR read if reading the underlying MSR doesn't result in a >> #GP, do the same for writes and simply discard the value to be written >> on that case. >> >> Note that for guests restored or migrated from previous Xen versions >> the option is enabled by default, in order to keep a compatible >> MSR behavior. Such compatibility is done at the libxl layer, to avoid >> higher-level toolstacks from having to know the details about this flag. >> >> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > I'm generally okay with this approach, but wouldn't want to give it > my R-b until it's at least clear it's not entirely unacceptable to > anyone else (Andrew in particular). I'm broadly happy with this approach. Some feedback just sent. > Couple of remarks: > >> Changes since v2: >> - Apply the option to both HVM and PV guest. >> - Handle both reads and writes. > I take it that it's intentional that you didn't allow separate read > and write control? I think v3 is the correct approach. This is strictly a backwards compatibility option. Splitting read and write just doubles the complexity the admin has wrestle with to recover a legacy guest. As we explicitly intend to retire the option in due course, this is very much a "make my guest work until my upstream (either Xen or kernel) can fix the bug". > >> --- a/xen/arch/x86/hvm/svm/svm.c >> +++ b/xen/arch/x86/hvm/svm/svm.c >> @@ -1795,6 +1795,7 @@ static int svm_msr_read_intercept(unsigned int msr, >> uint64_t *msr_content) >> const struct domain *d = v->domain; >> struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb; >> const struct nestedsvm *nsvm = &vcpu_nestedsvm(v); >> + uint64_t tmp; >> >> switch ( msr ) >> { > While to some degree a matter of taste, I think such variables needed > only inside a switch() and not needing an initializer would better > live in the respective switch()'s scope. Actually, I was hoping to make a CODING_SYTLE change removing this as a permitted construct. Recent compilers have hardening features to automatically initialise all stack variables, because even if your code isn't architecturally buggy, an attacker can launch speculative attacks the stack rubble. However, because of the way the compiler transform works, it cannot tolerate this specific construct in a switch statement, as there is no available execution to cope with the implicit "=0" or "=POISON". Even within Xen, we don't have many examples of this construct AFAICT, and there is a concrete security advantage to being able to support the compiler hardening. >> --- a/xen/arch/x86/pv/emul-priv-op.c >> +++ b/xen/arch/x86/pv/emul-priv-op.c >> @@ -875,6 +875,7 @@ static int read_msr(unsigned int reg, uint64_t *val, >> const struct domain *currd = curr->domain; >> const struct cpuid_policy *cp = currd->arch.cpuid; >> bool vpmu_msr = false; >> + uint64_t tmp; >> int ret; >> >> if ( (ret = guest_rdmsr(curr, reg, val)) != X86EMUL_UNHANDLEABLE ) >> @@ -986,6 +987,12 @@ static int read_msr(unsigned int reg, uint64_t *val, >> } >> /* fall through */ >> default: >> + if ( currd->arch.msr_relaxed && !rdmsr_safe(reg, tmp) ) >> + { >> + *val = 0; >> + return X86EMUL_OKAY; >> + } >> + >> gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", reg); >> break; >> >> @@ -1148,6 +1155,9 @@ static int write_msr(unsigned int reg, uint64_t val, >> } >> /* fall through */ >> default: >> + if ( currd->arch.msr_relaxed && !rdmsr_safe(reg, val) ) >> + return X86EMUL_OKAY; >> + >> gdprintk(XENLOG_WARNING, >> "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n", >> reg, val); > So what are your thoughts wrt my change to this file? Drop it > altogether and require people to use this new option? Or do you > see both coexist? In the latter case, since you had suggested > that I drop the write side of my change - does your changing of > the write path indicate you've changed your mind? I don't think we should legitimise buggy PV behaviour, either by codifying in the ABI, or providing a knob beyond this one. A guest blindly stumbling forward with a missed #GP could very well worse than crashing hard. Case in point - the 4.15 behaviour spotted a very serious bug in NetBSD where it tried writing MSR_PAT with its own choice (which wasn't the same as Xen's choice). The consequence of this bug is getting cache attributes in pagetables wrong. It is unfortunate that multiple bugs have combined to make this mess, but every instance needs investigating and fixing. Continuing to paper over the hole doesn't help anyone in the long run. ~Andrew
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |