[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: xen 4.15.5: msr_relaxed required for MSR 0x1a2
On Mon, Nov 20, 2023 at 08:27:36AM +0000, James Dingwall wrote: > On Fri, Nov 17, 2023 at 10:56:30AM +0100, Jan Beulich wrote: > > On 17.11.2023 10:18, James Dingwall wrote: > > > On Thu, Nov 16, 2023 at 04:32:47PM +0000, Andrew Cooper wrote: > > >> On 16/11/2023 4:15 pm, James Dingwall wrote: > > >>> Hi, > > >>> > > >>> Per the msr_relaxed documentation: > > >>> > > >>> "If using this option is necessary to fix an issue, please report a > > >>> bug." > > >>> > > >>> After recently upgrading an environment from Xen 4.14.5 to Xen 4.15.5 we > > >>> started experiencing a BSOD at boot with one of our Windows guests. We > > >>> found > > >>> that enabling `msr_relaxed = 1` in the guest configuration has resolved > > >>> the > > >>> problem. With a debug build of Xen and `hvm_debug=2048` on the command > > >>> line > > >>> the following messages were caught as the BSOD happened: > > >>> > > >>> (XEN) [HVM:11.0] <vmx_msr_read_intercept> ecx=0x1a2 > > >>> (XEN) vmx.c:3298:d11v0 RDMSR 0x000001a2 unimplemented > > >>> (XEN) d11v0 VIRIDIAN CRASH: 1e ffffffffc0000096 fffff80b8de81eb5 0 0 > > >>> > > >>> I found that MSR 0x1a2 is MSR_TEMPERATURE_TARGET and from that this > > >>> patch > > >>> series from last month: > > >>> > > >>> https://patchwork.kernel.org/project/xen-devel/list/?series=796550 > > >>> > > >>> Picking out just a small part of that fixes the problem for us. > > >>> Although the > > >>> the patch is against 4.15.5 I think it would be relevant to more recent > > >>> releases too. > > >> > > >> Which version of Windows, and what hardware? > > >> > > >> The Viridian Crash isn't about the RDMSR itself - it's presumably > > >> collateral damage shortly thereafter. > > >> > > >> Does filling in 0 for that MSR also resolve the issue? It's model > > >> specific and we absolutely cannot pass it through from real hardware > > >> like that. > > >> > > > > > > Hi Andrew, > > > > > > Thanks for your response. The guest is running Windows 10 and the crash > > > happens in a proprietary hardware driver. A little bit of knowledge as > > > they say was enough to stop the crash but I don't understand the impact > > > of what I've actually done... > > > > > > To rework the patch I'd need a bit of guidance, if I understand your > > > suggestion I set the MSR to 0 with this change in emul-priv-op.c: > > > > For the purpose of the experiment suggested by Andrew ... > > > > > diff --git a/xen/arch/x86/pv/emul-priv-op.c > > > b/xen/arch/x86/pv/emul-priv-op.c > > > index ed97b1d6fcc..66f5e417df6 100644 > > > --- a/xen/arch/x86/pv/emul-priv-op.c > > > +++ b/xen/arch/x86/pv/emul-priv-op.c > > > @@ -976,6 +976,10 @@ static int read_msr(unsigned int reg, uint64_t *val, > > > *val = 0; > > > return X86EMUL_OKAY; > > > > > > + case MSR_TEMPERATURE_TARGET: > > > + *val = 0; > > > + return X86EMUL_OKAY; > > > + > > > case MSR_P6_PERFCTR(0) ... MSR_P6_PERFCTR(7): > > > case MSR_P6_EVNTSEL(0) ... MSR_P6_EVNTSEL(3): > > > case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR2: > > > > ... you wouldn't need this (affects PV domains only), and ... > > > > > and this in vmx.c: > > > > > > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c > > > index 54023a92587..bbf37b7f272 100644 > > > --- a/xen/arch/x86/hvm/vmx/vmx.c > > > +++ b/xen/arch/x86/hvm/vmx/vmx.c > > > @@ -3259,6 +3259,11 @@ static int vmx_msr_read_intercept(unsigned int > > > msr, uint64_t *msr_content) > > > if ( !nvmx_msr_read_intercept(msr, msr_content) ) > > > goto gp_fault; > > > break; > > > + > > > + case MSR_TEMPERATURE_TARGET: > > > + *msr_content = 0; > > > + break; I think the preference now is to add such handling directly in guest_rdmsr()? Protected with a: if ( !(cp->x86_vendor & (X86_VENDOR_INTEL)) ) goto gp_fault; Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |