[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/2] Xen/MCA: bugfix for mca bank clear


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
  • Date: Thu, 28 Feb 2013 08:57:01 +0000
  • Accept-language: en-US
  • Cc: "Ren, Yongjie" <yongjie.ren@xxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxx>
  • Delivery-date: Thu, 28 Feb 2013 08:57:26 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: AQHOFYpkXArdsC0/T2GFEhh+i5sJSZiO87bQ
  • Thread-topic: [PATCH 1/2] Xen/MCA: bugfix for mca bank clear

Jan Beulich wrote:
>>>> On 27.02.13 at 17:44, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>> Xen/MCA: bugfix for mca bank clear
>> 
>> mcabank_clear() is common code for real h/w mca and s/w simulated
>> mca. Under s/w simulated case, getting status via mca_rdmsr may
>> trigger #GP if MCx_ADDR/MISC not supported by real h/w.
>> 
>> This patch fix the bug. It always invalidates intpose for s/w
>> simulated mca case, and do check real h/w status ADDRV/MISCV to
>> avoid #GP when clear MCx_ADDR/MISC for h/w mca case.
>> 
>> Reported-by: Ren Yongjie <yongjie.ren@xxxxxxxxx>
>> Singed-off-by: Liu Jinsong <jinsong.liu@xxxxxxxxx>
>> 
>> diff -r e84a79d11d7a xen/arch/x86/cpu/mcheck/mce.c
>> --- a/xen/arch/x86/cpu/mcheck/mce.c  Thu Nov 01 01:41:03 2012 +0800
>> +++ b/xen/arch/x86/cpu/mcheck/mce.c  Thu Feb 28 08:05:15 2013 +0800
>>      @@ -138,17 +138,28 @@ xfree(banks);
>>  }
>> 
>> +/*
>> + * Common code for real h/w mca and s/w simulated mca.
>> + * Always invalidate intpose for s/w simulated mca case, and do
>> check + * real h/w status ADDRV/MISCV to avoid #GP when clear
>> MCx_ADDR/MISC + * for h/w mca case. + */
>>  static void mcabank_clear(int banknum)
>>  {
>> -    uint64_t status;
>> +    uint64_t hw_status;
>> 
>> -    status = mca_rdmsr(MSR_IA32_MCx_STATUS(banknum));
>> +    /* For s/w simulated mca case */
>> +    intpose_inval(smp_processor_id(), MSR_IA32_MCx_ADDR(banknum));
>> +    intpose_inval(smp_processor_id(), MSR_IA32_MCx_MISC(banknum));
> 
> And no invalidation of MSR_IA32_MCx_STATUS(banknum)?
> 
>> 
>> -    if (status & MCi_STATUS_ADDRV)
>> -        mca_wrmsr(MSR_IA32_MCx_ADDR(banknum), 0x0ULL);
>> -    if (status & MCi_STATUS_MISCV)
>> -        mca_wrmsr(MSR_IA32_MCx_MISC(banknum), 0x0ULL);
>> +    /* For real h/w mca case */
>> +    rdmsrl(MSR_IA32_MCx_STATUS(banknum), hw_status);
>> +    if (hw_status & MCi_STATUS_ADDRV)
>> +        wrmsrl(MSR_IA32_MCx_ADDR(banknum), 0x0ULL);
>> +    if (hw_status & MCi_STATUS_MISCV)
>> +        wrmsrl(MSR_IA32_MCx_MISC(banknum), 0x0ULL);
>> 
>> +    /* For both cases */
>>      mca_wrmsr(MSR_IA32_MCx_STATUS(banknum), 0x0ULL);
> 
> Oh, that happens as a side effect of this one. Not really obvious,
> so likely worth a comment.

OK.

> 
>>  }
>> 
> 
> But anyway, I don't think that's quite right either: For one, I'm not
> sure the ordering of the MSR writes can be freely exchanged
> (originally you cleared STATUS first, while now it's done last).

OK, adjust sequence strictly according to SDM.

> 
> And then I don't see why you would do physical MSR accesses in
> the injection case at all. It should really be mca_wrmsr() to take
> care of this, it just needs to have a way to know it got called in
> the context of an injection. One fundamental question here is
> why the invalidation of the interposed data is being done
> before the MSR write rather than after. One thing we surely
> don't care much about in the injection logic is interference with
> a real #MC.
> 
> Jan

A way (say, a global bool flag) could be used to indicate s/w injection.
Our concern here is, if s/w inject occur while real h/w mce, it may be race 
condition.
But if you think it's OK to tolerate it, it's OK for me.

Thanks,
Jinsong
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.