[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] Xen/MCE: adjust for future new vMCE model



Jan Beulich wrote:
>>>> On 05.07.12 at 05:30, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
>>>> wrote: @@ -68,7 +74,7 @@ 
>> 
>>  int vmce_restore_vcpu(struct vcpu *v, uint64_t caps)  {
>> -    if ( caps & ~g_mcg_cap & ~MCG_CAP_COUNT & ~MCG_CTL_P )
>> +    if ( caps & ~g_mcg_cap & ~MCG_CAP_COUNT )
>>      {
>>          dprintk(XENLOG_G_ERR, "%s restore: unsupported MCA
>>                  capabilities" " %#" PRIx64 " for d%d:v%u
>>          (supported: %#Lx)\n", @@ -77,6 +83,12 @@ return -EPERM;
>>      }
>> 
>> +    if ( (caps & MCG_CAP_COUNT) != GUEST_BANK_NUM ) +    {
>> +        dprintk(XENLOG_G_ERR, "Bank number inconsistency\n"); +    
>> return -EPERM; +    }
>> +
>>      v->arch.mcg_cap = caps;
>>      return 0;
>>  }
> 
> These two changes, as pointed out before, need some further
> thought and tweaking: As I said earlier, we are already shipping
> the code in its prior form, so outright rejecting MCG_CTL_P set
> and non-default bank counts is not a proper solution. Warning
> about them being in an unsupported state is certainly acceptable.
> 
> And I think the guest visible MCG_CAP value also shouldn't
> change across migration, so tolerating (but ignoring) a higher
> bank count seems necessary. Not sure what to do when the
> bank count is lower (0 or 1) - for 1, all communication to the
> guest should probably go through bank 0, while 0 should
> probably disable vMCE  for that vCPU.
> 
> Further I just noticed that you don't touch fill_vmsr_data() at
> all (sending patches created with diff's -p option or equivalent
> helps spotting where individual changes belong), yet that
> function uses the hardware bank number to fill the struct
> bank_entry. With the intended concept, the "bank" member
> of that structure can probably go away altogether.
> 
> Jan


Seems things become a little bit chaos, let's jump out from details, make 
agreement first about what's the general purpose of this middle-work patch.

============
1. current xen vmce status
  1.1) current xen vmce has 2 kinds of bugs:
    * functional bug: it actually doesn't work correctly for guest;
    * migration bug: partly solved by c/s 24887;
  1.2) c/s 24887 not in formal xen release version, it's a temporary fix (but 
unfortunately has been backported to SELS11 sp2)
============
2. target of this middle-work patch
  2.1) it's not used to address functional bug
  2.3) it does minimal work, just in order not to bring trouble/dirty to future 
new vMCE, so it would
    * remove MCG_CTL --> otherwise new vMCE have to add useless MCG_CTL_P and 
MCG_CTL, w/ potential model specific issue;
    * stick all 1's to MCi_CTL --> to avoid semantic issue;
    * set MCG_CTL_P=0 and 2 banks at MCG_CAP --> otherwise dirty code at new 
vMCE, like bank number issue;
============

I think in this middle-work patch, we don't need constrained by c/s 24887:
1. c/s 24887 not in formal xen release
2. the benefit of keep compatible w/ 24887 is minor:
  * currently all xen version have migration bug, 3.x -> 4.0 -> 4.1 -> 4.2
  * keep compatible w/ 24887 only benefit one case --> migration from SLES11 
sp2 to xen 4.2 (for both SLES11 sp2 and xen 4.2, vMCE actually doesn't 
functionally work fine to guest)
3. the price/hurt to future vMCE is high (to keep compatible w/ 24887)

Considering above, I prefer an outright cleanup in xen4.2, not constrained by 
c/s 24887 and not bring trouble/dirty to future vMCE.

Thanks,
Jinsong






_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.