
RE: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per partition



> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 11 January 2021 09:00
> To: paul@xxxxxxx
> Cc: wl@xxxxxxx; iwj@xxxxxxxxxxxxxx; anthony.perard@xxxxxxxxxx; 
> andrew.cooper3@xxxxxxxxxx;
> george.dunlap@xxxxxxxxxx; julien@xxxxxxx; sstabellini@xxxxxxxxxx; 
> roger.pau@xxxxxxxxxx; xen-
> devel@xxxxxxxxxxxxxxxxxxxx; 'Igor Druzhinin' <igor.druzhinin@xxxxxxxxxx>
> Subject: Re: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per 
> partition
> 
> On 11.01.2021 09:45, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> >> Sent: 09 January 2021 00:56
> >> To: paul@xxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx
> >> Cc: wl@xxxxxxx; iwj@xxxxxxxxxxxxxx; anthony.perard@xxxxxxxxxx; 
> >> andrew.cooper3@xxxxxxxxxx;
> >> george.dunlap@xxxxxxxxxx; jbeulich@xxxxxxxx; julien@xxxxxxx; 
> >> sstabellini@xxxxxxxxxx;
> >> roger.pau@xxxxxxxxxx
> >> Subject: Re: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per 
> >> partition
> >>
> >> On 08/01/2021 08:32, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>>> From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> >>>> Sent: 08 January 2021 00:47
> >>>> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> >>>> Cc: paul@xxxxxxx; wl@xxxxxxx; iwj@xxxxxxxxxxxxxx; 
> >>>> anthony.perard@xxxxxxxxxx;
> >>>> andrew.cooper3@xxxxxxxxxx; george.dunlap@xxxxxxxxxx; jbeulich@xxxxxxxx; 
> >>>> julien@xxxxxxx;
> >>>> sstabellini@xxxxxxxxxx; roger.pau@xxxxxxxxxx; Igor Druzhinin 
> >>>> <igor.druzhinin@xxxxxxxxxx>
> >>>> Subject: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per 
> >>>> partition
> >>>>
> >>>> TLFS 7.8.1 stipulates that "a virtual processor index must be less
> >>>> than the maximum number of virtual processors per partition" that
> >>>> "can be obtained through CPUID leaf 0x40000005". Furthermore,
> >>>> "Requirements for Implementing the Microsoft Hypervisor Interface"
> >>>> defines that, starting from Windows Server 2012 (which allowed more
> >>>> than 64 CPUs to be brought up), this leaf may contain a value of -1,
> >>>> meaning the hypervisor imposes no restriction, while 0 (what we
> >>>> currently expose) means the default restriction of 64 is still in
> >>>> place.
> >>>>
> >>>> Along with the previous changes exposing ExProcessorMasks, this
> >>>> allows a recent Windows VM with the Viridian extension enabled to
> >>>> have more than 64 vCPUs without hitting an immediate BSOD.
> >>>>
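[ For illustration, the semantics being debated here come down to what the
hypervisor reports in leaf 0x40000005 EAX/EBX. A minimal sketch of the
intended behaviour follows; the names and helper are placeholders, not
necessarily what the patch or the actual Xen code uses. ]

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Sketch only: how CPUID leaf 0x40000005 (implementation limits)
     * might be filled depending on whether the domain is allowed to
     * exceed the historic 64-VP limit.  Names are placeholders.
     */
    struct cpuid_leaf { uint32_t a, b, c, d; };

    static void fill_leaf_40000005(struct cpuid_leaf *res, bool no_vp_limit)
    {
        /*
         * EAX = max virtual processors, EBX = max logical processors.
         * 0 means "limit not exposed", which Windows treats as the
         * traditional 64-VP cap; -1 (all ones) tells Windows Server 2012
         * and later that no specific limit is imposed.
         */
        res->a = res->b = no_vp_limit ? UINT32_MAX : 0;
        res->c = res->d = 0;
    }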
> >>>
> >>> This is very odd as I was happily testing with a 128 vCPU VM once
> >>> ExProcessorMasks was implemented... no need for the extra leaf.
> >>> The documentation for 0x40000005 states "Describes the scale limits
> >>> supported in the current hypervisor implementation. If any value is
> >>> zero, the hypervisor does not expose the corresponding information".
> >>> It does not say (in section 7.8.1 or elsewhere AFAICT) what
> >>> implications that has for VP_INDEX.
> >>>
> >>> In "Requirements for Implementing the Microsoft Hypervisor Interface"
> >>> I don't see anything saying what the semantics of not implementing
> >>> leaf 0x40000005 are, only that if implementing it then -1 must be
> >>> used to break the 64 VP limit. It also says that reporting -1 in
> >>> 0x40000005 means that 40000004.EAX bits 1 and 2 *must* be clear,
> >>> which is clearly not true if ExProcessorMasks is enabled. That
> >>> document is dated June 13th 2012 so I think it should be taken with
> >>> a pinch of salt.
> >>>
> >>> Have you actually observed a BSOD with >64 vCPUs when
> >>> ExProcessorMasks is enabled? If so, which version of Windows? I'd
> >>> like to get a repro myself.
> >>
> >> I return with more testing, which confirms both my and your results.
> >>
> >> 1) with ExProcessorMasks and 66 vCPUs assigned, both the current WS19
> >> build and pre-release build 20270 of vNext boot correctly with no changes
> >>
> >> and that would be fine, but the existing documentation for
> >> ex_processor_masks states that:
> >> "Hence this enlightenment must be specified for guests with more
> >> than 64 vCPUs if B<hcall_remote_tlb_flush> and/or B<hcall_ipi> are also
> >> specified."
> >>
> >> You would then expect a 64+ vCPU VM to boot without ex_processor_masks,
> >> hcall_remote_tlb_flush and hcall_ipi.
> >
> > Indeed.
> >
> >>
> >> So,
> >> 2) without ExProcessorMasks and 66 vCPUs assigned, and with
> >> hcall_remote_tlb_flush and hcall_ipi disabled: WS19 BSODs and vNext
> >> fails to initialize secondary CPUs
> >>
> >
> > This is not what I recall from testing but I can confirm I see the same now.
> >
> >> After applying my change,
> >> 3) without ExProcessorMasks and 66 vCPUs assigned, and with
> >> hcall_remote_tlb_flush and hcall_ipi disabled: WS19 and vNext boot
> >> correctly
> >>
> >> So we either need to change the documentation and require
> >> ExProcessorMasks for all VMs with 64+ vCPUs, or go with my change.
> >>
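[ For reference, the failing scenario 2 above corresponds roughly to an
xl.cfg fragment like the one below; this is illustrative only (see
xl.cfg(5) for the viridian groups), with scenario 1 adding the extra
enlightenment groups. ]

    # Scenario 2: >64 vCPUs, default viridian set only (no
    # ex_processor_masks, no hcall_remote_tlb_flush, no hcall_ipi) --
    # the case that BSODs / fails without the proposed change.
    vcpus = 66
    viridian = [ "defaults" ]

    # Scenario 1: same vCPU count but with ExProcessorMasks (and the
    # hypercall enlightenments) enabled -- already boots correctly.
    # viridian = [ "defaults", "hcall_remote_tlb_flush", "hcall_ipi",
    #              "ex_processor_masks" ]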
> >
> > I think we should go with your change. I guess we can conclude then
> > that the separate viridian flag as part of the base set is the right
> > way to go (to stop the leaf magically appearing on migrate of existing
> > guests) and leave ex_processor_masks (and the documentation) as-is.
> >
> > You can add my R-b to the patch.
> 
> That's the unchanged patch then, including the libxl change that I had
> asked about and for which, I have to admit, I don't fully follow Igor's
> responses? I'm hesitant to give an ack for that aspect of the change,
> yet I suppose the libxl maintainers will defer to the x86 ones there.
> Alternatively Andrew or Roger could of course ack this ...
> 

I don't think we really need a specific control in xl.cfg, as this is a fix
for some poorly documented semantics in the spec. The flag simply prevents
the leaf from magically appearing on migrate, and I think that's enough.

  Paul

> Jan