RE: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per partition
> -----Original Message-----
> From: Paul Durrant <xadimgnik@xxxxxxxxx>
> Sent: 11 January 2021 09:10
> To: 'Jan Beulich' <jbeulich@xxxxxxxx>
> Cc: wl@xxxxxxx; iwj@xxxxxxxxxxxxxx; anthony.perard@xxxxxxxxxx; andrew.cooper3@xxxxxxxxxx;
> george.dunlap@xxxxxxxxxx; julien@xxxxxxx; sstabellini@xxxxxxxxxx; roger.pau@xxxxxxxxxx;
> xen-devel@xxxxxxxxxxxxxxxxxxxx; 'Igor Druzhinin' <igor.druzhinin@xxxxxxxxxx>
> Subject: RE: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per partition
>
> > -----Original Message-----
> > From: Jan Beulich <jbeulich@xxxxxxxx>
> > Sent: 11 January 2021 09:00
> > To: paul@xxxxxxx
> > Cc: wl@xxxxxxx; iwj@xxxxxxxxxxxxxx; anthony.perard@xxxxxxxxxx; andrew.cooper3@xxxxxxxxxx;
> > george.dunlap@xxxxxxxxxx; julien@xxxxxxx; sstabellini@xxxxxxxxxx; roger.pau@xxxxxxxxxx;
> > xen-devel@xxxxxxxxxxxxxxxxxxxx; 'Igor Druzhinin' <igor.druzhinin@xxxxxxxxxx>
> > Subject: Re: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per partition
> >
> > On 11.01.2021 09:45, Paul Durrant wrote:
> > >> -----Original Message-----
> > >> From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> > >> Sent: 09 January 2021 00:56
> > >> To: paul@xxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx
> > >> Cc: wl@xxxxxxx; iwj@xxxxxxxxxxxxxx; anthony.perard@xxxxxxxxxx; andrew.cooper3@xxxxxxxxxx;
> > >> george.dunlap@xxxxxxxxxx; jbeulich@xxxxxxxx; julien@xxxxxxx; sstabellini@xxxxxxxxxx;
> > >> roger.pau@xxxxxxxxxx
> > >> Subject: Re: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per partition
> > >>
> > >> On 08/01/2021 08:32, Paul Durrant wrote:
> > >>>> -----Original Message-----
> > >>>> From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> > >>>> Sent: 08 January 2021 00:47
> > >>>> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> > >>>> Cc: paul@xxxxxxx; wl@xxxxxxx; iwj@xxxxxxxxxxxxxx; anthony.perard@xxxxxxxxxx;
> > >>>> andrew.cooper3@xxxxxxxxxx; george.dunlap@xxxxxxxxxx; jbeulich@xxxxxxxx; julien@xxxxxxx;
> > >>>> sstabellini@xxxxxxxxxx; roger.pau@xxxxxxxxxx; Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> > >>>> Subject: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per partition
> > >>>>
> > >>>> TLFS 7.8.1 stipulates that "a virtual processor index must be less than
> > >>>> the maximum number of virtual processors per partition", which "can be
> > >>>> obtained through CPUID leaf 0x40000005". Furthermore, "Requirements for
> > >>>> Implementing the Microsoft Hypervisor Interface" defines that, starting
> > >>>> from Windows Server 2012 (which allowed more than 64 CPUs to be brought
> > >>>> up), this leaf can now contain the value -1, basically meaning the
> > >>>> hypervisor has no restriction, while 0 (which we currently expose) means
> > >>>> the default restriction is still present.
> > >>>>
> > >>>> Along with the previous changes exposing ExProcessorMasks, this allows a
> > >>>> recent Windows VM with the Viridian extension enabled to have more than
> > >>>> 64 vCPUs without going into an immediate BSOD.
> > >>>>
> > >>>
> > >>> This is very odd, as I was happily testing with a 128 vCPU VM once
> > >>> ExProcessorMasks was implemented... no need for the extra leaf.
> > >>>
> > >>> The documentation for 0x40000005 states "Describes the scale limits
> > >>> supported in the current hypervisor implementation. If any value is zero,
> > >>> the hypervisor does not expose the corresponding information". It does not
> > >>> say (in section 7.8.1 or elsewhere AFAICT) what implications that has for
> > >>> VP_INDEX.
> > >>>
> > >>> In "Requirements for Implementing the Microsoft Hypervisor Interface" I
> > >>> don't see anything saying what the semantics of not implementing leaf
> > >>> 0x40000005 are, only that if implementing it then -1 must be used to break
> > >>> the 64 VP limit. It also says that reporting -1 in 0x40000005 means that
> > >>> 40000004.EAX bits 1 and 2 *must* be clear, which is clearly not true if
> > >>> ExProcessorMasks is enabled. That document is dated June 13th 2012, so I
> > >>> think it should be taken with a pinch of salt.
> > >>>
> > >>> Have you actually observed a BSOD with >64 vCPUs when ExProcessorMasks is
> > >>> enabled? If so, which version of Windows? I'd like to get a repro myself.
> > >>
> > >> I return with more testing that confirms both my and your results.
> > >>
> > >> 1) with ExProcessorMasks and 66 vCPUs assigned, both the current WS19 build
> > >> and pre-release 20270 of vNext boot correctly with no changes
> > >>
> > >> and that would be fine, but the existing documentation for ex_processor_masks
> > >> states that:
> > >> "Hence this enlightenment must be specified for guests with more
> > >> than 64 vCPUs if B<hcall_remote_tlb_flush> and/or B<hcall_ipi> are also
> > >> specified."
> > >>
> > >> You would then expect a 64+ vCPU VM to boot without ex_processor_masks,
> > >> hcall_remote_tlb_flush and hcall_ipi.
> > >
> > > Indeed.
> > >
> > >> So,
> > >> 2) without ExProcessorMasks and 66 vCPUs assigned, and with
> > >> hcall_remote_tlb_flush and hcall_ipi disabled: WS19 BSODs and vNext fails
> > >> to initialize secondary CPUs
> > >>
> > >
> > > This is not what I recall from testing, but I can confirm I see the same now.
> > >
> > >> After applying my change,
> > >> 3) without ExProcessorMasks and 66 vCPUs assigned, and with
> > >> hcall_remote_tlb_flush and hcall_ipi disabled: WS19 and vNext boot correctly
> > >>
> > >> So we either need to change the documentation and require ExProcessorMasks
> > >> for all VMs with 64+ vCPUs, or go with my change.
> > >>
> > >
> > > I think we go with your change. I guess we can conclude then that the
> > > separate viridian flag as part of the base set is the right way to go (to
> > > stop the leaf magically appearing on migrate of existing guests) and leave
> > > ex_processor_masks (and the documentation) as-is.
> > >
> > > You can add my R-b to the patch.
> >
> > That's the unchanged patch then, including the libxl change that I had asked
> > about and where I have to admit I don't fully follow Igor's responses? I'm
> > hesitant to give an ack for that aspect of the change, yet I suppose the
> > libxl maintainers will defer to x86 ones there. Alternatively Andrew or
> > Roger could of course ack this ...
> >
>
> I don't think we really need specific control in xl.cfg as this is a fix for
> some poorly documented semantics in the spec. The flag simply prevents the
> leaf magically appearing on migrate and I think that's enough.

... although adding an option in xl/libxl isn't that much work, I suppose. Igor,
would you be ok plumbing it through?

  Paul

>
>   Paul
>
> > Jan
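
For readers following the CPUID details above: the disagreement reduces to what EAX
of leaf 0x40000005 (maximum number of virtual processors) reports to the guest. Below
is a minimal sketch of that choice only; it is not the actual Xen code, and the
function and parameter names are made up for illustration.

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Illustrative only: how EAX of CPUID leaf 0x40000005 could be chosen.
     * As discussed in the thread, leaving the value at 0 keeps Windows on the
     * default 64 VP limit, while reporting -1 (all ones) tells Windows Server
     * 2012 and later that the hypervisor imposes no VP limit.
     */
    static uint32_t viridian_max_vp_report(bool no_vp_limit)
    {
        return no_vp_limit ? UINT32_MAX : 0;
    }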
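
The enlightenments named in the test matrix above are selected per guest through the
viridian list in xl.cfg. A sketch of a configuration matching test case (1) follows;
the vCPU count and memory size are illustrative placeholders, and only the viridian
group names are taken from the discussion above.

    # Illustrative xl.cfg fragment for a >64 vCPU Windows guest.
    type     = "hvm"
    vcpus    = 66
    memory   = 8192
    viridian = [ "base", "ex_processor_masks", "hcall_remote_tlb_flush", "hcall_ipi" ]

Test cases (2) and (3) correspond to dropping ex_processor_masks, hcall_remote_tlb_flush
and hcall_ipi from that list.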