Re: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per partition
On 11.01.2021 09:45, Paul Durrant wrote:
>> -----Original Message-----
>> From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
>> Sent: 09 January 2021 00:56
>> To: paul@xxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx
>> Cc: wl@xxxxxxx; iwj@xxxxxxxxxxxxxx; anthony.perard@xxxxxxxxxx; andrew.cooper3@xxxxxxxxxx;
>> george.dunlap@xxxxxxxxxx; jbeulich@xxxxxxxx; julien@xxxxxxx; sstabellini@xxxxxxxxxx;
>> roger.pau@xxxxxxxxxx
>> Subject: Re: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per partition
>>
>> On 08/01/2021 08:32, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
>>>> Sent: 08 January 2021 00:47
>>>> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
>>>> Cc: paul@xxxxxxx; wl@xxxxxxx; iwj@xxxxxxxxxxxxxx; anthony.perard@xxxxxxxxxx;
>>>> andrew.cooper3@xxxxxxxxxx; george.dunlap@xxxxxxxxxx; jbeulich@xxxxxxxx; julien@xxxxxxx;
>>>> sstabellini@xxxxxxxxxx; roger.pau@xxxxxxxxxx; Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
>>>> Subject: [PATCH 1/2] viridian: remove implicit limit of 64 VPs per partition
>>>>
>>>> TLFS 7.8.1 stipulates that "a virtual processor index must be less than
>>>> the maximum number of virtual processors per partition" that "can be obtained
>>>> through CPUID leaf 0x40000005". Furthermore, "Requirements for Implementing
>>>> the Microsoft Hypervisor Interface" defines that starting from Windows Server
>>>> 2012, which allowed more than 64 CPUs to be brought up, this leaf can now
>>>> contain the value -1, basically meaning the hypervisor has no restriction,
>>>> while 0 (what we currently expose) means the default restriction is still
>>>> present.
>>>>
>>>> Along with previous changes exposing ExProcessorMasks, this allows a recent
>>>> Windows VM with the Viridian extension enabled to have more than 64 vCPUs
>>>> without going into an immediate BSOD.
>>>>
>>>
>>> This is very odd, as I was happily testing with a 128 vCPU VM once
>>> ExProcessorMasks was implemented... no need for the extra leaf.
>>> The documentation for 0x40000005 states "Describes the scale limits supported
>>> in the current hypervisor implementation. If any value is zero, the hypervisor
>>> does not expose the corresponding information". It does not say (in section
>>> 7.8.1 or elsewhere AFAICT) what implications that has for VP_INDEX.
>>>
>>> In "Requirements for Implementing the Microsoft Hypervisor Interface" I don't
>>> see anything saying what the semantics of not implementing leaf 0x40000005
>>> are, only that if implementing it then -1 must be used to break the 64 VP
>>> limit. It also says that reporting -1 in 0x40000005 means that 40000004.EAX
>>> bits 1 and 2 *must* be clear, which is clearly not true if ExProcessorMasks
>>> is enabled. That document is dated June 13th 2012, so I think it should be
>>> taken with a pinch of salt.
>>>
>>> Have you actually observed a BSOD with >64 vCPUs when ExProcessorMasks is
>>> enabled? If so, which version of Windows? I'd like to get a repro myself.
>>
>> I return with more testing that confirms both my and your results.
>>
>> 1) with ExProcessorMasks and 66 vCPUs assigned, both the current WS19 build
>> and pre-release 20270 of vNext boot correctly with no changes
>>
>> That would be fine, but the existing documentation for ex_processor_masks
>> states that:
>> "Hence this enlightenment must be specified for guests with more
>> than 64 vCPUs if B<hcall_remote_tlb_flush> and/or B<hcall_ipi> are also
>> specified."
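>>
>> For illustration, that documented combination corresponds to a guest config
>> along these lines (option names as in xl.cfg(5); the vCPU count just mirrors
>> the test above, everything else is a placeholder):
>>
>>   vcpus = 66
>>   viridian = [ "defaults", "hcall_remote_tlb_flush", "hcall_ipi",
>>                "ex_processor_masks" ]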
>>
>> You then would expect a 64+ vCPU VM to boot without ex_processor_masks,
>> hcall_remote_tlb_flush and hcall_ipi.
>
> Indeed.
>
>> So,
>> 2) without ExProcessorMasks and 66 vCPUs assigned, and with
>> hcall_remote_tlb_flush and hcall_ipi disabled: WS19 BSODs and vNext fails
>> to initialize secondary CPUs
>>
>
> This is not what I recall from testing, but I can confirm I see the same now.
>
>> After applying my change,
>> 3) without ExProcessorMasks and 66 vCPUs assigned, and with
>> hcall_remote_tlb_flush and hcall_ipi disabled: WS19 and vNext boot correctly
>>
>> So we either need to change the documentation and require ExProcessorMasks
>> for all VMs with 64+ vCPUs, or go with my change.
>>
>
> I think we go with your change. I guess we can conclude then that the
> separate viridian flag as part of the base set is the right way to go (to
> stop the leaf magically appearing on migrate of existing guests) and leave
> ex_processor_masks (and the documentation) as-is.
>
> You can add my R-b to the patch.

That's the unchanged patch then, including the libxl change that I had asked
about and where I have to admit I don't fully follow Igor's responses? I'm
hesitant to give an ack for that aspect of the change, yet I suppose the libxl
maintainers will defer to x86 ones there. Alternatively Andrew or Roger could
of course ack this ...

Jan
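For context on the leaf under discussion, here is a minimal, self-contained
sketch (not the actual Xen patch; the no_vp_limit flag and the structures
around it are invented for the example) of how a hypervisor can report
"no VP limit" through CPUID leaf 0x40000005:

    #include <inttypes.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    struct cpuid_leaf {
        uint32_t a, b, c, d;
    };

    /*
     * Fill the Viridian "implementation limits" leaf (0x40000005).
     * EAX = 0 keeps the historical behaviour: Windows assumes the default
     * 64-VP-per-partition restriction.  EAX = -1 tells Windows Server 2012
     * and later that the hypervisor imposes no specific VP limit.
     */
    static void fill_leaf_40000005(struct cpuid_leaf *res, bool no_vp_limit)
    {
        res->a = no_vp_limit ? (uint32_t)-1 : 0;
        res->b = res->c = res->d = 0;
    }

    int main(void)
    {
        struct cpuid_leaf leaf;

        fill_leaf_40000005(&leaf, true);
        printf("CPUID.40000005.EAX = %#" PRIx32 "\n", leaf.a); /* 0xffffffff */
        return 0;
    }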