
[Xen-devel] NUMA guest PV enlightenment (was Re:[PATCH 00/11] PV NUMA Guests)



Hi,

(Sorry for the late reply, the mail had already scrolled out of the
window ;-) I will split the thread up to allow quicker and more focused
responses.

Cui, Dexuan wrote:
> Dulloor wrote:
> ...
> Hi Dulloor,
> In your patches, the toolstack tries to figure out the "best fit
> nodes" for a PV guest and invokes a hypercall set_domain_numa_layout
> to tell the hypervisor to remember the info, and later the PV guest
> invokes a hypercall get_domain_numa_layout to retrieve the info from
> the hypervisor.
> Can this be changed to: the toolstack writes the guest NUMA info
> directly into a new field in the start_info (or the share_info), maybe
> in the standard format of the SRAT/SLIT, and later the PV guest reads
> the info and uses acpi_numa_init() to parse it?  I think in this way
> the new hypercalls can be avoided and the PV NUMA enlightenment code
> in the guest kernel can be minimized.
> I'm asking this because this is how the HVM NUMA patches of Andre
> do it (the toolstack passes the info to hvmloader and the latter
> builds the SRAT/SLIT for the guest).

I think that is a fundamental difference between PV and HVM: in HVM you
naturally have to inject all the information, whereas a PV guest mostly
queries the information it needs. AFAICS the design of PV Linux is to
remove everything that is not absolutely necessary. I once also tried to
implement PV NUMA support, but gave up when I discovered that both NUMA
and ACPI were turned off in the then-recent PV kernels (read: kudos to
Dulloor ;-).

I like the ELF hint trick; it solves a big problem we have with HVM
guests: are they NUMA aware or not? If not, striping may be a better
option than insisting on the NUMA layout. For HVM guests, though, it is
almost impossible to know this beforehand.

So as far as this goes, I am OK with PV guests using hypercalls to query
the NUMA information. The only thing I would suggest is to leverage the
already existing guest NUMA code and actually provide the info in ACPI
SRAT/SLIT format. But one has to consider possible runtime changes to the
topology, as this is something that ACPI currently does not provide.
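
To make the SRAT/SLIT idea a bit more concrete, here is a rough sketch of
the kind of information such a layout would carry: memory ranges with a
node id (what SRAT memory affinity entries describe) and a node-to-node
distance matrix (what SLIT describes, where by ACPI convention a node's
distance to itself is 10). All struct and function names below are
hypothetical and only for illustration; this is neither the real ACPI
table layout nor anything from Dulloor's patches or the Xen ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VNUMA_MAX_NODES      8   /* hypothetical limit for this sketch */
#define VNUMA_LOCAL_DISTANCE 10  /* ACPI SLIT convention for node-to-self */

/* Simplified stand-in for an SRAT memory affinity entry (hypothetical). */
struct vnuma_mem_range {
    uint32_t node;    /* virtual node (proximity domain) owning the range */
    uint64_t base;    /* guest-physical start address */
    uint64_t length;  /* size of the range in bytes */
};

/* Simplified stand-in for SRAT + SLIT content (hypothetical). */
struct vnuma_layout {
    uint32_t nr_nodes;
    uint32_t nr_ranges;
    struct vnuma_mem_range ranges[VNUMA_MAX_NODES];
    /* Row-major nr_nodes x nr_nodes matrix, like the SLIT entries. */
    uint8_t distance[VNUMA_MAX_NODES * VNUMA_MAX_NODES];
};

/* Fill in a layout with equally sized nodes and one remote distance. */
void vnuma_init_uniform(struct vnuma_layout *l, uint32_t nr_nodes,
                        uint64_t bytes_per_node, uint8_t remote_distance)
{
    assert(l != NULL && nr_nodes >= 1 && nr_nodes <= VNUMA_MAX_NODES);

    l->nr_nodes = nr_nodes;
    l->nr_ranges = nr_nodes;
    for (uint32_t i = 0; i < nr_nodes; i++) {
        l->ranges[i].node = i;
        l->ranges[i].base = (uint64_t)i * bytes_per_node;
        l->ranges[i].length = bytes_per_node;
        for (uint32_t j = 0; j < nr_nodes; j++)
            l->distance[i * nr_nodes + j] =
                (i == j) ? VNUMA_LOCAL_DISTANCE : remote_distance;
    }
}

/* Return the node owning a guest-physical address, or -1 if none does. */
int vnuma_addr_to_node(const struct vnuma_layout *l, uint64_t addr)
{
    for (uint32_t i = 0; i < l->nr_ranges; i++) {
        const struct vnuma_mem_range *r = &l->ranges[i];
        if (addr >= r->base && addr - r->base < r->length)
            return (int)r->node;
    }
    return -1;
}

/* Return the distance between two nodes, or -1 for an invalid node id. */
int vnuma_distance(const struct vnuma_layout *l, uint32_t from, uint32_t to)
{
    if (from >= l->nr_nodes || to >= l->nr_nodes)
        return -1;
    return l->distance[from * l->nr_nodes + to];
}
```

A static table like this also shows the limitation mentioned above: there
is no field or mechanism in it to signal that the topology changed at
runtime, so the guest would have to be told to re-query by some other
means.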

Regards,
Andre.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

