
Re: [Xen-devel] [PATCH v6 02/23] xen: move NUMA_NO_NODE to public memory.h as XEN_NUMA_NO_NODE



>>> On 02.03.15 at 19:19, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 02/03/15 17:52, Ian Campbell wrote:
>> On Mon, 2015-03-02 at 17:48 +0000, David Vrabel wrote:
>>> On 02/03/15 17:43, Andrew Cooper wrote:
>>>> On 02/03/15 17:34, David Vrabel wrote:
>>>>> A guest that previously had 2 vNUMA nodes is migrated to a host with
>>>>> only 1 pNUMA node.  It should still have 2 vNUMA nodes.
>>>> A natural consequence of vNUMA is that the guest must expect the vNUMA
>>>> layout to change across suspend/resume.  The toolstack cannot guarantee
>>>> that it can construct a similar vNUMA layout after a migration.  This
>>>> includes the toolstack indicating that it was unable to make any useful
>>>> NUMA affinity with the memory ranges.
>>> Eep!  I very much doubt we can do anything in Linux except retain the
>>> existing NUMA layout across a save/restore.
>> In the case you mention above I would expect the 2 vnuma nodes to just
>> point to the same single pnuma node.
>>
>> As such I think it's probably not relevant to the need for
>> XEN_NUMA_NO_NODE?
>>
>> Or is that not what would be expected?
> 
> If we were to go down that route, the toolstack would need a way of
> signalling "this vNUMA node does not contain memory on a single pNUMA
> node" if there was insufficient free space to make the allocation.

That's quite the opposite of the example above: When collapsing 2
nodes to just one, there's no problem representing things - as Ian
says, just store the same node ID everywhere. Problems arise
when you need to distribute the guest across more nodes than it
originally ran on.
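To make the collapsing case concrete, here is a toolstack-side sketch (hypothetical helper and sentinel value; the real XEN_NUMA_NO_NODE definition comes from the public memory.h this patch introduces). When the host has fewer pNUMA nodes than the guest has vNUMA nodes, several vnodes simply share a pnode; the sentinel is only needed when no single pnode can back a vnode:

```c
#include <assert.h>
#include <stdint.h>

#define XEN_NUMA_NO_NODE 0xFF  /* hypothetical value for illustration */

/* Hypothetical sketch: assign each vNUMA node a backing pNUMA node.
 * With nr_pnodes < nr_vnodes, vnodes wrap around and share pnodes
 * (the "2 vnodes on 1 pnode" case above maps both vnodes to pnode 0).
 * Only a host with no usable pnodes at all yields the sentinel. */
static void map_vnodes(uint8_t *vnode_to_pnode, unsigned int nr_vnodes,
                       unsigned int nr_pnodes)
{
    for ( unsigned int v = 0; v < nr_vnodes; v++ )
        vnode_to_pnode[v] = nr_pnodes ? (uint8_t)(v % nr_pnodes)
                                      : XEN_NUMA_NO_NODE;
}
```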

> In this case, a pnode of XEN_NUMA_NO_NODE seems like precisely the
> correct value to report.

As would the single node ID.

For the other case I just mentioned: distances between nodes
may vary, and hence it would still be better to have a way to
indicate which subset of nodes you'd like the allocations to come
from. Granted, this can't be represented by the current model, as
it would require node masks instead of node IDs. But (depending
on improvements to the page allocator) the tool stack could
still at least hint at what it wants by selecting two nodes
representing the maximum distance it is willing to allow. The
fallback in the page allocator should be tweaked anyway so that,
when it needs to allocate outside the initially specified node
mask, it doesn't blindly consider any node but tries the closest
one(s) first. I just added this to my todo list.
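As a sketch of that fallback (hypothetical function and a made-up SLIT-style distance table; not the actual page allocator code), the idea is to visit not-yet-tried nodes in order of increasing distance from the preferred node rather than in arbitrary order:

```c
#include <assert.h>

#define NR_NODES 4

/* Hypothetical SLIT-style distance table (smaller = closer). */
static const unsigned int distance[NR_NODES][NR_NODES] = {
    { 10, 20, 40, 40 },
    { 20, 10, 40, 40 },
    { 40, 40, 10, 20 },
    { 40, 40, 20, 10 },
};

/* Sketch of the tweaked fallback: given the node we wanted ("from")
 * and a bitmask of nodes already tried, pick the closest remaining
 * node instead of blindly considering any node. */
static int next_fallback(unsigned int from, unsigned long tried_mask)
{
    int best = -1;
    unsigned int best_dist = ~0u;

    for ( unsigned int n = 0; n < NR_NODES; n++ )
    {
        if ( tried_mask & (1UL << n) )
            continue;
        if ( distance[from][n] < best_dist )
        {
            best_dist = distance[from][n];
            best = (int)n;
        }
    }

    return best; /* -1 once every node has been tried */
}
```

Starting from node 0 with node 0 already tried, this visits node 1 (distance 20) before nodes 2 and 3 (distance 40).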

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

