
Re: [Xen-devel] Logical NUMA error during boot, and RFC patch



>>> On 27.06.12 at 21:10, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
> XenServer have recently acquired a quad-socket AMD Interlagos server and
> I have been playing around with it, and discovered a logical error in
> how Xen detects numa nodes.
> 
> The server has 8 NUMA nodes, 4 of which have memory attached (the even
> nodes - see SRAT.dsl attached).  This means that
> node_set_online(nodeid) gets called for each node with memory attached. 
> Later, in srat_detect_node(), node gets set to 0 if it was NUMA_NO_NODE,
> or if not node_online().  This leads to all the processors on the odd
> nodes being assigned to node 0, even though the odd nodes are present
> (see interlagos-xl-info-n.log)
> 
> I present an RFC patch which changes srat_detect_node() to call
> node_set_online() for each node, which appears to fix the logic.
> 
> Is this a sensible place to set the node online, or is there a better
> way to fix this logic?

While the place looks sensible, it risks adding bits to the
online map pretty late in the game.

As the memory-related invocations of node_set_online() come
out of numa_initmem_init()/acpi_scan_nodes(), perhaps the
(boot time) CPU-related ones should be done there too (I'd
still keep the adjustment you're already doing, to also cover
hotplug CPUs)?

Jan





 

