[Xen-devel] [PATCH] tools: avoid over-commitment if numa=on
Jan Beulich wrote:
>>>> Andre Przywara <andre.przywara@xxxxxxx> 09.11.09 16:02 >>>
>> BTW: Shouldn't we finally set numa=on as the default value?
> I'd say no, at least until the default confinement of a guest to a single
> node gets fixed to properly deal with guests having more vCPU-s than a
> node's worth of pCPU-s (i.e. I take it for granted that the benefits of
> not overcommitting CPUs outweigh the drawbacks of cross-node memory
> accesses, at the very least for CPU-bound workloads).

That sounds reasonable.
Attached is a patch that lifts the restriction to one node per guest if the
number of VCPUs is greater than the number of cores per node. This isn't
optimal (the best way would be to inform the guest about it, but that is
another patchset ;-), but it should address the above concern.

Please apply,
Andre.

Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448 3567 12
----to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Andrew Bowd; Thomas M. McCoy; Giuliano Meroni
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632

# HG changeset patch
# User Andre Przywara <andre.przywara@xxxxxxx>
# Date 1259594006 -3600
# Node ID bdf4109edffbcc0cbac605a19d2fd7a7459f1117
# Parent  abc6183f486e66b5721dbf0313ee0d3460613a99
allocate enough NUMA nodes for all VCPUs

If numa=on, we constrain a guest to one node to keep its memory accesses
local. This hurts performance if the number of VCPUs is greater than the
number of cores per node. We now detect this case and allocate additional
NUMA nodes so that all VCPUs can run simultaneously.

Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>

diff -r abc6183f486e -r bdf4109edffb tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py	Mon Nov 30 10:58:23 2009 +0000
+++ b/tools/python/xen/xend/XendDomainInfo.py	Mon Nov 30 16:13:26 2009 +0100
@@ -2637,8 +2637,7 @@
                         nodeload[i] = int(nodeload[i] * 16 / len(info['node_to_cpu'][i]))
                     else:
                         nodeload[i] = sys.maxint
-                index = nodeload.index( min(nodeload) )
-                return index
+                return map(lambda x: x[0], sorted(enumerate(nodeload), key=lambda x:x[1]))
 
             info = xc.physinfo()
             if info['nr_nodes'] > 1:
@@ -2648,8 +2647,15 @@
                 for i in range(0, info['nr_nodes']):
                     if node_memory_list[i] >= needmem and len(info['node_to_cpu'][i]) > 0:
                         candidate_node_list.append(i)
-                index = find_relaxed_node(candidate_node_list)
-                cpumask = info['node_to_cpu'][index]
+                best_node = find_relaxed_node(candidate_node_list)[0]
+                cpumask = info['node_to_cpu'][best_node]
+                cores_per_node = info['nr_cpus'] / info['nr_nodes']
+                nodes_required = (self.info['VCPUs_max'] + cores_per_node - 1) / cores_per_node
+                if nodes_required > 1:
+                    log.debug("allocating %d NUMA nodes", nodes_required)
+                    best_nodes = find_relaxed_node(filter(lambda x: x != best_node, range(0,info['nr_nodes'])))
+                    for i in best_nodes[:nodes_required - 1]:
+                        cpumask = cpumask + info['node_to_cpu'][i]
                 for v in range(0, self.info['VCPUs_max']):
                     xc.vcpu_setaffinity(self.domid, v, cpumask)
         return index
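For illustration, here is a minimal standalone sketch of the selection policy
the patch implements: ceiling-divide the VCPU count by the cores per node,
then pull CPUs from the least-loaded nodes until enough cores are covered.
The topology data and helper names below are made-up assumptions for the
example; this is not the Xend/xc API and not part of the patch itself.

# Standalone sketch of the node-selection policy above, with invented
# topology data instead of the real xc.physinfo() output.
def nodes_needed(num_vcpus, cores_per_node):
    # Ceiling division: how many nodes so that every VCPU gets its own core.
    return (num_vcpus + cores_per_node - 1) // cores_per_node

def build_cpumask(num_vcpus, node_to_cpu, nodeload):
    # node_to_cpu: CPU list per node; nodeload: relative load per node.
    cores_per_node = sum(len(c) for c in node_to_cpu) // len(node_to_cpu)
    required = nodes_needed(num_vcpus, cores_per_node)
    # Sort node indices by load, least loaded first (what the patched
    # find_relaxed_node() returns), and take as many nodes as required.
    best = sorted(range(len(node_to_cpu)), key=lambda n: nodeload[n])[:required]
    cpumask = []
    for n in best:
        cpumask += node_to_cpu[n]
    return best, cpumask

if __name__ == '__main__':
    # Hypothetical 4-node box, 4 cores per node, guest with 6 VCPUs.
    node_to_cpu = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
    nodeload = [8, 2, 5, 0]
    print(build_cpumask(6, node_to_cpu, nodeload))
    # -> ([3, 1], [12, 13, 14, 15, 4, 5, 6, 7]): the two least-loaded nodes.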