RE: [XEN RFC PATCH 23/40] xen/arm: introduce a helper to parse device tree memory node
Hi Julien,

> -----Original Message-----
> From: Julien Grall <julien@xxxxxxx>
> Sent: 2021年8月28日 18:34
> To: Wei Chen <Wei.Chen@xxxxxxx>; Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Bertrand Marquis <Bertrand.Marquis@xxxxxxx>; Jan Beulich <jbeulich@xxxxxxxx>
> Subject: Re: [XEN RFC PATCH 23/40] xen/arm: introduce a helper to parse device tree memory node
>
> Hi Wei,
>
> On 28/08/2021 04:56, Wei Chen wrote:
> >> -----Original Message-----
> >> From: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> >> Sent: 2021年8月28日 9:06
> >> To: Wei Chen <Wei.Chen@xxxxxxx>
> >> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; sstabellini@xxxxxxxxxx; julien@xxxxxxx; jbeulich@xxxxxxxx; Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
> >> Subject: Re: [XEN RFC PATCH 23/40] xen/arm: introduce a helper to parse device tree memory node
> >>
> >> On Wed, 11 Aug 2021, Wei Chen wrote:
> >>> Memory blocks' NUMA ID information is stored in the device tree's
> >>> memory nodes as "numa-node-id". We need a new helper to parse
> >>> and verify this ID from memory nodes.
> >>>
> >>> In order to support memory affinity in later use, the valid
> >>> memory ranges and NUMA IDs will be saved to tables.
> >>>
> >>> Signed-off-by: Wei Chen <wei.chen@xxxxxxx>
> >>> ---
> >>>  xen/arch/arm/numa_device_tree.c | 130 ++++++++++++++++++++++++++++++++
> >>>  1 file changed, 130 insertions(+)
> >>>
> >>> diff --git a/xen/arch/arm/numa_device_tree.c b/xen/arch/arm/numa_device_tree.c
> >>> index 37cc56acf3..bbe081dcd1 100644
> >>> --- a/xen/arch/arm/numa_device_tree.c
> >>> +++ b/xen/arch/arm/numa_device_tree.c
> >>> @@ -20,11 +20,13 @@
> >>>  #include <xen/init.h>
> >>>  #include <xen/nodemask.h>
> >>>  #include <xen/numa.h>
> >>> +#include <xen/libfdt/libfdt.h>
> >>>  #include <xen/device_tree.h>
> >>>  #include <asm/setup.h>
> >>>
> >>>  s8 device_tree_numa = 0;
> >>>  static nodemask_t processor_nodes_parsed __initdata;
> >>> +static nodemask_t memory_nodes_parsed __initdata;
> >>>
> >>>  static int srat_disabled(void)
> >>>  {
> >>> @@ -55,6 +57,79 @@ static int __init dtb_numa_processor_affinity_init(nodeid_t node)
> >>>      return 0;
> >>>  }
> >>>
> >>> +/* Callback for parsing of the memory regions affinity */
> >>> +static int __init dtb_numa_memory_affinity_init(nodeid_t node,
> >>> +                paddr_t start, paddr_t size)
> >>> +{
> >>> +    struct node *nd;
> >>> +    paddr_t end;
> >>> +    int i;
> >>> +
> >>> +    if ( srat_disabled() )
> >>> +        return -EINVAL;
> >>> +
> >>> +    end = start + size;
> >>> +    if ( num_node_memblks >= NR_NODE_MEMBLKS )
> >>> +    {
> >>> +        dprintk(XENLOG_WARNING,
> >>> +                "Too many NUMA entries, try bigger NR_NODE_MEMBLKS\n");
> >>> +        bad_srat();
> >>> +        return -EINVAL;
> >>> +    }
> >>> +
> >>> +    /* It is fine to add this area to the nodes data; it will be used later */
> >>> +    i = conflicting_memblks(start, end);
> >>> +    /* No conflicting memory block, we can save it for later usage */
> >>> +    if ( i < 0 )
> >>> +        goto save_memblk;
> >>> +
> >>> +    if ( memblk_nodeid[i] == node ) {
> >>> +        /*
> >>> +         * Overlaps with another memblk in the same node; warn here.
> >>> +         * This memblk will be merged with the conflicting memblk later.
> >>> +         */
> >>> +        printk(XENLOG_WARNING
> >>> +               "DT: NUMA NODE %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
> >>> +               node, start, end,
> >>> +               node_memblk_range[i].start, node_memblk_range[i].end);
> >>> +    } else {
> >>> +        /*
> >>> +         * Conflicts with a memblk in another node; this is an error.
> >>> +         * The NUMA information is invalid; NUMA will be turned off.
> >>> +         */
> >>> +        printk(XENLOG_ERR
> >>> +               "DT: NUMA NODE %u (%"PRIx64"-%"PRIx64") overlaps with NODE %u (%"PRIx64"-%"PRIx64")\n",
> >>> +               node, start, end, memblk_nodeid[i],
> >>> +               node_memblk_range[i].start, node_memblk_range[i].end);
> >>> +        bad_srat();
> >>> +        return -EINVAL;
> >>> +    }
> >>> +
> >>> +save_memblk:
> >>> +    nd = &nodes[node];
> >>> +    if ( !node_test_and_set(node, memory_nodes_parsed) ) {
> >>> +        nd->start = start;
> >>> +        nd->end = end;
> >>> +    } else {
> >>> +        if ( start < nd->start )
> >>> +            nd->start = start;
> >>> +        if ( nd->end < end )
> >>> +            nd->end = end;
> >>> +    }
> >>> +
> >>> +    printk(XENLOG_INFO "DT: NUMA node %u %"PRIx64"-%"PRIx64"\n",
> >>> +           node, start, end);
> >>> +
> >>> +    node_memblk_range[num_node_memblks].start = start;
> >>> +    node_memblk_range[num_node_memblks].end = end;
> >>> +    memblk_nodeid[num_node_memblks] = node;
> >>> +    num_node_memblks++;
> >>
> >>
> >> Is it possible to have non-contiguous ranges of memory for a single NUMA
> >> node?
> >>
> >> Looking at the DT bindings and the Linux implementation, it seems possible.
> >> Here, it seems that node_memblk_range/memblk_nodeid could handle it,
> >> but nodes couldn't.
> >
> > Yes, you're right. I copied this code from x86 ACPI NUMA. Does ACPI allow
> > non-contiguous ranges of memory for a single NUMA node too?
>
> I couldn't find any restriction for ACPI, although I only briefly
> looked at the spec.
>
> > If yes, I think
> > this will affect x86 ACPI NUMA too. In the next version, we plan to merge
> > dtb_numa_memory_affinity_init and acpi_numa_memory_affinity_init into a
> > neutral function, so we can fix them at the same time.
> >
> > If not, maybe we have to keep the diversity for dtb and ACPI here.
>
> I am not entirely sure what you mean. Are you saying that if ACPI doesn't
> allow non-contiguous ranges of memory, then we should keep the
> implementations separate?
>
> If so, then I disagree with that. It is fine to have code that supports
> more than what a firmware table supports. The main benefit is less code
> and therefore less long-term maintenance (with the current solution we
> would need to check both the ACPI and DT implementations if there is a
> bug in one).

Yes, I agree.

> Cheers,
>
> --
> Julien Grall
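
As background on what the patch is parsing: the device tree NUMA binding attaches a "numa-node-id" property to each memory node. Below is a minimal standalone sketch of reading it with stock libfdt; this is illustrative only, not the patch's actual helper. It assumes #address-cells = 2 and #size-cells = 2 (the usual 64-bit Arm layout) and, for simplicity, treats a missing "numa-node-id" as node 0.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <libfdt.h>

    /*
     * Walk every node of a flattened device tree and, for each node whose
     * device_type is "memory", print its first "reg" range together with
     * the NUMA node ID from the optional "numa-node-id" property.
     */
    static void dump_memory_nodes(const void *fdt)
    {
        int node, len;

        for ( node = fdt_next_node(fdt, -1, NULL);
              node >= 0;
              node = fdt_next_node(fdt, node, NULL) )
        {
            const char *type = fdt_getprop(fdt, node, "device_type", &len);
            const fdt64_t *reg;
            const fdt32_t *nid;
            uint32_t numa_node = 0; /* Simplification: absent => node 0. */

            if ( !type || strcmp(type, "memory") )
                continue;

            /* With 2 address cells and 2 size cells, one (base, size)
             * pair occupies 16 bytes of the "reg" property. */
            reg = fdt_getprop(fdt, node, "reg", &len);
            if ( !reg || len < (int)(2 * sizeof(uint64_t)) )
                continue; /* No usable (base, size) pair. */

            nid = fdt_getprop(fdt, node, "numa-node-id", NULL);
            if ( nid )
                numa_node = fdt32_to_cpu(*nid);

            printf("memory: base=%#llx size=%#llx numa-node-id=%u\n",
                   (unsigned long long)fdt64_to_cpu(reg[0]),
                   (unsigned long long)fdt64_to_cpu(reg[1]),
                   numa_node);
        }
    }

Call dump_memory_nodes() with a pointer to any valid FDT blob (e.g. one loaded from a .dtb file) to list the memory ranges and their node IDs.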
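The hunk calls conflicting_memblks() without showing it (it comes from the existing NUMA code that the patch reuses). A plausible shape for such an overlap check, assuming half-open [start, end) ranges, is the sketch below; it is not the actual Xen implementation.

    #include <stdint.h>

    #define NR_NODE_MEMBLKS 128

    struct memblk_range { uint64_t start, end; };

    static struct memblk_range node_memblk_range[NR_NODE_MEMBLKS];
    static unsigned int num_node_memblks;

    /*
     * Return the index of a recorded memblk that overlaps [start, end),
     * or -1 if there is no conflict. Two half-open ranges overlap iff
     * each one starts before the other one ends.
     */
    static int conflicting_memblks(uint64_t start, uint64_t end)
    {
        unsigned int i;

        for ( i = 0; i < num_node_memblks; i++ )
            if ( start < node_memblk_range[i].end &&
                 node_memblk_range[i].start < end )
                return (int)i;

        return -1;
    }

With one block recorded at [0x80000000, 0xc0000000), probing [0xa0000000, 0xe0000000) returns index 0 (overlap), while probing [0xc0000000, 0xe0000000) returns -1, since the ranges merely touch.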
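Finally, to make the non-contiguous-range concern above concrete: the per-memblk arrays record each range exactly, but a single per-node (start, end) span can only widen, so a hole between two disjoint ranges of the same node silently ends up inside the span. A toy illustration follows, using simplified stand-in types rather than the actual Xen structures; record_memblk() mirrors the node_test_and_set() merging logic in the patch.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_NODES 4
    #define NR_BLKS   8

    struct range { uint64_t start, end; };

    static struct range blk_range[NR_BLKS];   /* one entry per memblk */
    static unsigned int blk_nodeid[NR_BLKS];
    static unsigned int num_blks;

    static struct range node_span[MAX_NODES]; /* one span per node */
    static bool node_seen[MAX_NODES];

    static void record_memblk(unsigned int node, uint64_t start, uint64_t end)
    {
        /* The memblk arrays keep every range exactly as reported... */
        blk_range[num_blks] = (struct range){ start, end };
        blk_nodeid[num_blks] = node;
        num_blks++;

        /* ...but the per-node span can only widen. */
        if ( !node_seen[node] )
        {
            node_seen[node] = true;
            node_span[node] = (struct range){ start, end };
        }
        else
        {
            if ( start < node_span[node].start )
                node_span[node].start = start;
            if ( end > node_span[node].end )
                node_span[node].end = end;
        }
    }

    int main(void)
    {
        /* Node 0 owns two disjoint ranges with a 1GiB hole in between. */
        record_memblk(0, 0x000000000ULL, 0x040000000ULL); /* 0G..1G */
        record_memblk(0, 0x080000000ULL, 0x0c0000000ULL); /* 2G..3G */

        /* Prints 0-c0000000: the hole 1G..2G now looks like node 0 memory. */
        printf("node 0 span: %llx-%llx\n",
               (unsigned long long)node_span[0].start,
               (unsigned long long)node_span[0].end);
        return 0;
    }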