[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4] NUMA: Introduce NODE_DATA->node_present_pages(RAM pages)
On 29.10.2024 16:53, Jan Beulich wrote: > On 27.10.2024 15:43, Bernhard Kaindl wrote: >> From: Bernhard Kaindl <bernhard.kaindl@xxxxxxxxx> >> >> At the moment, Xen keeps track of the spans of PFNs of the NUMA nodes. >> But the PFN span sometimes includes large MMIO holes, so these values >> might not be an exact representation of the total usable RAM of nodes. >> >> Xen does not need it, but the size of the NUMA node's memory can be >> helpful for management tools and HW information tools like hwloc/lstopo >> with its Xen backend for Dom0: https://github.com/xenserver-next/hwloc/ >> >> First, introduce NODE_DATA(nodeid)->node_present_pages to node_data[], >> determine the sum of usable PFNs at boot and update them on memory_add(). >> >> (The Linux kernel handles NODE_DATA->node_present_pages likewise) >> >> Signed-off-by: Bernhard Kaindl <bernhard.kaindl@xxxxxxxxx> >> --- >> Changes in v3: >> - Use PFN_UP/DOWN, refactored further to simplify the code while leaving >> compiler-level optimisations to the compiler's optimisation passes. >> Changes in v4: >> - Refactored code and doxygen documentation according to the review. >> --- >> xen/arch/x86/numa.c | 13 +++++++++++++ >> xen/arch/x86/x86_64/mm.c | 3 +++ >> xen/common/numa.c | 36 +++++++++++++++++++++++++++++++++--- >> xen/include/xen/numa.h | 21 +++++++++++++++++++++ >> 4 files changed, 70 insertions(+), 3 deletions(-) >> >> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c >> index 4b0b297c7e..3c0574f773 100644 >> --- a/xen/arch/x86/numa.c >> +++ b/xen/arch/x86/numa.c >> @@ -100,6 +100,19 @@ unsigned int __init arch_get_dma_bitsize(void) >> + PAGE_SHIFT, 32); >> } >> >> +/** >> + * @brief Retrieves the RAM range for a given index from the e820 memory >> map. >> + * >> + * This function fetches the start and end address (exclusive) of a RAM >> range >> + * specified by the given index idx from the e820 memory map. > > I think the use of (exclusive) here leaves room for ambiguity (as it may, > unusually, apply to start as well then). Imo it would better be put ... > >> + * @param idx The index of the RAM range in the e820 memory map to retrieve. >> + * @param start Pointer to store the start address of the RAM range. >> + * @param end Pointer to store the end address of the RAM range. > > ... here, just like you have it ... > >> + * @return 0 on success, -ENOENT if the index is out of bounds, >> + * or -ENODATA if the memory map at index idx is not of type >> E820_RAM. >> + */ >> int __init arch_get_ram_range(unsigned int idx, paddr_t *start, paddr_t >> *end) >> { >> if ( idx >= e820.nr_map ) >> --- a/xen/common/numa.c >> +++ b/xen/common/numa.c >> @@ -4,6 +4,7 @@ >> * Adapted for Xen: Ryan Harper <ryanh@xxxxxxxxxx> >> */ >> >> +#include "xen/pfn.h" >> #include <xen/init.h> >> #include <xen/keyhandler.h> >> #include <xen/mm.h> >> @@ -499,15 +500,44 @@ int __init compute_hash_shift(const struct node *nodes, >> return shift; >> } >> >> -/* Initialize NODE_DATA given nodeid and start/end */ >> +/** >> + * @brief Initialize a NUMA node's node_data structure at boot. >> + * >> + * It is given the NUMA node's index in the node_data array as well >> + * as the start and exclusive end address of the node's memory span >> + * as arguments and initializes the node_data entry with this information. >> + * >> + * It then initializes the total number of usable memory pages within >> + * the NUMA node's memory span using the arch_get_ram_range() function. >> + * >> + * @param nodeid The index into the node_data array for the node. >> + * @param start The starting physical address of the node's memory range. >> + * @param end The exclusive ending physical address of the node's memory >> range. > > ... here. > >> + */ >> void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end) >> { >> unsigned long start_pfn = paddr_to_pfn(start); >> unsigned long end_pfn = paddr_to_pfn(end); >> + struct node_data *numa_node = NODE_DATA(nodeid); >> + paddr_t start_ram, end_ram; >> + unsigned int idx = 0; >> + unsigned long *pages = &numa_node->node_present_pages; >> >> - NODE_DATA(nodeid)->node_start_pfn = start_pfn; >> - NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn; >> + numa_node->node_start_pfn = start_pfn; >> + numa_node->node_spanned_pages = end_pfn - start_pfn; >> + >> + /* Calculate the number of present RAM pages within the node: */ >> + *pages = 0; >> + do { >> + int err = arch_get_ram_range(idx++, &start_ram, &end_ram); >> + >> + if (err == -ENOENT) >> + break; >> + if ( err || start_ram >= end || end_ram <= start ) >> + continue; /* range is outside of the node, or not usable RAM */ >> >> + *pages += PFN_DOWN(min(end_ram, end)) - PFN_UP(max(start_ram, >> start)); >> + } while (1); > > Nit: While we have ample bad examples, I think even in such while() uses style > ought to be followed (i.e. "while ( 1 )"). Personally, since this looks a > little > odd to me, I generally prefer "for ( ; ; )" in such cases. > > With respective adjustments (which I'm happy to make while committing, so long > as you agree): Ah, no, I take that back. Alejandro's comments also want addressing, one way or another. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |