[PATCH v3 0/6] Device tree based NUMA support for Arm - Part#2
(The Arm device tree based NUMA support patch set contains 35
patches. To make things easier for reviewers, I have split them
into 3 parts:
1. Preparation. I have re-sorted the patch series and moved
   independent patches to the head of the series - merged in [1]
2. Move generically usable code from x86 to common - this series.
3. Add new code to support Arm.

This series only contains the patches of the second part. As the
whole NUMA series has already been through one round of review
in [2], this series is v3)

Xen's memory allocation and scheduler modules are NUMA aware.
But in practice, only x86 has implemented the architecture APIs
to support NUMA. Arm has been providing a set of fake architecture
APIs to stay compatible with the NUMA-aware memory allocation and
scheduler.

Arm systems worked well as single-node NUMA systems with these
fake APIs, because we didn't have multi-node NUMA systems on Arm.
But in recent years, more and more Arm devices support multi-node
NUMA.

So now we have a new problem. When Xen runs on these Arm devices,
it still treats them as single-node SMP systems. The NUMA affinity
capability of Xen's memory allocation and scheduler becomes
meaningless, because they rely on input data that does not reflect
the real NUMA layout.

Xen still thinks the access time to all of the memory is the same
for all CPUs. However, Xen may allocate memory to a VM from
different NUMA nodes with different access speeds. This difference
can be amplified by workloads inside the VM, causing performance
instability and timeouts.

So in this patch series, we implement a set of NUMA APIs that use
the device tree to describe the NUMA layout. We reuse most of the
x86 NUMA code to create and maintain the mapping between memory
and CPUs, and to create the matrix between any two NUMA nodes.
Except for ACPI and some x86-specific code, we have moved the
other code to common.
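
To make the memory-to-node mapping mentioned above more concrete, here
is a minimal standalone sketch. It is not the code from this series.
The names memnodemap, memnode_shift, nodeid_t and phys_to_nid come from
the changelog, but the table size (MAP_SLOTS), the helpers
memnodemap_init() and memnodemap_add(), and indexing directly by
physical address are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t nodeid_t;       /* node IDs fit in 8 bits in this sketch */

#define NUMA_NO_NODE 0xFF       /* marker: no node owns this slot */
#define MAP_SLOTS    64         /* hypothetical table size */

static uint8_t memnodemap[MAP_SLOTS];
static unsigned int memnode_shift;

/* Reset the table so that no slot has an owner (hypothetical helper). */
static void memnodemap_init(unsigned int shift)
{
    size_t i;

    memnode_shift = shift;
    for ( i = 0; i < MAP_SLOTS; i++ )
        memnodemap[i] = NUMA_NO_NODE;
}

/*
 * Record that [start, end) belongs to nid (hypothetical helper).
 * Returns -1 if the range is empty, does not fit in the table, or
 * shares a slot with a different node.
 */
static int memnodemap_add(uint64_t start, uint64_t end, nodeid_t nid)
{
    uint64_t i, first, last;

    if ( start >= end )
        return -1;

    first = start >> memnode_shift;
    last = (end - 1) >> memnode_shift;
    if ( last >= MAP_SLOTS )
        return -1;

    for ( i = first; i <= last; i++ )
    {
        if ( memnodemap[i] != NUMA_NO_NODE && memnodemap[i] != nid )
            return -1;
        memnodemap[i] = nid;
    }

    return 0;
}

/* Look up the node that owns a physical address. */
static nodeid_t phys_to_nid(uint64_t addr)
{
    uint64_t slot = addr >> memnode_shift;
    nodeid_t nid;

    /* Plain assertions stand in for the hypervisor's ASSERT here. */
    assert(slot < MAP_SLOTS);
    nid = memnodemap[slot];
    assert(nid != NUMA_NO_NODE);

    return nid;
}
```

With a shift of 30 each slot covers 1 GiB, so registering [0, 2 GiB)
for node 0 and [2 GiB, 4 GiB) for node 1 makes phys_to_nid() a single
array lookup for any address in those ranges.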
In the next stage, when we implement ACPI based NUMA for Arm64, we
may move the ACPI NUMA code to common too, but at the current
stage we keep it x86 only.

This patch series has been tested and boots well on one Arm64
NUMA machine and one HPE x86 NUMA machine.

[1] https://lists.xenproject.org/archives/html/xen-devel/2022-06/msg00499.html
[2] https://lists.xenproject.org/archives/html/xen-devel/2021-09/msg01903.html

---
v2 -> v3:
 1. Drop enumeration of numa status.
 2. Use helpers to get/update acpi_numa.
 3. Insert spaces among parameters of strncmp in numa_setup.
 4. Drop helpers to access mem_hotplug. Export mem_hotplug for all arch.
 5. Remove acpi.h from common/numa.c.
 6. Rename acpi_scan_nodes to numa_scan_nodes.
 7. Replace u8 by uint8_t for memnodemap.
 8. Use unsigned int for memnode_shift and adjust related functions
    (compute_hash_shift, populate_memnodemap) to use correct types for
    return values or parameters.
 9. Use nodeid_t for nodeid and node numbers.
10. Use __read_mostly and __ro_after_init for appropriate variables.
11. Adjust the __read_mostly and __initdata location for some variables.
12. Convert from plain int to unsigned for cpuid and other proper
13. Remove unnecessary change items in history.
14. Rename arch_get_memory_map to arch_get_ram_range.
15. Use -ENOENT instead of -ENODEV to indicate end of memory map.
16. Add description to code comment that arch_get_ram_range returns
    RAM range in [start, end) format.
17. Rename bad_srat to numa_fw_bad.
18. Rename node_to_pxm to numa_node_to_arch_nid.
19. Merge patch#7 and #8 into patch#6.
20. Move NR_NODE_MEMBLKS from x86/acpi.h to common/numa.h.
22. Use 2-64 for node range.

v1 -> v2:
 1. Refine the commit messages of several patches.
 2. Merge v1 patch#9,10 into one patch. Introduce the new functions
    in the same patch where they are first used.
 3. Fold if ( end > mem_hotplug ) into mem_hotplug_update_boundary;
    in this case, we can drop mem_hotplug_boundary.
 4. Remove fw_numa, use enumeration to replace numa_off and acpi_numa.
 5. Correct return value of srat_disabled.
 6. Introduce numa_enabled_with_firmware.
 7. Refine the justification of using !node_data[nid].node_spanned_pages.
 8. Use ASSERT to replace VIRTUAL_BUG_ON in phys_to_nid.
 9. Adjust the conditional expression for ASSERT.
10. Move MAX_NUMNODES from xen/numa.h to asm/numa.h for x86.
11. Use conditional macro to gate MAX_NUMNODES for other architectures.
12. Use arch_get_memory_map to replace arch_get_memory_bank_range
    and arch_get_memory_bank_number.
13. Remove the !start || !end check, because the caller guarantees
    these two pointers will not be NULL.
14. Add code comment for numa_update_node_memblks to explain:
    Assumes all memory regions belonging to a single node
    are in one chunk. Holes between them will be included
    in the node.
15. Use this single patch, instead of several patches, to move
    x86 SRAT code to common.
16. Export node_to_pxm to keep pxm information in NUMA scan
    nodes error messages.
17. Change the code style to the target file's Xen code style.
18. Adjust some __init and __initdata for some functions and
    variables.
19. Replace CONFIG_ACPI_NUMA by CONFIG_NUMA. Replace "SRAT" texts.
20. Turn numa_scan_nodes to static.
21. Change NR_NUMA_NODES upper bound from 4095 to 255.

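
Items 14 to 16 of the v2 -> v3 changelog describe the contract of
arch_get_ram_range: ranges are half-open, [start, end), and -ENOENT
marks the end of the memory map. The sketch below shows how a caller
could walk such an interface. The signature, the banks[] table and
total_ram_bytes() are assumptions made for illustration and are not
taken from the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint64_t paddr_t;

struct ram_bank {
    paddr_t base;
    paddr_t size;
};

/* Hypothetical firmware memory map with two RAM banks. */
static const struct ram_bank banks[] = {
    { 0x080000000ULL, 0x080000000ULL },   /* 2 GiB starting at 2 GiB */
    { 0x880000000ULL, 0x100000000ULL },   /* 4 GiB starting at 34 GiB */
};

/*
 * Assumed signature: fill in the idx-th RAM range as [start, end).
 * Returns 0 on success and -ENOENT once idx is past the last entry.
 */
static int arch_get_ram_range(unsigned int idx, paddr_t *start, paddr_t *end)
{
    if ( idx >= sizeof(banks) / sizeof(banks[0]) )
        return -ENOENT;

    *start = banks[idx].base;
    *end = banks[idx].base + banks[idx].size;   /* exclusive end */

    return 0;
}

/* Walk the map until -ENOENT and add up the size of every range. */
static paddr_t total_ram_bytes(void)
{
    paddr_t start, end, total = 0;
    unsigned int idx;

    for ( idx = 0; ; idx++ )
    {
        int rc = arch_get_ram_range(idx, &start, &end);

        if ( rc == -ENOENT )
            break;              /* end of the memory map */
        if ( rc )
            continue;           /* any other error: skip this entry */

        assert(end > start);    /* a half-open range must not be empty */
        total += end - start;   /* no "+ 1" because end is exclusive */
    }

    return total;
}
```

Using a distinct error code for "no more entries" lets the caller tell
the end of the map apart from an entry it should merely skip.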
Wei Chen (6):
  xen/x86: Provide helpers for common code to access acpi_numa
  xen/x86: move generically usable NUMA code from x86 to common
  xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid
  xen/x86: use arch_get_ram_range to get information from E820 map
  xen/x86: move NUMA scan nodes codes from x86 to common
  xen: introduce a Kconfig option to configure NUMA nodes number

 xen/arch/Kconfig                 |  11 +
 xen/arch/x86/include/asm/acpi.h  |   2 -
 xen/arch/x86/include/asm/mm.h    |   2 -
 xen/arch/x86/include/asm/numa.h  |  61 +--
 xen/arch/x86/include/asm/setup.h |   1 -
 xen/arch/x86/mm.c                |   2 -
 xen/arch/x86/numa.c              | 448 ++----------------
 xen/arch/x86/smpboot.c           |   2 +-
 xen/arch/x86/srat.c              | 313 ++-----------
 xen/common/Makefile              |   1 +
 xen/common/numa.c                | 757 +++++++++++++++++++++++++++++++
 xen/common/page_alloc.c          |   2 +
 xen/include/xen/mm.h             |   2 +
 xen/include/xen/numa.h           |  87 +++-
 14 files changed, 916 insertions(+), 775 deletions(-)
 create mode 100644 xen/common/numa.c

--
2.25.1