
RE: [PATCH v4 7/8] xen/x86: add detection of memory interleaves for different nodes


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Henry Wang <Henry.Wang@xxxxxxx>
  • Date: Mon, 30 May 2022 01:46:34 +0000
  • Accept-language: zh-CN, en-US
  • Cc: nd <nd@xxxxxxx>, Wei Chen <Wei.Chen@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Jiamei Xie <Jiamei.Xie@xxxxxxx>
  • Delivery-date: Mon, 30 May 2022 01:47:08 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi,

It seems that this patch is the only one in the series that has not yet been
properly acked and reviewed. So this is a gentle ping: is everyone happy with
this patch, or are there any further actions the author should take? Thanks!

Kind regards,
Henry

> -----Original Message-----
> Subject: [PATCH v4 7/8] xen/x86: add detection of memory interleaves for
> different nodes
> 
> One NUMA node may contain several memory blocks. In the current Xen
> code, Xen maintains a single memory range per node to cover all of
> that node's memory blocks. The problem is that if the gap between
> two of a node's memory blocks contains blocks that belong to other
> nodes (remote memory blocks), this node's memory range will be
> expanded to cover those remote memory blocks as well.
> 
> One node's memory range containing other nodes' memory is clearly
> unreasonable. It means the current NUMA code can only support nodes
> that have no interleaved memory blocks. However, on a physical
> machine, the addresses of multiple nodes can be interleaved.
> 
> So this patch adds code to detect memory interleaving between
> different nodes. NUMA initialization fails and error messages are
> printed when Xen detects such a hardware configuration.
> 
> Change-Id: Ia7ff9a9128ecbe3eb4dddd1307ae8fbe65575ccf
> Issue-Id: SCM-2240
> Signed-off-by: Wei Chen <wei.chen@xxxxxxx>
> Tested-by: Jiamei Xie <jiamei.xie@xxxxxxx>
> ---
> v3 -> v4:
> 1. Drop "ERR" prefix for enumeration, and remove init value.
> 2. Use "switch case" for enumeration, and add "default:"
> 3. Use "PXM" in log messages.
> 4. Use unsigned int for node memory block id.
> 5. Fix some code-style comments.
> 6. Use "nd->end" in node range expansion check.
> v2 -> v3:
> 1. Merge the check code from a separate function into
>    conflicting_memblks. This reduces the number of loops
>    over the node memory blocks.
> 2. Use an enumeration to indicate conflict check status.
> 3. Use a pointer to get conflict memory block id.
> v1 -> v2:
> 1. Update the description to say that what we're after is no
>    memory interleaving between different nodes.
> 2. Only update node range when it passes the interleave check.
> 3. Don't use full upper-case for "node".
> ---
>  xen/arch/x86/srat.c | 132 +++++++++++++++++++++++++++++++++-----------
>  1 file changed, 101 insertions(+), 31 deletions(-)
> 
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 8ffe43bdfe..a831df7648 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -42,6 +42,12 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
>  static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>  static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
> 
> +enum conflicts {
> +     NO_CONFLICT,
> +     OVERLAP,
> +     INTERLEAVE,
> +};
> +
>  static inline bool node_found(unsigned idx, unsigned pxm)
>  {
>       return ((pxm2node[idx].pxm == pxm) &&
> @@ -119,20 +125,45 @@ int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
>       return 0;
>  }
> 
> -static __init int conflicting_memblks(paddr_t start, paddr_t end)
> +static
> +enum conflicts __init conflicting_memblks(nodeid_t nid, paddr_t start,
> +                                       paddr_t end, paddr_t nd_start,
> +                                       paddr_t nd_end, unsigned int *mblkid)
>  {
> -     int i;
> +     unsigned int i;
> 
> +     /*
> +      * Scan all recorded nodes' memory blocks to check for conflicts:
> +      * overlap or interleave.
> +      */
>       for (i = 0; i < num_node_memblks; i++) {
>               struct node *nd = &node_memblk_range[i];
> +
> +             *mblkid = i;
> +
> +             /* Skip zero-size node memory blocks. */
>               if (nd->start == nd->end)
>                       continue;
> +             /*
> +              * Use the memblk range to check for memblk overlaps,
> +              * including the self-overlap case.
> +              */
>               if (nd->end > start && nd->start < end)
> -                     return i;
> +                     return OVERLAP;
>               if (nd->end == end && nd->start == start)
> -                     return i;
> +                     return OVERLAP;
> +             /*
> +              * Use the node memory range to check whether the new
> +              * range contains memory from other nodes - interleave
> +              * check. We only need to check the full-containment
> +              * case, because overlaps have been checked above.
> +              */
> +             if (nid != memblk_nodeid[i] &&
> +                 (nd_start < nd->start && nd->end < nd_end))
> +                     return INTERLEAVE;
>       }
> -     return -1;
> +
> +     return NO_CONFLICT;
>  }
> 
>  static __init void cutoff_node(int i, paddr_t start, paddr_t end)
> @@ -275,10 +306,13 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
>  void __init
>  acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>  {
> +     enum conflicts status;
> +     struct node *nd;
> +     paddr_t nd_start, nd_end;
>       paddr_t start, end;
>       unsigned pxm;
>       nodeid_t node;
> -     int i;
> +     unsigned int i;
> 
>       if (srat_disabled())
>               return;
> @@ -310,42 +344,78 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>               bad_srat();
>               return;
>       }
> +
> +     /*
> +      * For a node that already has some memory blocks, we
> +      * temporarily expand the node's memory range to check for
> +      * memory interleaving with other nodes. We do not use this
> +      * temporary range to check overlaps, because doing so would
> +      * mask overlaps within the same node.
> +      *
> +      * A node with 0 bytes of memory doesn't need this expansion.
> +      */
> +     nd_start = start;
> +     nd_end = end;
> +     nd = &nodes[node];
> +     if (nd->start != nd->end) {
> +             if (nd_start > nd->start)
> +                     nd_start = nd->start;
> +
> +             if (nd_end < nd->end)
> +                     nd_end = nd->end;
> +     }
> +
>       /* It is fine to add this area to the nodes data it will be used later*/
> -     i = conflicting_memblks(start, end);
> -     if (i < 0)
> -             /* everything fine */;
> -     else if (memblk_nodeid[i] == node) {
> -             bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
> -                             !test_bit(i, memblk_hotplug);
> -
> -             printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
> -                    mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
> -                    node_memblk_range[i].start, node_memblk_range[i].end);
> -             if (mismatch) {
> +     status = conflicting_memblks(node, start, end, nd_start, nd_end, &i);
> +     switch(status)
> +     {
> +     case OVERLAP:
> +     {
> +             if (memblk_nodeid[i] == node) {
> +                     bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
> +                                     !test_bit(i, memblk_hotplug);
> +
> +                     printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
> +                            mismatch ? KERN_ERR : KERN_WARNING, pxm, start,
> +                            end, node_memblk_range[i].start,
> +                            node_memblk_range[i].end);
> +                     if (mismatch) {
> +                             bad_srat();
> +                             return;
> +                     }
> +                     break;
> +             } else {
> +                     printk(KERN_ERR
> +                            "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
> +                            pxm, start, end, node_to_pxm(memblk_nodeid[i]),
> +                            node_memblk_range[i].start,
> +                            node_memblk_range[i].end);
>                       bad_srat();
>                       return;
>               }
> -     } else {
> +     }
> +
> +     case INTERLEAVE:
> +     {
>               printk(KERN_ERR
> -                    "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
> -                    pxm, start, end, node_to_pxm(memblk_nodeid[i]),
> +                    "SRAT: PXM %u: (%"PRIpaddr"-%"PRIpaddr") interleaves with PXM %u memblk (%"PRIpaddr"-%"PRIpaddr")\n",
> +                    pxm, nd_start, nd_end, node_to_pxm(memblk_nodeid[i]),
>                      node_memblk_range[i].start, node_memblk_range[i].end);
>               bad_srat();
>               return;
>       }
> -     if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
> -             struct node *nd = &nodes[node];
> 
> -             if (!node_test_and_set(node, memory_nodes_parsed)) {
> -                     nd->start = start;
> -                     nd->end = end;
> -             } else {
> -                     if (start < nd->start)
> -                             nd->start = start;
> -                     if (nd->end < end)
> -                             nd->end = end;
> -             }
> +     default:
> +             break;
> +     }
> +
> +     if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
> +             node_set(node, memory_nodes_parsed);
> +             nd->start = nd_start;
> +             nd->end = nd_end;
>       }
> +
>       printk(KERN_INFO "SRAT: Node %u PXM %u %"PRIpaddr"-%"PRIpaddr"%s\n",
>              node, pxm, start, end,
>              ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE ? " (hotplug)" : "");
> --
> 2.25.1
> 


 

