[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH v3 5/6] xen/x86: move NUMA scan nodes codes from x86 to common


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Wei Chen <Wei.Chen@xxxxxxx>
  • Date: Mon, 29 Aug 2022 10:47:09 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=lists.xenproject.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com])
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none
  • Arc-message-signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=O41ZtzFsUawgbIUV0P7gtt3+nqLxV1hCpCvyF7lDXNQ=; b=L2rC+wXTbmULO02NKL9v1diBv/kww37YOP7rdtNCv0OhIp46yRwyW3CF6QjVuXOYI+bZdynFHCEUSvtyroIjKIfjRQoPkKY1T3TFbH7sUlaHFLhdnMiWb8IObHJOm51nZMpzZGgcP9EmetGYKcR7bAaDbCqagMchAZYdQSMzjcIbUvjXIcpQVX1jytaWp83SFNyYidMCMeMU+ZFC1mmOf49s6sB/AJrm5HpNGG7+uB4OBJerQ/pr9t9wi2kVpErAwaR/R2EztFJHEwOuMpsKmv3GL8mHmLva0Jj4Lf3eNdpmo57FnTg8KNDJQZuT+vN7inj4YhU3zNmOzLaLgkDD5w==
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=O41ZtzFsUawgbIUV0P7gtt3+nqLxV1hCpCvyF7lDXNQ=; b=gDcMAF/HZWI2p7C6nrg7WDU3NlWWyz/8Q62J3uIuThUbaZor7BtpkAKRG4cHojiTZvRzk5N54NuC7IQBsV3sj/ZArvCa3LJzwCgI6bR5ctqVQHmLcC+RxU40ntPw0VJ4VtPK8ch2fTg73duPdAc84mkD0UyZ4lgIYwCUZ7GEXwgAeXT4gECZ8Eq0eJl9jjVuXUqJVS38Nlbflxfw06XxCFru9Ke+yHLr7YgG35vBkmDTbC/yFiwVsmXLzWQe0eyQ5iNzAI5NA6owwEoCbxANl7GzzpREQPsmh2WVIiMN2Hg0xaW/SgjPCp6K1semvlBlvtc0kMsW+LD3y4Frihepxw==
  • Arc-seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=Y69ZOO6UzX1wu5zsiZHWH/SqsS0vTBGHCdokKDEUIIyNLDZi2GAQ8fzC8KzIK/TZnh4C+9Rz3WmJi4Nl3cOyYmljte+MeVODvpEjCFipnCuDODHuSGuGfgzDVdGYbjgv1Gbn5xvonwNARlxnGnJlg+ajhYpCGLHckOSaU37PXIfLLiWryvYgOdB+euO9hcV0/hMkNoM9Ch7mlhcijVEW+DWqsOcALQAMoYhLpvM4X6LFYnyOBt1xti/JwgmOffrrciBoynH8eDuHkQRdMI+3RDHu5I1iZI7gxn8d24WqHPuioz3/R0QlwU+xvxZ6GEh3xijYeu/YpLG02AfVihqnNw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=bPvtv7X9GBHgF4b/2tjN51zL6ynKzStlG8jb5SsknlIMj/Ei9kGWQ00uFA/QbFQOIFO4YpGc+r9KWqffhhEr1zf0C31CUZUctEMUgTDPbc/1XY/c8WXau63jvC0Q4P2OoCHN8J7ETfsJOA8BH8WRpCN/wE8OhWV2UdflDQQo2HJgFQ0WUraL7SFRfa6QVBV0KZItEJIzRE6X+jl86hAg5/e4DON0+bP4xwPyRCwGHaybNaDbCoh5ffFkJx4oHvG55cUWJA9Hcrt4qh4NGoW0ydLA/N2P6/F/4nV4FE4AuxtlIGsgN15mJLH2DI/jZUoDs8HwgiOpaxMNdbEpEm4x1Q==
  • Authentication-results-original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Cc: nd <nd@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 29 Aug 2022 10:47:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Original-authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Thread-index: AQHYtdM2Eqjqd47WVk6sRlJr0wlVJa2/lskAgAYfE5A=
  • Thread-topic: [PATCH v3 5/6] xen/x86: move NUMA scan nodes codes from x86 to common

Hi Jan,

> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 2022年8月25日 20:50
> To: Wei Chen <Wei.Chen@xxxxxxx>
> Cc: nd <nd@xxxxxxx>; Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Roger Pau
> Monné <roger.pau@xxxxxxxxxx>; Wei Liu <wl@xxxxxxx>; George Dunlap
> <george.dunlap@xxxxxxxxxx>; Julien Grall <julien@xxxxxxx>; Stefano
> Stabellini <sstabellini@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v3 5/6] xen/x86: move NUMA scan nodes codes from x86
> to common
> 
> On 22.08.2022 04:58, Wei Chen wrote:
> > --- a/xen/common/numa.c
> > +++ b/xen/common/numa.c
> > @@ -13,6 +13,21 @@
> >  #include <xen/sched.h>
> >  #include <xen/softirq.h>
> >
> > +static nodemask_t __initdata processor_nodes_parsed;
> > +static nodemask_t __initdata memory_nodes_parsed;
> > +static struct node __initdata nodes[MAX_NUMNODES];
> > +
> > +static int __ro_after_init num_node_memblks;
> 
> unsigned int?
> 

Yes, I will fix it.

> > @@ -36,6 +51,308 @@ bool numa_disabled(void)
> >      return numa_off || arch_numa_disabled(false);
> >  }
> >
> > +void __init numa_set_processor_nodes_parsed(nodeid_t node)
> > +{
> > +    node_set(node, processor_nodes_parsed);
> > +}
> > +
> > +unsigned int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
> 
> bool (and then true/false below)?
> 

Ok.

> > +{
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < num_node_memblks; i++ )
> > +    {
> > +        struct node *nd = &node_memblk_range[i];
> 
> const?
> 
> > +        if ( nd->start <= start && nd->end >= end &&
> > +             memblk_nodeid[i] == node )
> > +            return 1;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +static
> > +enum conflicts __init conflicting_memblks(nodeid_t nid, paddr_t start,
> 
> May I ask that you re-flow this to either
> 
> static enum conflicts __init
> conflicting_memblks(nodeid_t nid, paddr_t start,
> 
> or
> 
> static enum conflicts __init conflicting_memblks(
>     nodeid_t nid, paddr_t start,
> 
> ?
> 

Ok.

> > +                                          paddr_t end, paddr_t nd_start,
> > +                                          paddr_t nd_end, unsigned int
> *mblkid)
> > +{
> > +    unsigned int i;
> > +
> > +    /*
> > +     * Scan all recorded nodes' memory blocks to check conflicts:
> > +     * Overlap or interleave.
> > +     */
> > +    for ( i = 0; i < num_node_memblks; i++ )
> > +    {
> > +        struct node *nd = &node_memblk_range[i];
> 
> const?
> 

Ack.

> > +bool __init numa_memblks_available(void)
> > +{
> > +    return num_node_memblks < NR_NODE_MEMBLKS;
> > +}
> 
> This is kind of clumsy, but I have no better suggestion.
> 

Did you mean the whole function or just the name?

> > +/*
> > + * This function will be called by NUMA memory affinity initialization
> to
> > + * update NUMA node's memory range. In this function, we assume all
> memory
> > + * regions belonging to a single node are in one chunk. Holes (or MMIO
> > + * ranges) between them will be included in the node.
> > + *
> > + * So in numa_update_node_memblks, if there are multiple banks for each
> > + * node, start and end are stretched to cover the holes between them,
> and
> > + * it works as long as memory banks of different NUMA nodes don't
> interleave.
> > + */
> > +int __init numa_update_node_memblks(nodeid_t node, unsigned int
> arch_nid,
> 
> The function only ever returns 0 or -EINVAL - please consider switching
> to "bool" return type.

Ok.

> 
> > +                                    paddr_t start, paddr_t size,
> > +                                    const char *prefix,
> > +                                    bool hotplug)
> > +{
> > +    unsigned int i;
> > +    paddr_t end = start + size;
> > +    paddr_t nd_start = start;
> > +    paddr_t nd_end = end;
> > +    struct node *nd = &nodes[node];
> > +
> > +    /*
> > +     * For the node that already has some memory blocks, we will
> > +     * expand the node memory range temporarily to check memory
> > +     * interleaves with other nodes. We will not use this node
> > +     * temp memory range to check overlaps, because it will mask
> > +     * the overlaps in same node.
> > +     *
> > +     * Node with 0 bytes memory doesn't need this expandsion.
> > +     */
> > +    if ( nd->start != nd->end )
> > +    {
> > +        if ( nd_start > nd->start )
> > +            nd_start = nd->start;
> > +
> > +        if ( nd_end < nd->end )
> > +            nd_end = nd->end;
> > +    }
> > +
> > +    /* It is fine to add this area to the nodes data it will be used
> later*/
> 
> Please adjust style here.
> 

Ok.

> > +    switch ( conflicting_memblks(node, start, end, nd_start, nd_end,
> &i) )
> > +    {
> > +    case OVERLAP:
> > +        if ( memblk_nodeid[i] == node )
> > +        {
> > +            bool mismatch = !(hotplug) != !test_bit(i, memblk_hotplug);
> > +
> > +            printk("%sNUMA: %s %u [%"PRIpaddr", %"PRIpaddr"] overlaps
> with itself [%"PRIpaddr", %"PRIpaddr"]\n",
> > +                   mismatch ? KERN_ERR : KERN_WARNING, prefix,
> > +                   arch_nid, start, end - 1,
> > +                   node_memblk_range[i].start, node_memblk_range[i].end
> - 1);
> > +            if ( mismatch )
> > +                return -EINVAL;
> > +            break;
> > +        }
> > +
> > +        printk(KERN_ERR
> > +               "NUMA: %s %u [%"PRIpaddr", %"PRIpaddr"] overlaps
> with %s %u [%"PRIpaddr", %"PRIpaddr"]\n",
> > +               prefix, arch_nid, start, end - 1, prefix,
> > +               numa_node_to_arch_nid(memblk_nodeid[i]),
> > +               node_memblk_range[i].start, node_memblk_range[i].end -
> 1);
> > +        return -EINVAL;
> > +
> > +
> > +    case INTERLEAVE:
> > +        printk(KERN_ERR
> > +               "NUMA: %s %u: [%"PRIpaddr", %"PRIpaddr"] interleaves
> with %s %u memblk [%"PRIpaddr", %"PRIpaddr"]\n",
> > +               prefix, arch_nid, nd_start, nd_end - 1,
> > +               prefix, numa_node_to_arch_nid(memblk_nodeid[i]),
> > +               node_memblk_range[i].start, node_memblk_range[i].end -
> 1);
> > +        return -EINVAL;
> > +
> > +    case NO_CONFLICT:
> > +        break;
> > +    }
> > +
> > +    if ( !hotplug )
> > +    {
> > +        node_set(node, memory_nodes_parsed);
> > +        nd->start = nd_start;
> > +        nd->end = nd_end;
> > +    }
> > +
> > +    if ( strcasecmp("Node", prefix) )
> > +        printk(KERN_INFO "NUMA: Node %u %s %u
> [%"PRIpaddr", %"PRIpaddr"]%s\n",
> > +               node, prefix, arch_nid, start, end - 1,
> > +               hotplug ? " (hotplug)" : "");
> > +    else
> > +        printk(KERN_INFO "NUMA: Node %u [%"PRIpaddr", %"PRIpaddr"]%s\n",
> > +               node, start, end - 1, hotplug ? " (hotplug)" : "");
> 
> Hmm, if I'm not mistaken one of the two printk()s and hence also one of
> the two format strings will be dead code / data on every archiecture.
> I wonder if we don't want to have a HAS_NUMA_FW_NODE_ID (name subject
> to improvment) Kconfig setting to avoid such. I could imagine this to
> become relevant also in other code.
> 

If HAS_NUMA_FW_NODE_ID can avoid such kind of code, I will try to
use it in next version.

> > +static int __init numa_scan_nodes(paddr_t start, paddr_t end)
> 
> This function returns only 0 or -1, i.e. is even more of a candidate
> for having "bool" return type than numa_update_node_memblks().
> 

Ok, I will try it.

> > +{
> > +    unsigned int i;
> > +    nodemask_t all_nodes_parsed;
> > +
> > +    /* First clean up the node list */
> > +    for ( i = 0; i < MAX_NUMNODES; i++ )
> > +        cutoff_node(i, start, end);
> > +
> > +    /* When numa is on with good firmware, we can do numa scan nodes.
> */
> > +    if ( arch_numa_disabled(true) )
> > +        return -1;
> > +
> > +    if ( !nodes_cover_memory() )
> > +    {
> > +        numa_fw_bad();
> > +        return -1;
> > +    }
> > +
> > +    memnode_shift = compute_hash_shift(node_memblk_range,
> num_node_memblks,
> > +                                       memblk_nodeid);
> > +
> > +    if ( memnode_shift < 0 )
> > +    {
> > +        printk(KERN_ERR
> > +               "NUMA: No NUMA node hash function found. Contact
> maintainer\n");
> > +        numa_fw_bad();
> > +        return -1;
> > +    }
> > +
> > +    nodes_or(all_nodes_parsed, memory_nodes_parsed,
> processor_nodes_parsed);
> > +
> > +    /* Finally register nodes */
> > +    for_each_node_mask( i, all_nodes_parsed )
> > +    {
> > +        paddr_t size = nodes[i].end - nodes[i].start;
> > +
> > +        if ( size == 0 )
> > +            printk(KERN_INFO "NUMA: node %u has no memory\n", i);
> > +
> > +        setup_node_bootmem(i, nodes[i].start, nodes[i].end);
> > +    }
> 
> May I suggest to eliminate "size" at this occasion, for being used
> only once and rather not helping readability (imo at least)?
> 

Ok.

> > --- a/xen/include/xen/mm.h
> > +++ b/xen/include/xen/mm.h
> > @@ -58,6 +58,8 @@
> >  #include <xen/perfc.h>
> >  #include <public/memory.h>
> >
> > +extern paddr_t mem_hotplug;
> > +
> >  struct page_info;
> >
> >  void put_page(struct page_info *);
> 
> I'm sorry, but I guess this may go about anywhere in the file, but not
> right at the top. I would probably have put it ahead of npfec_kind_t
> or right after dom_cow.

Ok, I will move the place of it.

Thanks,
Wei Chen

> 
> Jan

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.