
RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Wei Chen <Wei.Chen@xxxxxxx>
  • Date: Sun, 26 Sep 2021 10:11:51 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none
  • Authentication-results-original: kernel.org; dkim=none (message not signed) header.d=none;kernel.org; dmarc=none action=none header.from=arm.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, "jbeulich@xxxxxxxx" <jbeulich@xxxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>, "roger.pau@xxxxxxxxxx" <roger.pau@xxxxxxxxxx>, "wl@xxxxxxx" <wl@xxxxxxx>
  • Delivery-date: Sun, 26 Sep 2021 10:12:31 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Original-authentication-results: kernel.org; dkim=none (message not signed) header.d=none;kernel.org; dmarc=none action=none header.from=arm.com;
  • Thread-index: AQHXsHMRZjZ8eZQrbkW8SEq2x0f00KuyVH0AgABDnECAAQJ+AIACfOAg
  • Thread-topic: [PATCH 08/37] xen/x86: add detection of discontinous node memory range

Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Sent: September 25, 2021 3:53
> To: Wei Chen <Wei.Chen@xxxxxxx>
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>; xen-
> devel@xxxxxxxxxxxxxxxxxxxx; julien@xxxxxxx; Bertrand Marquis
> <Bertrand.Marquis@xxxxxxx>; jbeulich@xxxxxxxx; andrew.cooper3@xxxxxxxxxx;
> roger.pau@xxxxxxxxxx; wl@xxxxxxx
> Subject: RE: [PATCH 08/37] xen/x86: add detection of discontinous node
> memory range
> 
> On Fri, 24 Sep 2021, Wei Chen wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > > Sent: September 24, 2021 8:26
> > > To: Wei Chen <Wei.Chen@xxxxxxx>
> > > Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; sstabellini@xxxxxxxxxx;
> julien@xxxxxxx;
> > > Bertrand Marquis <Bertrand.Marquis@xxxxxxx>; jbeulich@xxxxxxxx;
> > > andrew.cooper3@xxxxxxxxxx; roger.pau@xxxxxxxxxx; wl@xxxxxxx
> > > Subject: Re: [PATCH 08/37] xen/x86: add detection of discontinous node
> > > memory range
> > >
> > > CC'ing x86 maintainers
> > >
> > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > One NUMA node may contain several memory blocks. In the current Xen
> > > > code, Xen maintains a single memory range per node to cover all of
> > > > that node's memory blocks. But this creates a problem: if the gap
> > > > between two of a node's memory blocks contains memory blocks that do
> > > > not belong to this node (remote memory blocks), the node's memory
> > > > range will be expanded to cover these remote memory blocks.
> > > >
> > > > One node's memory range containing other nodes' memory is obviously
> > > > not reasonable. It means the current NUMA code can only support nodes
> > > > with contiguous memory blocks. However, on a physical machine, the
> > > > addresses of multiple nodes can be interleaved.
> > > >
> > > > So in this patch, we add code to detect discontinuous memory blocks
> > > > for one node. NUMA initialization will fail, and error messages will
> > > > be printed, when Xen detects such a hardware configuration.
> > >
> > > At least on ARM, it is not just memory that can be interleaved, but also
> > > MMIO regions. For instance:
> > >
> > > node0 bank0 0-0x1000000
> > > MMIO 0x1000000-0x1002000
> > > Hole 0x1002000-0x2000000
> > > node0 bank1 0x2000000-0x3000000
> > >
> > > So I am not familiar with the SRAT format, but I think on ARM the check
> > > would look different: we would just look for multiple memory ranges
> > > under a device_type = "memory" node of a NUMA node in device tree.
> > >
> > >
> >
> > Should I include/refine the above message in the commit log?
> 
> Let me ask you a question first.
> 
> With the NUMA implementation of this patch series, can we deal with
> cases where each node has multiple memory banks, not interleaved?

Yes.

> As an example:
> 
> node0: 0x0        - 0x10000000
> MMIO : 0x10000000 - 0x20000000
> node0: 0x20000000 - 0x30000000
> MMIO : 0x30000000 - 0x50000000
> node1: 0x50000000 - 0x60000000
> MMIO : 0x60000000 - 0x80000000
> node2: 0x80000000 - 0x90000000
> 
> 
> I assume we can deal with this case simply by setting node0 memory to
> 0x0-0x30000000 even if there is actually something else, a device, that
> doesn't belong to node0 in between the two node0 banks?

While this configuration is rare in SoC design, it is not impossible.
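
In that layout, node0's range would just grow to cover both of its
banks, and the MMIO hole in between ends up inside the range but is
never used for allocation. A minimal sketch of that kind of per-node
bookkeeping (illustrative types and names, not the actual Xen code):

#include <stdint.h>

/* Illustrative per-node span; real code tracks more state than this. */
struct node_span {
    uint64_t start;   /* inclusive */
    uint64_t end;     /* exclusive; start == end means "empty" */
};

/* Grow a node's span to cover a newly parsed bank [start, end). */
static void node_cover_bank(struct node_span *n, uint64_t start, uint64_t end)
{
    if ( n->start == n->end )         /* first bank for this node */
    {
        n->start = start;
        n->end = end;
        return;
    }
    if ( start < n->start )
        n->start = start;
    if ( end > n->end )
        n->end = end;
}

/*
 * For the layout above:
 *   node_cover_bank(&node0, 0x00000000, 0x10000000);
 *   node_cover_bank(&node0, 0x20000000, 0x30000000);
 * leaves node0 as [0x0, 0x30000000). The MMIO hole at
 * 0x10000000-0x20000000 sits inside the span, but no memory is
 * ever allocated from it.
 */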

> 
> Is it only other nodes' memory interleaved that causes issues? In other
> words, only the following is a problematic scenario?
> 
> node0: 0x0        - 0x10000000
> MMIO : 0x10000000 - 0x20000000
> node1: 0x20000000 - 0x30000000
> MMIO : 0x30000000 - 0x50000000
> node0: 0x50000000 - 0x60000000
> 
> Because node1 is in between the two ranges of node0?
> 

But only ranges with device_type="memory" are added to the memory
allocator. For MMIO there are two cases:
1. The MMIO region doesn't have a NUMA id property.
2. The MMIO region has a NUMA id property, as some PCIe controllers do.
   But we don't need to handle these kinds of MMIO devices in memory
   block parsing, because we never allocate memory from these MMIO
   ranges. For accessing them, we would need a NUMA-aware PCIe
   controller driver or generic NUMA-aware MMIO access APIs.

> 
> I am asking these questions because it is certainly possible to have
> multiple memory ranges for each NUMA node in device tree, either by
> specifying multiple ranges with a single "reg" property, or by
> specifying multiple memory nodes with the same numa-node-id.
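
For illustration, a minimal libfdt-based scan of such device tree
memory nodes might look like the sketch below (illustrative function
and callback names; it assumes #address-cells = #size-cells = 2
instead of reading the parent's cell counts):

#include <libfdt.h>
#include <stdint.h>
#include <string.h>

/*
 * Walk the flattened device tree and report every memory bank together
 * with its numa-node-id.
 */
static void scan_memory_nodes(const void *fdt,
                              void (*cb)(unsigned int nid,
                                         uint64_t start, uint64_t size))
{
    int node, depth = 0;

    for ( node = fdt_next_node(fdt, -1, &depth); node >= 0;
          node = fdt_next_node(fdt, node, &depth) )
    {
        int len, i, nr;
        const char *type = fdt_getprop(fdt, node, "device_type", &len);
        const fdt32_t *nid_prop, *reg;
        unsigned int nid = 0;           /* default node when property absent */

        /* Only device_type = "memory" nodes feed the memory allocator. */
        if ( !type || strcmp(type, "memory") )
            continue;

        nid_prop = fdt_getprop(fdt, node, "numa-node-id", &len);
        if ( nid_prop && len >= (int)sizeof(*nid_prop) )
            nid = fdt32_to_cpu(*nid_prop);

        /* "reg" may carry several <base size> ranges for one node. */
        reg = fdt_getprop(fdt, node, "reg", &len);
        if ( !reg )
            continue;

        nr = len / (4 * sizeof(fdt32_t));   /* 2 cells base + 2 cells size */
        for ( i = 0; i < nr; i++ )
        {
            uint64_t base = ((uint64_t)fdt32_to_cpu(reg[i * 4]) << 32) |
                            fdt32_to_cpu(reg[i * 4 + 1]);
            uint64_t size = ((uint64_t)fdt32_to_cpu(reg[i * 4 + 2]) << 32) |
                            fdt32_to_cpu(reg[i * 4 + 3]);

            cb(nid, base, size);
        }
    }
}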



 

