[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/arm: Skip memory nodes if not enabled



Hi Michal,

On 26/09/2023 09:36, Michal Orzel wrote:
On 26/09/2023 07:33, Leo Yan wrote:


During the Linux kernel booting, an error is reported by the Xen
hypervisor:

   (XEN) arch/arm/p2m.c:2202: d0v0: Failing to acquire the MFN 0x1a02dc

The kernel attempts to use an invalid memory frame number, which can be
converted to: 0x1a02dc << PAGE_SHIFT, resulting in 0x1_a02d_c000.

The invalid memory frame falls into a reserved memory node, which is
defined in the device tree as below:

   reserved-memory {
           #address-cells = <0x02>;
           #size-cells = <0x02>;
           ranges;

           ...

           ethosn_reserved {
                   compatible = "shared-dma-pool";
                   reg = <0x01 0xa0000000 0x00 0x20000000>;
                   status = "disabled";
                   no-map;
           };

           ...
   };

Xen excludes all reserved memory regions from the frame management
through the dt_unreserved_regions() function. On the other hand, the
reserved memory nodes are passed to the Linux kernel. However, the Linux
kernel ignores the 'ethosn_reserved' node since its status is
"disabled". This leads to the Linux kernel to allocate pages from the
reserved memory range.

As a result, when the kernel passes the data structure residing in the
frame 0x1_a02d_c000 in the Xen hypervisor, the hypervisor detects that
it misses to manage the frame and reports the error.

Essentially, this issue is caused by the Xen hypervisor which misses to
handle the status for the memory nodes (for both the normal memory nodes
and the reserved memory nodes) and transfers them to the Linux kernel.

This patch introduces a function memory_node_is_available(). If it
detects a memory node is not enabled, the hypervisor will not add the
memory region into the memory lists. In the end, this avoids to generate
the memory nodes from the memory lists sent to the kernel and the kernel
cannot use the disabled memory nodes any longer.

Interesting. So FWICS, we have 2 issues that have a common ground:
1) If the reserved memory node has a status "disabled", it implies that this 
region
is no longer reserved and can be used which is not handled today by Xen and 
leads
to the above mentioned problem.

2) If the memory node has a status "disabled" it implies that it should not be 
used
which is not the case in current Xen. This means that at the moment, Xen would 
try
to use such a memory region which is incorrect.

I think the commit msg should be more generic and focus on these two issues.
Summary:
Xen does not handle the status property of memory nodes and ends up using them.
Xen does not handle the status property of reserved memory nodes. If status is 
disabled
it means the reserved region shall no longer be treated as reserved. Xen passes 
the reserved
memory nodes untouched to dom0 fdt and creates a memory node to cover reserved 
regions.
Disabled reserved memory nodes are ignored by the guest which leaves us with 
the exposed
memory nodes. Guest can access such a region but it is excluded from the page 
management in Xen
which results in Xen being unable to obtain the corresponding MFN.


Signed-off-by: Leo Yan <leo.yan@xxxxxxxxxx>
---
  xen/arch/arm/bootfdt.c | 16 ++++++++++++++++
  1 file changed, 16 insertions(+)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 2673ad17a1..b46dde06a9 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -206,11 +206,27 @@ int __init device_tree_for_each_node(const void *fdt, int 
node,
      return 0;
  }

+static bool __init memory_node_is_available(const void *fdt, unsigned long 
node)
This function is not strictly related to memory node so it would be better to 
call it e.g. device_tree_node_is_available.

+1.

+{
+    const char *status = fdt_getprop(fdt, node, "status", NULL);
+
+    if (!status)
white spaces between brackets and condition
if ( !status )

+        return true;
+
+    if (!strcmp(status, "ok") || !strcmp(status, "okay"))
white spaces between brackets and condition
if ( !strcmp(status, "ok") || !strcmp(status, "okay") )

+        return true;
+
+    return false;
+}
+
  static int __init process_memory_node(const void *fdt, int node,
                                        const char *name, int depth,
                                        u32 address_cells, u32 size_cells,
                                        void *data)
  {
+    if (!memory_node_is_available(fdt, node))
+        return 0;
I would move this check to device_tree_get_meminfo()

I am ok with that. But the commit message would need to gain a paragraph explaining that we will now support "status" for static memory/heap.

+
      return device_tree_get_meminfo(fdt, node, "reg", address_cells, 
size_cells,
                                     data, MEMBANK_DEFAULT);
  }
--
2.39.2



Also, I think it would be nice to add ASSERT(bootinfo.mem.nr_banks); e.g. in 
boot_fdt_info().
Otherwise the panic from Xen when there is no memory bank:
The frametable cannot cover the physical region ...
is not really meaningful for normal users.

This is just my suggestion (@Julien ?)

I think a check for the number of banks makes sense. But I would prefer if the check also happens in production. So, something like:

if ( !bootinfo.mem.nr_banks )
  panic(...);

We already have one in the setup_mm() for arm32. So we need another one for the arm64 version. The other solution is to consolidate it in one place you suggested.

I have a slightly preference for having it in setup_mm() even if this is duplicated.

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.