[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen/arm: Skip memory nodes if not enabled
Hi Julien, On 27/09/2023 13:01, Julien Grall wrote: > > > Hi Michal, > > On 26/09/2023 09:36, Michal Orzel wrote: >> On 26/09/2023 07:33, Leo Yan wrote: >>> >>> >>> During the Linux kernel booting, an error is reported by the Xen >>> hypervisor: >>> >>> (XEN) arch/arm/p2m.c:2202: d0v0: Failing to acquire the MFN 0x1a02dc >>> >>> The kernel attempts to use an invalid memory frame number, which can be >>> converted to: 0x1a02dc << PAGE_SHIFT, resulting in 0x1_a02d_c000. >>> >>> The invalid memory frame falls into a reserved memory node, which is >>> defined in the device tree as below: >>> >>> reserved-memory { >>> #address-cells = <0x02>; >>> #size-cells = <0x02>; >>> ranges; >>> >>> ... >>> >>> ethosn_reserved { >>> compatible = "shared-dma-pool"; >>> reg = <0x01 0xa0000000 0x00 0x20000000>; >>> status = "disabled"; >>> no-map; >>> }; >>> >>> ... >>> }; >>> >>> Xen excludes all reserved memory regions from the frame management >>> through the dt_unreserved_regions() function. On the other hand, the >>> reserved memory nodes are passed to the Linux kernel. However, the Linux >>> kernel ignores the 'ethosn_reserved' node since its status is >>> "disabled". This leads to the Linux kernel to allocate pages from the >>> reserved memory range. >>> >>> As a result, when the kernel passes the data structure residing in the >>> frame 0x1_a02d_c000 in the Xen hypervisor, the hypervisor detects that >>> it misses to manage the frame and reports the error. >>> >>> Essentially, this issue is caused by the Xen hypervisor which misses to >>> handle the status for the memory nodes (for both the normal memory nodes >>> and the reserved memory nodes) and transfers them to the Linux kernel. >>> >>> This patch introduces a function memory_node_is_available(). If it >>> detects a memory node is not enabled, the hypervisor will not add the >>> memory region into the memory lists. In the end, this avoids to generate >>> the memory nodes from the memory lists sent to the kernel and the kernel >>> cannot use the disabled memory nodes any longer. >> >> Interesting. So FWICS, we have 2 issues that have a common ground: >> 1) If the reserved memory node has a status "disabled", it implies that this >> region >> is no longer reserved and can be used which is not handled today by Xen and >> leads >> to the above mentioned problem. >> >> 2) If the memory node has a status "disabled" it implies that it should not >> be used >> which is not the case in current Xen. This means that at the moment, Xen >> would try >> to use such a memory region which is incorrect. >> >> I think the commit msg should be more generic and focus on these two issues. >> Summary: >> Xen does not handle the status property of memory nodes and ends up using >> them. >> Xen does not handle the status property of reserved memory nodes. If status >> is disabled >> it means the reserved region shall no longer be treated as reserved. Xen >> passes the reserved >> memory nodes untouched to dom0 fdt and creates a memory node to cover >> reserved regions. >> Disabled reserved memory nodes are ignored by the guest which leaves us with >> the exposed >> memory nodes. Guest can access such a region but it is excluded from the >> page management in Xen >> which results in Xen being unable to obtain the corresponding MFN. >> >>> >>> Signed-off-by: Leo Yan <leo.yan@xxxxxxxxxx> >>> --- >>> xen/arch/arm/bootfdt.c | 16 ++++++++++++++++ >>> 1 file changed, 16 insertions(+) >>> >>> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c >>> index 2673ad17a1..b46dde06a9 100644 >>> --- a/xen/arch/arm/bootfdt.c >>> +++ b/xen/arch/arm/bootfdt.c >>> @@ -206,11 +206,27 @@ int __init device_tree_for_each_node(const void *fdt, >>> int node, >>> return 0; >>> } >>> >>> +static bool __init memory_node_is_available(const void *fdt, unsigned long >>> node) >> This function is not strictly related to memory node so it would be better >> to call it e.g. device_tree_node_is_available. > > +1. > >>> +{ >>> + const char *status = fdt_getprop(fdt, node, "status", NULL); >>> + >>> + if (!status) >> white spaces between brackets and condition >> if ( !status ) >> >>> + return true; >>> + >>> + if (!strcmp(status, "ok") || !strcmp(status, "okay")) >> white spaces between brackets and condition >> if ( !strcmp(status, "ok") || !strcmp(status, "okay") ) >> >>> + return true; >>> + >>> + return false; >>> +} >>> + >>> static int __init process_memory_node(const void *fdt, int node, >>> const char *name, int depth, >>> u32 address_cells, u32 size_cells, >>> void *data) >>> { >>> + if (!memory_node_is_available(fdt, node)) >>> + return 0; >> I would move this check to device_tree_get_meminfo() > > I am ok with that. But the commit message would need to gain a paragraph > explaining that we will now support "status" for static memory/heap. > >>> + >>> return device_tree_get_meminfo(fdt, node, "reg", address_cells, >>> size_cells, >>> data, MEMBANK_DEFAULT); >>> } >>> -- >>> 2.39.2 >>> >>> >> >> Also, I think it would be nice to add ASSERT(bootinfo.mem.nr_banks); e.g. in >> boot_fdt_info(). >> Otherwise the panic from Xen when there is no memory bank: >> The frametable cannot cover the physical region ... >> is not really meaningful for normal users. >> >> This is just my suggestion (@Julien ?) > > I think a check for the number of banks makes sense. But I would prefer > if the check also happens in production. So, something like: > > if ( !bootinfo.mem.nr_banks ) > panic(...); > > We already have one in the setup_mm() for arm32. So we need another one > for the arm64 version. The other solution is to consolidate it in one > place you suggested. > > I have a slightly preference for having it in setup_mm() even if this is > duplicated. Either way is fine. The advantage of placing the check in boot_fdt_info() is that we can have a single check instead of duplicated and we do the check before the "first" use which happens to be the banks sorting. The advantage of setup_mm() is that it is the one dealing with memory banks and is called after early_print_info() so user can see some additional info. @Leo, will you send a patch for that? Don't feel obliged as it is not strictly related to your patch in which case I can handle it. ~Michal
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |