| [Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
 Re: [PATCH 05/22] x86/srat: vmap the pages for acpi_slit
 
 
On 13/01/2023 09:16, Jan Beulich wrote:
 
On 13.01.2023 00:15, Julien Grall wrote:
 
Hi,
On 04/01/2023 10:23, Jan Beulich wrote:
 
On 23.12.2022 12:31, Julien Grall wrote:
 
On 20/12/2022 15:30, Jan Beulich wrote:
 
On 16.12.2022 12:48, Julien Grall wrote:
 
From: Hongyan Xia <hongyxia@xxxxxxxxxx>
This avoids the assumption that boot pages are in the direct map.
Signed-off-by: Hongyan Xia <hongyxia@xxxxxxxxxx>
Signed-off-by: Julien Grall <jgrall@xxxxxxxxxx>
 
Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
However, ...
 
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -139,7 +139,8 @@ void __init acpi_numa_slit_init(struct acpi_table_slit 
*slit)
                return;
        }
        mfn = alloc_boot_pages(PFN_UP(slit->header.length), 1);
-       acpi_slit = mfn_to_virt(mfn_x(mfn));
+       acpi_slit = vmap_contig_pages(mfn, PFN_UP(slit->header.length));
 
... with the increased use of vmap space the VA range used will need
growing. And that's perhaps better done ahead of time than late.
 
I will have a look to increase the vmap().
 
 
+       BUG_ON(!acpi_slit);
 
Similarly relevant for the earlier patch: It would be nice if boot
failure for optional things like NUMA data could be avoided.
 
If you can't map (or allocate the memory), then you are probably in a
very bad situation because both should really not fail at boot.
So I think this is correct to crash early because the admin will be able
to look what went wrong. Otherwise, it may be missed in the noise.
 
Well, I certainly can see one taking this view. However, at least in
principle allocation (or mapping) may fail _because_ of NUMA issues.
 
Right. I read this as the user will likely want to add "numa=off" on the
command line.
 
At which point it would be better to boot with NUMA support turned off
 
I have to disagree with "better" here. This may work for a user with a
handful of hosts. But for large scale setup, you will really want a
failure early rather than having a host booting with an expected feature
disabled (the NUMA issues may be a broken HW).
It is better to fail and then ask the user to specify "numa=off". At
least the person made a conscientious decision to turn off the feature.
 
Yet how would the observing admin make the connection from the BUG_ON()
that triggered and the need to add "numa=off" to the command line,
without knowing Xen internals?
 
I am happy to switch to a panic() that suggests to turn off NUMA.
 
 
I am curious to hear the opinion from the others.
 
So am I.
Jan
 
--
Julien Grall
 
 |