|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [RFC PATCH] device-tree: size first hwdom bank for boot modules
Hi Mykola,
The patch looks good!
I would try to add a CI coverage for QEMU aarch64 tests, since QEMU
supports multiple RAM banks topology.
What do you think?
Also, few remarks below.
On Sun, May 17, 2026 at 11:57:56PM +0300, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@xxxxxxxx>
>
> With LLC coloring enabled, the hardware domain memory comes from
> allocate_hwdom_memory(), not from the fixed direct-map banks used when
> coloring is off.
>
> Commit de99f3263555 ("device-tree: Improve hwdom memory allocation for
> DMA") made that allocator sort free host regions by ascending address so
> Dom0 gets DMA-capable low memory first. The first bank filter still only
> required 128MB. That can select a low region which is large enough for
> the heuristic, but not large enough for place_modules() to put the Dom0
> kernel, generated DTB and initrd contiguously in bank 0.
>
> Ask arch code for any additional first-bank size requirement. On Arm,
> compute it from the actual Dom0 kernel placement, rounded initrd size and
> generated DTB size hint. For 64-bit Image kernels, include the text offset
> from the candidate bank start, because the returned requirement is compared
> with a bank size measured from that start. The hint covers both the normal
> Device Tree path and the minimal DTB created for ACPI boot.
>
> Check the first-bank threshold against the size which will actually be
> assigned to Dom0, after capping the host region by the remaining unassigned
> Dom0 memory. Otherwise a large host region could pass the test but still
> produce a first guest bank too small for place_modules().
>
> Use the typed min()/max() helpers for this normal allocation arithmetic;
> MIN()/MAX() are intended for preprocessor-style contexts and skip the type
> checking provided by the lowercase helpers.
>
> This keeps the DMA-oriented allocation policy from de99f3263555 while
> preventing a too-small bank 0 from reaching place_modules().
>
> Fixes: de99f3263555 ("device-tree: Improve hwdom memory allocation for DMA")
> Signed-off-by: Mykola Kvach <mykola_kvach@xxxxxxxx>
> ---
> Test/setup notes:
>
> The failure was reproduced on a Renesas H3ULCB/R-Car H3 (r8a7795)
> arm64 board booted through U-Boot/TFTP and using huge initrd.
>
> Relevant Xen command line excerpt:
> dom0_mem=2048M llc-coloring=on
>
> Boot module layout from Xen:
> MODULE[2]: 0x0000000084000040-0x000000008e75d92f Ramdisk
> MODULE[3]: 0x00000000a0000000-0x00000000a3ffffff Kernel
> MODULE[4]: 0x00000000a4000000-0x00000000a400ffff XSM Policy
>
> The initrd is about 168MB. With LLC coloring enabled and the low-address
> allocation policy from de99f3263555, Dom0 can receive a 192MB first bank:
> d0 BANK[0] 0x00000048000000-0x00000054000000 (192MB)
>
> That bank satisfies the old 128MB minimum but is too small for the
> rounded Dom0 kernel, generated DTB and initrd placement. The observed
> failure before this patch was:
> Panic on CPU 0:
> Not enough memory in the first bank for the kernel+dtb+initrd
>
> With this patch, the same boot skips the too-small low region for bank 0
> and reaches Dom0:
> d0 BANK[0] 0x00000057000000-0x00000084000000 (720MB)
> d0 BANK[1] 0x0000008e800000-0x000000c0000000 (792MB)
> d0 BANK[2] 0x00000500000000-0x00000521800000 (536MB)
> d0: extended region 0: 0x48000000->0x54000000
> Loading zImage from 0x00000000a0000000 to 0x57000000-0x5b000000
> Loading d0 initrd from 0x0000000084000040 to 0x5f200000-0x6995d8f0
> Loading d0 DTB to 0x5f000000-0x5f011c80
> Linux version 5.10.194-yocto-standard
> ---
> xen/arch/arm/acpi/domain_build.c | 2 --
> xen/arch/arm/domain_build.c | 8 ++++++
> xen/arch/arm/include/asm/domain_build.h | 4 +++
> xen/arch/arm/include/asm/kernel.h | 8 ++++++
> xen/arch/arm/kernel.c | 35 +++++++++++++++++++++++++
> xen/common/device-tree/domain-build.c | 27 ++++++++++++++-----
> xen/include/xen/fdt-kernel.h | 8 ++++++
> 7 files changed, 83 insertions(+), 9 deletions(-)
>
> diff --git a/xen/arch/arm/acpi/domain_build.c
> b/xen/arch/arm/acpi/domain_build.c
> index 249d899c33..db16f7fa94 100644
> --- a/xen/arch/arm/acpi/domain_build.c
> +++ b/xen/arch/arm/acpi/domain_build.c
> @@ -26,8 +26,6 @@
> #undef virt_to_mfn
> #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
>
> -#define ACPI_DOM0_FDT_MIN_SIZE 4096
> -
> static int __init acpi_iomem_deny_access(struct domain *d)
> {
> acpi_status status;
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 1efddc60ef..226e053c68 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -115,6 +115,14 @@ int __init parse_arch_dom0_param(const char *s, const
> char *e)
> (IS_ENABLED(CONFIG_STATIC_SHM) ? \
> (NR_SHMEM_BANKS * (160 + 16)) : 0))
>
> +paddr_t __init dom0_get_fdt_size_hint(void)
> +{
> + if ( !acpi_disabled )
> + return ACPI_DOM0_FDT_MIN_SIZE;
> +
> + return fdt_totalsize(device_tree_flattened) + DOM0_FDT_EXTRA_SIZE;
I would invert the condition so it is read more straightforward:
if ( acpi_disabled )
return fdt_totalsize(device_tree_flattened) + DOM0_FDT_EXTRA_SIZE;
return ACPI_DOM0_FDT_MIN_SIZE;
[..]
> diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
> index b72585b7fe..3644663e2f 100644
> --- a/xen/arch/arm/kernel.c
> +++ b/xen/arch/arm/kernel.c
> @@ -128,6 +128,41 @@ static paddr_t __init kernel_zimage_place(struct
> kernel_info *info)
> return load_addr;
> }
>
> +static paddr_t __init kernel_placement_size(paddr_t load_addr, paddr_t len)
> +{
> + return ROUNDUP(load_addr + len, MB(2)) - load_addr;
> +}
> +
> +paddr_t __init arch_get_min_first_bank_size(struct kernel_info *info,
> + paddr_t bank_start)
> +{
> + const struct boot_module *mod = info->bd.initrd;
> + const paddr_t initrd_len = ROUNDUP(mod ? mod->size : 0, MB(2));
> + const paddr_t dtb_len = ROUNDUP(dom0_get_fdt_size_hint(), MB(2));
> + paddr_t kernsize;
> +
> +#ifdef CONFIG_HAS_DOMAIN_TYPE
Perhaps use `IS_ENABLED(CONFIG_HAS_DOMAIN_TYPE)` to reduce ifdefery?
My impression that IS_ENABLED() is preferred.
> + if ( (info->type == DOMAIN_64BIT) && (info->image.start == 0) )
> + {
> + paddr_t load_addr = bank_start + info->image.text_offset;
> +
> + /*
> + * The caller compares this value with a size measured from
> + * bank_start, so include the text offset before the kernel.
> + */
> + kernsize = ROUNDUP(load_addr + info->image.len, MB(2)) - bank_start;
> + return kernsize + initrd_len + dtb_len;
> + }
> +#endif
> +
> + if ( info->image.start == 0 )
Here too: invert the condition?
if ( info->image.start )
kernsize = kernel_placement_size(info->image.start, info->image.len);
else
...
> + kernsize = ROUNDUP(info->image.len, MB(2));
> + else
> + kernsize = kernel_placement_size(info->image.start, info->image.len);
> +
> + return kernsize + initrd_len + dtb_len;
> +}
--
Denis
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |