
Re: [PATCH v2 08/15] x86/hyperlaunch: locate dom0 kernel with hyperlaunch


  • To: "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 30 Jan 2025 16:42:55 +0100
  • Cc: jason.andryuk@xxxxxxx, christopher.w.clark@xxxxxxxxx, stefano.stabellini@xxxxxxx, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 30 Jan 2025 15:43:06 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 26.12.2024 17:57, Daniel P. Smith wrote:
> Look for a subnode of type `multiboot,kernel` within a domain node. If found,
> process the reg property for the MB1 module index. If the bootargs property is
> present and there was not an MB1 string, then use the command line from the
> device tree definition.

While multiboot is apparently the first x86-specific part (as far as Xen goes)
to be put under domain-builder/, I wonder:
- Wouldn't looking for "multiboot,kernel" simply yield nothing on non-x86,
  so having the code under common/ would still be okay?
- What's "multiboot" describing here? The origin of the module? (What other
  origins would then be possible? How would MB1 and MB2 be distinguished?
  What about a native xen.efi boot?) A property of the kernel (when Linux
  doesn't use MB)?

> --- a/xen/arch/x86/domain-builder/core.c
> +++ b/xen/arch/x86/domain-builder/core.c
> @@ -59,6 +59,17 @@ void __init builder_init(struct boot_info *bi)
>  
>          printk(XENLOG_INFO "  Number of domains: %d\n", bi->nr_domains);
>      }
> +    else
> +    {
> +        unsigned int i;
> +
> +        /* Find first unknown boot module to use as Dom0 kernel */
> +        printk("Falling back to using first boot module as dom0\n");

Nit (personal taste?): Why Dom0 in the comment and dom0 in the log
message? I think the former is to be preferred, but at the very least
I see no reason to spell it differently on two adjacent lines.

> +        i = first_boot_module_index(bi, BOOTMOD_UNKNOWN);
> +        bi->mods[i].type = BOOTMOD_KERNEL;
> +        bi->domains[0].kernel = &bi->mods[i];
> +        bi->nr_domains = 1;
> +    }

Relating to a question on an earlier patch: The assumption here is
that nothing could have marked another module as BOOTMOD_KERNEL?

> --- a/xen/arch/x86/domain-builder/fdt.c
> +++ b/xen/arch/x86/domain-builder/fdt.c
> @@ -13,6 +13,114 @@
>  
>  #include "fdt.h"
>  
> +static int __init hl_module_index(void *fdt, int node, uint32_t *idx)

const void *?

> +{
> +    int ret = 0;
> +    const struct fdt_property *prop =
> +        fdt_get_property(fdt, node, "module-index", &ret);
> +
> +    /* FDT error or bad idx pointer, translate to -EINVAL */
> +    if ( ret < 0 || idx == NULL )

This is a static helper - why check the parameter for being NULL?

> +        return -EINVAL;
> +
> +    fdt_cell_as_u32((fdt32_t *)prop->data, idx);

While I'm aware libfdt has quite a few of such casts, they're problematic.
First and foremost this is a Misra violation, for casting away const-ness.
And then how do you know there are 4 bytes of data to legitimately access?
Hence why such casts would better be avoided altogether (or at least be
suitably abstracted away).

(There's at least one other instance further down.)
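
To illustrate the kind of abstraction meant here, a minimal sketch follows. It is not libfdt code: read_cell_u32() is a hypothetical helper, and it assumes the caller has a const pointer to the property value plus its length in bytes (which is what libfdt's fdt_getprop() hands back). The pointer stays const throughout, and nothing is read unless a full cell is present.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define CELL_BYTES 4   /* one FDT cell is 32 bits */

/*
 * Hypothetical helper (not a libfdt function): decode one big-endian
 * 32-bit cell from a property value that is 'len' bytes long.
 */
static int read_cell_u32(const void *data, int len, uint32_t *val)
{
    const unsigned char *p = data;   /* const-ness is preserved */

    /* A missing property or a short value must not be dereferenced. */
    if ( data == NULL || len < CELL_BYTES )
        return -EINVAL;

    /* FDT cells are stored big-endian; assemble the value byte by byte. */
    *val = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | p[3];

    return 0;
}
```

Reading bytes individually also sidesteps any alignment assumption about the property data.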

> +    if ( *idx > MAX_NR_BOOTMODS )

>= ?
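
A generic sketch of the off-by-one concern, under the assumption that the index is used against an array holding exactly the stated number of entries. NR_SLOTS and slot_index_valid() are hypothetical names for illustration only; whether the array in the patch has a spare trailing slot is not something this sketch settles.

```c
#include <stdbool.h>

#define NR_SLOTS 8   /* hypothetical array size, for illustration only */

/*
 * With NR_SLOTS entries the valid indexes are 0 .. NR_SLOTS - 1, so an
 * index equal to NR_SLOTS is already one past the end.
 */
static bool slot_index_valid(unsigned int idx)
{
    return idx < NR_SLOTS;   /* i.e. reject when idx >= NR_SLOTS */
}
```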

> +        return -ERANGE;
> +
> +    return 0;
> +}
> +
> +static int __init dom0less_module_index(
> +    void *fdt, int node, int size_size, int address_size, uint32_t *idx)
> +{
> +    uint64_t size = ~0UL, addr = ~0UL;
> +    int ret =
> +        fdt_get_reg_prop(fdt, node, address_size, size_size, &addr, &size, 1);

    int ret = fdt_get_reg_prop(
                  fdt, node, address_size, size_size, &addr, &size, 1);

> +    /* FDT error or bad idx pointer, translate to -EINVAL */
> +    if ( ret < 0 || idx == NULL )

See above as to the NULL check.

> +        return -EINVAL;
> +
> +    /* Convention is that zero size indicates address is an index */
> +    if ( size != 0 )
> +        return -EOPNOTSUPP;
> +
> +    if ( addr > MAX_NR_BOOTMODS )

>= again?

> +        return -ERANGE;
> +
> +    /*
> +     * MAX_NR_BOOTMODS cannot exceed the max for MB1, represented by a u32,
> +     * thus the cast down to a u32 will be safe due to the prior check.
> +     */

Instead of (or in addition to) the comment, put in a BUILD_BUG_ON()?

Also please can you avoid using u32 even in comments? It'll only yield
needless grep matches once we go about fully purging it.
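
A standalone sketch of the compile-time check, using C11 _Static_assert in place of Xen's BUILD_BUG_ON() so that it builds outside the tree. The value given to MAX_NR_BOOTMODS is an assumption for illustration, narrow_index() is a hypothetical helper, and the range check uses >= as suggested above.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_NR_BOOTMODS 63   /* assumed value, for illustration only */

/* The build fails if the limit ever outgrows a 32-bit index. */
_Static_assert(MAX_NR_BOOTMODS <= UINT32_MAX,
               "module index must fit in uint32_t");

/* Hypothetical helper: range-check a 64-bit value, then narrow it. */
static int narrow_index(uint64_t addr, uint32_t *idx)
{
    if ( addr >= MAX_NR_BOOTMODS )
        return -ERANGE;

    *idx = addr;   /* known to be below MAX_NR_BOOTMODS at this point */

    return 0;
}
```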

> +    *idx = (uint32_t)addr;
> +
> +    return 0;
> +}
> +
> +static int __init process_domain_node(
> +    struct boot_info *bi, void *fdt, int dom_node)

const twice? (I guess I won't mention such any further. I think I
previously asked that you make things as const-correct as possible.)

> +{
> +    int node;
> +    struct boot_domain *bd = &bi->domains[bi->nr_domains];
> +    const char *name = fdt_get_name(fdt, dom_node, NULL) ?: "unknown";
> +
> +    fdt_for_each_subnode(node, fdt, dom_node)
> +    {
> +        if ( fdt_node_check_compatible(fdt, node, "multiboot,kernel") == 0 )
> +        {
> +            unsigned int idx;
> +            int ret = 0;
> +
> +            if ( bd->kernel )
> +            {
> +                printk(XENLOG_ERR "Duplicate kernel module for domain %s)\n",
> +                       name);

It's XENLOG_ERR here (with a seemingly stray closing parenthesis at the end),
yet ...

> +                continue;
> +            }
> +
> +            /* Try hyperlaunch property, fall back to dom0less property. */
> +            if ( hl_module_index(fdt, node, &idx) < 0 )
> +            {
> +                int address_size = fdt_address_cells(fdt, dom_node);
> +                int size_size = fdt_size_cells(fdt, dom_node);
> +
> +                if ( address_size < 0 || size_size < 0 )
> +                    ret = -EINVAL;
> +                else
> +                    ret = dom0less_module_index(
> +                            fdt, node, size_size, address_size, &idx);
> +            }
> +
> +            if ( ret < 0 )
> +            {
> +                printk("  failed processing kernel module for domain %s)\n",

... two blanks (and the same odd parenthesis) here and ...

> +                       name);
> +                return ret;
> +            }
> +
> +            if ( idx > bi->nr_modules )

>= again?

> +            {
> +                printk("  invalid kernel module index for domain node (%d)\n",

... again two blanks here. What's the deal?

> +                       bi->nr_domains);
> +                return -EINVAL;
> +            }
> +
> +            printk("  kernel: boot module %d\n", idx);

This I expect has two leading blanks to somehow align (normal) output.

> +            bi->mods[idx].type = BOOTMOD_KERNEL;
> +            bd->kernel = &bi->mods[idx];
> +        }
> +    }
> +
> +    if ( !bd->kernel )
> +    {
> +        printk(XENLOG_ERR "ERR: no kernel assigned to domain\n");
> +        return -EFAULT;

EFAULT? Maybe ENODATA or some such?

> +    }
> +
> +    return 0;
> +}
> +
>  static int __init find_hyperlaunch_node(const void *fdt)
>  {
>      int hv_node = fdt_path_offset(fdt, "/chosen/hypervisor");
> @@ -74,9 +182,19 @@ int __init walk_hyperlaunch_fdt(struct boot_info *bi)
>  
>      fdt_for_each_subnode(node, fdt, hv_node)
>      {
> +        if ( bi->nr_domains >= MAX_NR_BOOTDOMS )
> +        {
> +            printk(XENLOG_WARNING "WARN: more domains defined than max allowed");

Missing \n. Also would all HL-related diagnostics perhaps better have a
respective prefix (for disambiguation and grep-ability)?
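
One possible shape for such a prefix, as a sketch only. HL_PREFIX and hl_format() are hypothetical names, and snprintf() stands in for printk() so the fragment builds outside Xen. The prefix is joined to the format string by string-literal concatenation, so the format must be a literal.

```c
#include <stdio.h>

#define HL_PREFIX "hyperlaunch: "   /* hypothetical prefix string */

/*
 * The first variadic argument is the format literal; the compiler
 * concatenates it with HL_PREFIX, so every message carries the prefix.
 */
#define hl_format(buf, size, ...) \
    snprintf(buf, size, HL_PREFIX __VA_ARGS__)
```

A call such as hl_format(buf, sizeof(buf), "max %d domains\n", 4) then produces a line that a grep for the prefix will find.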

> --- a/xen/arch/x86/domain-builder/fdt.h
> +++ b/xen/arch/x86/domain-builder/fdt.h
> @@ -3,6 +3,8 @@
>  #define __XEN_X86_FDT_H__
>  
>  #include <xen/init.h>
> +#include <xen/libfdt/libfdt.h>
> +#include <xen/libfdt/libfdt-xen.h>
>  
>  #include <asm/bootinfo.h>
>  
> @@ -10,6 +12,7 @@
>  #define HYPERLAUNCH_MODULE_IDX 0
>  
>  #ifdef CONFIG_DOMAIN_BUILDER
> +
>  int has_hyperlaunch_fdt(struct boot_info *bi);
>  int walk_hyperlaunch_fdt(struct boot_info *bi);
>  #else

I can't explain the need for either of these two hunks.

> --- a/xen/include/xen/libfdt/libfdt-xen.h
> +++ b/xen/include/xen/libfdt/libfdt-xen.h
> @@ -13,6 +13,82 @@
>  
>  #include <xen/libfdt/libfdt.h>
>  
> +static inline int __init fdt_cell_as_u32(const fdt32_t *cell, uint32_t *val)
> +{
> +    *val = fdt32_to_cpu(*cell);
> +
> +    return 0;
> +}
> +
> +static inline int __init fdt_cell_as_u64(const fdt32_t *cell, uint64_t *val)
> +{
> +    *val = ((uint64_t)fdt32_to_cpu(cell[0]) << 32) |
> +           (uint64_t)fdt32_to_cpu(cell[1]);

As we try to conserve on the number of casts: There's no need for the
latter one, is there?

I'll leave it to DT folks to confirm (or otherwise) that the cell indexes
are invariant no matter what the endian-ness.

> +    return 0;

What's the point of this return value for both of the functions? Wouldn't
they better return the value if no error can occur anyway? Afaics none of
the callers checks the return value right now.
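
A sketch of what value-returning variants could look like. All names are hypothetical, be32_decode() is a stand-in for libfdt's fdt32_to_cpu() working on raw bytes so that the fragment is self-contained, and the most significant cell is assumed to come first, as in the quoted patch.

```c
#include <stdint.h>

/* Stand-in for fdt32_to_cpu(): decode a big-endian 32-bit cell. */
static uint32_t be32_decode(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | p[3];
}

/* Hypothetical variant: return the value rather than a constant 0. */
static uint32_t cell_to_u32(const unsigned char *cell)
{
    return be32_decode(cell);
}

/* Hypothetical variant: combine two consecutive cells, high cell first. */
static uint64_t cells_to_u64(const unsigned char *cells)
{
    /*
     * Only the shifted half needs an explicit widening cast; the other
     * operand is converted to 64 bits by the bitwise OR.
     */
    return ((uint64_t)be32_decode(cells) << 32) | be32_decode(cells + 4);
}
```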

> +}
> +
> +/*
> + * Property: reg
> + *
> + * Defined in Section 2.3.6 of the Device Tree Specification is the "reg"
> + * standard property. The property is a prop-encoded-array that is encoded as
> + * an arbitrary number of (address, length) pairs.
> + */
> +static inline int __init fdt_get_reg_prop(
> +    const void *fdt, int node, unsigned int asize, unsigned int ssize,
> +    uint64_t *addr, uint64_t *size, unsigned int pairs)
> +{
> +    int ret;
> +    unsigned int i, count;
> +    const struct fdt_property *prop;
> +    fdt32_t *cell;
> +
> +    /* FDT spec max size is 4 (128bit int), but largest arch int size is 64 */
> +    if ( ssize > 2 || asize > 2 )
> +        return -EINVAL;

Hmm, so asize and ssize are already 32-bit granular. Slightly odd.

> +    prop = fdt_get_property(fdt, node, "reg", &ret);
> +    if ( !prop || ret < sizeof(u32) )
> +        return ret < 0 ? ret : -EINVAL;
> +
> +    /* Get the number of (addr, size) pairs and clamp down. */
> +    count = fdt32_to_cpu(prop->len) / (ssize + asize);

What if there's a remainder?

> +    count = count < pairs ? count : pairs;

Use min()?
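
A sketch covering both of the last two remarks. It assumes the property length is a byte count (which is how libfdt reports it), so a pair occupies (asize + ssize) cells of 4 bytes each. reg_pair_count() is a hypothetical helper, and the local MIN() macro merely stands in for Xen's min().

```c
#include <errno.h>

#define CELL_BYTES 4                        /* one FDT cell is 32 bits */
#define MIN(a, b) ((a) < (b) ? (a) : (b))   /* stand-in for Xen's min() */

/*
 * Hypothetical helper: number of complete (address, size) pairs in a
 * "reg" value of len_bytes bytes, clamped to max_pairs. A length that
 * is not a whole number of pairs is rejected instead of being truncated.
 */
static int reg_pair_count(unsigned int len_bytes, unsigned int asize,
                          unsigned int ssize, unsigned int max_pairs)
{
    unsigned int pair_bytes = (asize + ssize) * CELL_BYTES;

    if ( pair_bytes == 0 || len_bytes % pair_bytes )
        return -EINVAL;

    return MIN(len_bytes / pair_bytes, max_pairs);
}
```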

Jan



 

