[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v10 1/3] xen/domain: unify domain ID allocation


  • To: <dmkhn@xxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>
  • Date: Thu, 17 Jul 2025 12:27:48 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=proton.me smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0)
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=p1ENLuDZMzNVbgExsyjCzasqq1tyqNrl9pxOtGJO18U=; b=YdQRK5xIU0EF7bQPE3ncMeE6zExf9sj7RTU4cD9uDb874wlp4kTwOwr+gNrf80JtGflHEVxKczdAdQLHHGvg5uG40s6onylKcaiTAXZd5In24PBJqktfMUCB7dsnMRkKY+iuEv1HP/QdU2dPwMiA4jwmh4aX0mzxB4CZGQ9Az6u8dMth8qTGhMqqJRWJfqGCtxB3JO0jMichiACTn0Xed0Sy7090y5Z74ygcF8Zb+ZSIt6aw59QnOs6M0aUl29iFi+Yp9uSwQ58Ki172Ny5gn2QG6s64pXmnPzNIGd/QNq0bRPpc4K0kxLoQxviUZue6WYB/Et1iOxx81AzxxR3OnQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=DA/xzqJK5+CWJtLyF0TjjMjnI8dQLqdgX1cC/xO/9WFleeeptouNCJIUcNNhK+/N8nQK8vLNFLUOmZn7AsqPbjTpt3pl5Bo3Sw7viabDf5yxISLI7wUZcS8eyvqNUgZ9M2NdCH0WPUG+iF8rEIdmwcgV30NBMizLMt2GuTAHIaDly5QfsWr9zzAa6y9ogfNVTNAF+bVtSDLHMYuVzJ3ROYQOXN2kWxpgPnuR07gol6fB3hqFTV6giBeF/jOjoSE4onLmp6E2LN7UB5OMvB52fEv4u6w3ZioiQlD8lUa/gBWcm23D5QjK6aL9iyqQNd9E/2jYYbW09iH8fyhoZm+WwA==
  • Cc: <andrew.cooper3@xxxxxxxxxx>, <anthony.perard@xxxxxxxxxx>, <jbeulich@xxxxxxxx>, <julien@xxxxxxx>, <michal.orzel@xxxxxxx>, <roger.pau@xxxxxxxxxx>, <sstabellini@xxxxxxxxxx>, <dmukhin@xxxxxxxx>, Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 17 Jul 2025 10:40:44 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi,

Sorry I'm so late to this. I have a few suggestions to improve the ergonomics
of domid handling in dom0less/Hyperlaunch.

On Mon Jun 23, 2025 at 8:28 PM CEST, dmkhn wrote:
> From: Denis Mukhin <dmukhin@xxxxxxxx>
>
> Currently, there are two different domain ID allocation implementations:
>
>   1) Sequential IDs allocation in dom0less Arm code based on max_init_domid;
>
>   2) Sequential IDs allocation in XEN_DOMCTL_createdomain; does not use
>      max_init_domid (both Arm and x86).
>
> The domain ID allocation covers dom0 or late hwdom, predefined domains,
> post-boot domains, excluding Xen system domains (domid >=
> DOMID_FIRST_RESERVED).
>
> It makes sense to have a common helper code for such task across architectures
> (Arm and x86) and between dom0less / toolstack domU allocation.
>
> Note, fixing dependency on max_init_domid is out of scope of this patch.
>
> Wrap the domain ID allocation as an arch-independent function domid_alloc() in
> common/domain.c based on the bitmap.
>
> Allocation algorithm:
> - If an explicit domain ID is provided, verify its availability and use it if
>   ID is not used;
> - If DOMID_INVALID is provided, search the range [1..DOMID_FIRST_RESERVED-1],
>   starting from the last used ID. IDs are not wrapped around in dom0less case.
>   Implementation guarantees that two consecutive calls will never return the
>   same ID. ID#0 is reserved for the first boot domain (currently, dom0) and
>   excluded from allocation range.
>
> Remove is_free_domid() helper as it is not needed now.
>
> No functional change intended.
>
> Signed-off-by: Denis Mukhin <dmukhin@xxxxxxxx>
> ---
> Changes from v9:
> - dropped unrelated message formatting from create_dom0()
> - no wraparound of IDs in dom0less case
> - fixed ID#0 treatment
>
> Link to v9: 
> https://lore.kernel.org/r/20250528225030.2652166-2-dmukhin@xxxxxxxx
> ---
>  xen/arch/arm/domain_build.c             |  7 ++-
>  xen/arch/x86/setup.c                    |  7 ++-
>  xen/common/device-tree/dom0less-build.c | 17 +++---
>  xen/common/domain.c                     | 75 +++++++++++++++++++++++++
>  xen/common/domctl.c                     | 42 ++------------
>  xen/include/xen/domain.h                |  3 +
>  6 files changed, 102 insertions(+), 49 deletions(-)
>
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 4ff161887ec3..9fa5143eb98c 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -2049,6 +2049,7 @@ void __init create_dom0(void)
>          .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
>      };
>      unsigned int flags = CDF_privileged | CDF_hardware;
> +    domid_t domid;
>      int rc;
>  
>      /* The vGIC for DOM0 is exactly emulating the hardware GIC */
> @@ -2073,7 +2074,11 @@ void __init create_dom0(void)
>      if ( !llc_coloring_enabled )
>          flags |= CDF_directmap;
>  
> -    dom0 = domain_create(0, &dom0_cfg, flags);
> +    domid = domid_alloc(0);

The way I´d expect domid_alloc() to be used is twofold:

  1. "Give me this specific domid"

for which this interface looks fine, perhaps renamed to domid_alloc_exact(domid)

  2. "Give me any domid"

for which we'd benefit more from a domid_alloc()

This removes the heuristics from the interface. Worst-case execution remains the
same, under 500 iterations. (32K minus a little bit, checked 64bits at a time).

> +    if ( domid == DOMID_INVALID )
> +        panic("Error allocating domain ID 0\n");
> +
> +    dom0 = domain_create(domid, &dom0_cfg, flags);
>      if ( IS_ERR(dom0) )
>          panic("Error creating domain 0 (rc = %ld)\n", PTR_ERR(dom0));
>  
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index f32efa7c6045..7adb92d78a18 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1044,8 +1044,11 @@ static struct domain *__init create_dom0(struct 
> boot_info *bi)
>      if ( iommu_enabled )
>          dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>  
> -    /* Create initial domain.  Not d0 for pvshim. */
> -    bd->domid = get_initial_domain_id();
> +    /* Allocate initial domain ID.  Not d0 for pvshim. */
> +    bd->domid = domid_alloc(get_initial_domain_id());
> +    if ( bd->domid == DOMID_INVALID )
> +        panic("Error allocating domain ID %d\n", get_initial_domain_id());
> +
>      d = domain_create(bd->domid, &dom0_cfg,
>                        pv_shim ? 0 : CDF_privileged | CDF_hardware);
>      if ( IS_ERR(d) )
> diff --git a/xen/common/device-tree/dom0less-build.c 
> b/xen/common/device-tree/dom0less-build.c
> index 3d503c697337..576fdfa6a19a 100644
> --- a/xen/common/device-tree/dom0less-build.c
> +++ b/xen/common/device-tree/dom0less-build.c
> @@ -839,15 +839,13 @@ void __init create_domUs(void)
>          struct xen_domctl_createdomain d_cfg = {0};
>          unsigned int flags = 0U;
>          bool has_dtb = false;
> +        domid_t domid;
>          uint32_t val;
>          int rc;
>  
>          if ( !dt_device_is_compatible(node, "xen,domain") )
>              continue;
>  
> -        if ( (max_init_domid + 1) >= DOMID_FIRST_RESERVED )
> -            panic("No more domain IDs available\n");
> -
>          d_cfg.max_evtchn_port = 1023;
>          d_cfg.max_grant_frames = -1;
>          d_cfg.max_maptrack_frames = -1;
> @@ -965,12 +963,13 @@ void __init create_domUs(void)
>  
>          arch_create_domUs(node, &d_cfg, flags);
>  
> -        /*
> -         * The variable max_init_domid is initialized with zero, so here it's
> -         * very important to use the pre-increment operator to call
> -         * domain_create() with a domid > 0. (domid == 0 is reserved for 
> Dom0)
> -         */
> -        d = domain_create(++max_init_domid, &d_cfg, flags);
> +        domid = domid_alloc(DOMID_INVALID);
> +        if ( domid == DOMID_INVALID )
> +            panic("Error allocating ID for domain %s\n", dt_node_name(node));
> +        if ( max_init_domid < domid )
> +            max_init_domid = domid;
> +
> +        d = domain_create(domid, &d_cfg, flags);
>          if ( IS_ERR(d) )
>              panic("Error creating domain %s (rc = %ld)\n",
>                    dt_node_name(node), PTR_ERR(d));
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 434d32901b1b..be022c720b13 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -66,6 +66,14 @@ DEFINE_RCU_READ_LOCK(domlist_read_lock);
>  static struct domain *domain_hash[DOMAIN_HASH_SIZE];
>  struct domain *domain_list;
>  
> +/*
> + * Domain ID allocator.
> + * Covers dom0 or late hwdom, predefined domains, post-boot domains; excludes
> + * Xen system domains (ID >= DOMID_FIRST_RESERVED).
> + */
> +static DEFINE_SPINLOCK(domid_lock);
> +static DECLARE_BITMAP(domid_bitmap, DOMID_FIRST_RESERVED);
> +
>  /*
>   * Insert a domain into the domlist/hash.  This allows the domain to be 
> looked
>   * up by domid, and therefore to be the subject of hypercalls/etc.
> @@ -1452,6 +1460,8 @@ void domain_destroy(struct domain *d)
>  
>      TRACE_TIME(TRC_DOM0_DOM_REM, d->domain_id);
>  
> +    domid_free(d->domain_id);
> +

Shouldn't this go after domlist_remove()? Otherwise fun things might happen
if a domid is allocated while the old domain that still keeps the old domid
is still in its hash.

The domctl lock (maybe) protects this case implicitly, but it's probably better
to destroy things in a reasonable order.

>      /* Remove from the domlist/hash. */
>      domlist_remove(d);
>  
> @@ -2433,6 +2443,71 @@ void thaw_domains(void)
>      rcu_read_unlock(&domlist_read_lock);
>  }
>  
> +domid_t domid_alloc(domid_t domid)
> +{
> +    static domid_t domid_last;
> +
> +    spin_lock(&domid_lock);
> +
> +    /* Exact match. */
> +    if ( domid < DOMID_FIRST_RESERVED )
> +    {
> +        if ( __test_and_set_bit(domid, domid_bitmap) )
> +            domid = DOMID_INVALID;
> +    }
> +    /*
> +     * Exhaustive search.
> +     *
> +     * Domain ID#0 is reserved for the first boot domain (e.g. control 
> domain)
> +     * and excluded from allocation.
> +     *
> +     * In dom0less build, domains are not dynamically destroyed, so there's 
> no
> +     * need to do a wraparound of the IDs.
> +     */
> +#ifdef CONFIG_DOM0LESS_BOOT

These ifdef guards are problematic. The fact that a platform supports dom0less
doesn't mean that every boot is dom0less (I can boot a non-dom0less system on
a dom0less-capable Xen).

Furthermore, the rationale for panicking on wraparound is because of exhaustion,
but you do have a proper bitmap here to do proper exhaustive search, so IMO,
this branch is not necessary.

> +    else if ( domid_last + 1 >= DOMID_FIRST_RESERVED )
> +    {
> +        domid = DOMID_INVALID;
> +    }
> +#endif
> +    else
> +    {
> +        domid = find_next_zero_bit(domid_bitmap,
> +                                   DOMID_FIRST_RESERVED,
> +                                   domid_last + 1);
> +#ifdef CONFIG_DOM0LESS_BOOT
> +        if ( domid == DOMID_FIRST_RESERVED )
> +            domid = find_next_zero_bit(domid_bitmap,
> +                                       DOMID_FIRST_RESERVED,
> +                                       1);

nit: I'd say 0 is fair game. On Hyperlaunch (and soon dom0less) it'll be 
possible
to have a domU with domid=0 and a hwdom/ctldom with domids != 0 via the domid
property on the DTB.

Starting from 1 might be slightly saner for defence in depth, so it really is
a nit. I don't think being cautious about dom0 is necessarily a bad thing.

> +#endif
> +
> +        if ( domid < DOMID_FIRST_RESERVED )
> +        {
> +            __set_bit(domid, domid_bitmap);
> +            domid_last = domid;

Rather than setting domid_last here, I'd move it right before the spin_unlock()
gated by "if ( domid != DOMID_INVALID )". That'd also bump domid_last in the
exact match case.

It also allows to drop the (then) redundant braces.

> +        }
> +        else
> +        {

nit: redundant braces

> +            domid = DOMID_INVALID;
> +        }
> +    }
> +
> +    spin_unlock(&domid_lock);
> +
> +    return domid;
> +}
> +
> +void domid_free(domid_t domid)
> +{
> +    if ( domid < DOMID_FIRST_RESERVED )
> +    {
> +        spin_lock(&domid_lock);
> +        __clear_bit(domid, domid_bitmap);
> +        spin_unlock(&domid_lock);
> +    }
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index bfe2e1f9f057..8ef0c147c9b0 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -49,20 +49,6 @@ static int xenctl_bitmap_to_nodemask(nodemask_t *nodemask,
>                                     MAX_NUMNODES);
>  }
>  
> -static inline int is_free_domid(domid_t dom)
> -{
> -    struct domain *d;
> -
> -    if ( dom >= DOMID_FIRST_RESERVED )
> -        return 0;
> -
> -    if ( (d = rcu_lock_domain_by_id(dom)) == NULL )
> -        return 1;
> -
> -    rcu_unlock_domain(d);
> -    return 0;
> -}

Good riddance. This is racy without the domctl lock.

> -
>  void getdomaininfo(struct domain *d, struct xen_domctl_getdomaininfo *info)
>  {
>      struct vcpu *v;
> @@ -421,36 +407,18 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) 
> u_domctl)
>  
>      case XEN_DOMCTL_createdomain:
>      {
> -        domid_t        dom;
> -        static domid_t rover = 0;
> +        domid_t domid = domid_alloc(op->domain);
>  
> -        dom = op->domain;
> -        if ( (dom > 0) && (dom < DOMID_FIRST_RESERVED) )
> +        if ( domid == DOMID_INVALID )
>          {
>              ret = -EEXIST;

nit: IMO. If createdomain didn't set domctl.domid, we shouldn't return EEXIST,
     but ENOSPC. It's a very impossible case, so I don't care much though.

> -            if ( !is_free_domid(dom) )
> -                break;
> -        }
> -        else
> -        {
> -            for ( dom = rover + 1; dom != rover; dom++ )
> -            {
> -                if ( dom == DOMID_FIRST_RESERVED )
> -                    dom = 1;
> -                if ( is_free_domid(dom) )
> -                    break;
> -            }
> -
> -            ret = -ENOMEM;
> -            if ( dom == rover )
> -                break;
> -
> -            rover = dom;
> +            break;
>          }
>  
> -        d = domain_create(dom, &op->u.createdomain, false);
> +        d = domain_create(domid, &op->u.createdomain, false);
>          if ( IS_ERR(d) )
>          {
> +            domid_free(domid);
>              ret = PTR_ERR(d);
>              d = NULL;
>              break;
> diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
> index e10baf2615fd..8aab05ae93c8 100644
> --- a/xen/include/xen/domain.h
> +++ b/xen/include/xen/domain.h
> @@ -38,6 +38,9 @@ void arch_get_domain_info(const struct domain *d,
>  
>  domid_t get_initial_domain_id(void);
>  
> +domid_t domid_alloc(domid_t domid);
> +void domid_free(domid_t domid);
> +
>  /* CDF_* constant. Internal flags for domain creation. */
>  /* Is this a privileged domain? */
>  #define CDF_privileged           (1U << 0)

Cheers,
Alejandro




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.