[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] x86/hvm: Advertise and support extended destination IDs for MSI/IO-APIC


  • To: Julian Vetter <julian.vetter@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 25 Feb 2026 16:27:08 +0100
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 25 Feb 2026 15:27:25 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 23.02.2026 09:46, Julian Vetter wrote:
> x2APIC guests with more than 128 vCPUs have APIC IDs above 255, but MSI
> addresses and IO-APIC RTEs only provide an 8-bit destination field.
> Without extended destination ID support, Linux limits the maximum usable
> APIC ID to 255, refusing to bring up vCPUs beyond that limit. So,
> advertise XEN_HVM_CPUID_EXT_DEST_ID in the HVM hypervisor CPUID leaf,
> signalling that guests may use MSI address bits 11:5 and IO-APIC RTE
> bits 55:49 as additional high destination ID bits. This expands the
> destination ID from 8 to 15 bits.

Should MSI and IO-APIC really be covered by just a single bit?

> --- a/xen/arch/x86/hvm/irq.c
> +++ b/xen/arch/x86/hvm/irq.c
> @@ -374,7 +374,16 @@ int hvm_set_pci_link_route(struct domain *d, u8 link, u8 
> isa_irq)
>  int hvm_inject_msi(struct domain *d, uint64_t addr, uint32_t data)
>  {
>      uint32_t tmp = (uint32_t) addr;
> -    uint8_t  dest = (tmp & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
> +    /*
> +     * Standard MSI destination address bits 19:12 (8 bits).
> +     * Extended MSI destination address bits 11:5 (7 more bits).
> +     *
> +     * As XEN_HVM_CPUID_EXT_DEST_ID is advertised, the guest may use bits 
> 11:5
> +     * for high destination ID bits, expanding to 15 bits total. Guests 
> unaware
> +     * of this feature set these bits to 0, so this is backwards-compatible.

Is it? Any of the bits being non-zero would presently be ignored, afaics.
I think this needs to be two-way negotiation: Xen advertises the capability,
and the guest has to actively opt in. (Didn't Roger say something along these
lines already?) Also don't you need to restrict use of the wider IDs to
vCPU-s actually running in x2APIC mode? (And what is going to happen if a
vCPU switches back to legacy mode?)

> +     */
> +    uint32_t dest = (MASK_EXTR(tmp, MSI_ADDR_EXT_DEST_ID_MASK) << 8) |

The literal 8 here and again ...

> @@ -411,7 +413,8 @@ static void ioapic_inj_irq(
>  
>  static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
>  {
> -    uint16_t dest = vioapic->redirtbl[pin].fields.dest_id;
> +    uint32_t dest = ((uint32_t)vioapic->redirtbl[pin].fields.ext_dest_id << 
> 8) |

... here also would want dealing with, I guess. Use of literal numbers really
should be an exception.

> @@ -594,6 +597,49 @@ int vioapic_get_trigger_mode(const struct domain *d, 
> unsigned int gsi)
>      return vioapic->redirtbl[pin].fields.trig_mode;
>  }
>  
> +static int cf_check ioapic_check(const struct domain *d, 
> hvm_domain_context_t *h)
> +{
> +    const HVM_SAVE_TYPE(IOAPIC) *s = hvm_get_entry(IOAPIC, h);
> +    unsigned int i;
> +
> +    if ( !s )
> +        return -ENODATA;
> +
> +    for ( i = 0; i < ARRAY_SIZE(s->redirtbl); i++ )
> +    {
> +        struct cpuid_leaf res;
> +
> +        /* ext_dest_id bits not set in RTE so continue */
> +        if ( !s->redirtbl[i].fields.ext_dest_id )
> +            continue;
> +
> +        /*
> +         * An RTE in the saved state has ext_dest_id bits set, verify the
> +         * destination XEN advertises XEN_HVM_CPUID_EXT_DEST_ID to the guest.
> +         * If not interrupt routing to APIC IDs > 255 would be broken after
> +         * restore -> -EINVAL!
> +         */
> +        guest_cpuid(d->vcpu[0], 0x40000004, 0, &res);
> +        if ( !(res.a & XEN_HVM_CPUID_EXT_DEST_ID) )
> +        {
> +            printk(XENLOG_G_ERR "HVM restore: dom%d IO-APIC RTE %u has "
> +                                "extended destination ID bits set but "
> +                                "XEN_HVM_CPUID_EXT_DEST_ID is not 
> advertised\n",
> +                                d->domain_id, i);
> +            return -EINVAL;
> +        }
> +
> +        /*
> +         * Found an RTE with ext_dest bits set, but the destination XEN also
> +         * correctly announces XEN_HVM_CPUID_EXT_DEST_ID
> +         * -> All good, no need to check remaining RTEs.
> +         */
> +        break;
> +    }
> +
> +    return 0;
> +}

As indicated before, I think there wants to be a prereq patch introducing
ioapic_check() (preferably covering whatever other bits as well), with the
patch here then merely making adjustments to the function.

I also don't think you want to call guest_cpuid() here. You set the bit
unconditionally there, so no need to check here. (If it was conditionally
set there, the same condition would want using here. I think you will be
able to find examples of this for some of the other bits.)

Also in new code please no dom%d anymore. We have had %pd for quite some
time.

> --- a/xen/arch/x86/include/asm/msi.h
> +++ b/xen/arch/x86/include/asm/msi.h
> @@ -54,6 +54,8 @@
>  #define       MSI_ADDR_DEST_ID_MASK          0x00ff000
>  #define  MSI_ADDR_DEST_ID(dest)              (((dest) << 
> MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
>  
> +#define MSI_ADDR_EXT_DEST_ID_MASK    0x0000fe0

As this isn't part of any architectural spec, it wants a comment as to
its origin.

> --- a/xen/include/public/arch-x86/hvm/save.h
> +++ b/xen/include/public/arch-x86/hvm/save.h
> @@ -359,7 +359,9 @@ union vioapic_redir_entry
>          uint8_t trig_mode:1;
>          uint8_t mask:1;
>          uint8_t reserve:7;
> -        uint8_t reserved[4];
> +        uint8_t reserved[3];
> +        uint8_t ext_dest_rsvd:1;

Why ext_dest_ as a prefix? Does this field need naming at all?

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.