[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 19/27] xen/riscv: emulate guest writes to virtual APLIC MMIO


  • To: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 2 Apr 2026 16:18:52 +0200
  • Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=google header.d=suse.com header.i="@suse.com" header.h="Content-Transfer-Encoding:In-Reply-To:Autocrypt:From:Content-Language:References:Cc:To:Subject:User-Agent:MIME-Version:Date:Message-ID"
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Romain Caritey <Romain.Caritey@xxxxxxxxxxxxx>, Alistair Francis <alistair.francis@xxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 02 Apr 2026 14:19:07 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 10.03.2026 18:08, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/vaplic.c
> +++ b/xen/arch/riscv/vaplic.c
> @@ -20,6 +20,16 @@
>  
>  #include "aplic-priv.h"
>  
> +#define APLIC_REG_GET(addr, offset) \
> +    readl((void *)((vaddr_t)(addr) + offset))
> +#define APLIC_REG_SET(addr, offset, value) \
> +    writel(value, (void *)((vaddr_t)(addr) + offset))

Why is addr properly parenthesized, but offset isn't?

> +#define AUTH_IRQ_BIT(irqnum) (auth_irq_bmp[(irqnum) / APLIC_NUM_REGS] & \
> +                              BIT((irqnum) % APLIC_NUM_REGS, U))
> +
> +#define regval_to_irqn(reg_val) ((reg_val) / sizeof(uint32_t))

I'm trying to make sense of the division here, but I think the main issue
is with naming: It's not a "register value" which is passed into here, but
a register index (offset from a range's base register).

> @@ -127,6 +137,164 @@ int vaplic_map_device_irqs_to_domain(struct domain *d,
>      return 0;
>  }
>  
> +static void vaplic_dm_update_target(const unsigned long hart_id, uint32_t 
> *iprio)
> +{
> +    *iprio &= APLIC_TARGET_IPRIO_MASK;
> +    *iprio |= (hart_id << APLIC_TARGET_HART_IDX_SHIFT);
> +}
> +
> +static void vaplic_update_target(const struct imsic_config *imsic,
> +                                 const int guest_id,
> +                                 const unsigned long hart_id, uint32_t 
> *value)
> +{
> +    unsigned long group_index;
> +    unsigned int hhxw = imsic->group_index_bits;
> +    unsigned int lhxw = imsic->hart_index_bits;
> +    unsigned int hhxs = imsic->group_index_shift - IMSIC_MMIO_PAGE_SHIFT * 2;
> +    unsigned long base_ppn = imsic->msi[hart_id].base_addr >> 
> IMSIC_MMIO_PAGE_SHIFT;
> +
> +    group_index = (base_ppn >> (hhxs + 12)) & (BIT(hhxw, UL) - 1);

And there's no constant available to make this literal 12 more descriptive?

> +    *value &= APLIC_TARGET_EIID_MASK;
> +    *value |= guest_id << APLIC_TARGET_GUEST_IDX_SHIFT;
> +    *value |= hart_id << APLIC_TARGET_HART_IDX_SHIFT;
> +    *value |= group_index << (lhxw + APLIC_TARGET_HART_IDX_SHIFT) ;
> +}

Both functions returning void right now, why would they need to return their
result via indirection?

> +#define CALC_REG_VALUE(base) \
> +{ \
> +    uint32_t index; \
> +    uint32_t tmp_val; \

Combine these two, or have the variables have initializers?

> +    index = regval_to_irqn(offset - base); \

There's no "offset" declared or passed into here, nor ...

> +    tmp_val = APLIC_REG_GET(priv->regs, aplic_addr) & ~auth_irq_bmp[index]; \

... "priv", nor ...

> +    value &= auth_irq_bmp[index]; \
> +    value |= tmp_val; \

... "value". It may remain like this, but then it wants putting inside the
sole function that uses it, and be #undef-ed at the end of the function.

> +}

Please wrap in do/while(0), for use sites to be required to have semicolons
(and hence look like normal statements). Or make it a statement expression
properly returning the calculated value.

> +static int cf_check vaplic_emulate_store(const struct vcpu *vcpu,
> +                                         unsigned long addr, uint32_t value)
> +{
> +    struct vaplic *vaplic = to_vaplic(vcpu->domain->arch.vintc);
> +    struct aplic_priv *priv = vaplic->base.info->private;
> +    uint32_t offset = addr & APLIC_REG_OFFSET_MASK;

See ./CODING_STYLE as to uses of fixed-width types.

> +    unsigned long aplic_addr = addr - priv->paddr_start;
> +    const uint32_t *auth_irq_bmp = vcpu->domain->arch.vintc->private;
> +
> +    switch ( offset )
> +    {
> +    case APLIC_SETIP_BASE ... APLIC_SETIP_LAST:

And (taking this just as example) any misaligned accesses falling in this range
are fine?

> +        CALC_REG_VALUE(APLIC_SETIP_BASE);
> +        break;
> +
> +    case APLIC_CLRIP_BASE ... APLIC_CLRIP_LAST:
> +        CALC_REG_VALUE(APLIC_CLRIP_BASE);
> +        break;
> +
> +    case APLIC_SETIE_BASE ... APLIC_SETIE_LAST:
> +        CALC_REG_VALUE(APLIC_SETIE_BASE);
> +        break;
> +
> +    case APLIC_CLRIE_BASE ... APLIC_CLRIE_LAST:
> +        CALC_REG_VALUE(APLIC_CLRIE_BASE);
> +        break;
> +
> +    case APLIC_SOURCECFG_BASE ... APLIC_SOURCECFG_LAST:
> +        /* We don't suppert delagation, so bit10 if sourcecfg should be 0 */
> +        ASSERT(!(value & BIT(10, U)));

And that bit doesn't have a proper #define?

> +        /*
> +         * As sourcecfg register starts from 1:
> +         *   0x0000 domaincfg
> +         *   0x0004 sourcecfg[1]
> +         *   0x0008 sourcecfg[2]
> +         *    ...
> +         *   0x0FFC sourcecfg[1023]
> +         * It is necessary to calculate an interrupt number by substracting

Nit: subtracting

> +         * of APLIC_DOMAINCFG instead of APLIC_SOURCECFG_BASE.
> +         */
> +        if ( !AUTH_IRQ_BIT(regval_to_irqn(offset - APLIC_DOMAINCFG)) )
> +            /* interrupt not enabled, ignore it */

Throughout the series: Please adhere to ./CODING_STYLE.

> +            return 0;
> +
> +        break;

And any value is okay to write?

> +    case APLIC_TARGET_BASE ... APLIC_TARGET_LAST:
> +        struct vcpu *target_vcpu = NULL;
> +
> +        /*
> +         * Look at vaplic_emulate_load() for explanation why
> +         * APLIC_GENMSI is substracted.
> +         */

There's no vaplic_emulate_load() - how can I go look there?

Also same typo again as above.

> +        if ( !AUTH_IRQ_BIT(regval_to_irqn(offset - APLIC_GENMSI)) )
> +            /* interrupt not enabled, ignore it */
> +            return 0;
> +
> +        for ( int i = 0; i < vcpu->domain->max_vcpus; i++ )

unsigned int

> +        {
> +            struct vcpu *v = vcpu->domain->vcpu[i];
> +
> +            if ( v->vcpu_id == (value >> APLIC_TARGET_HART_IDX_SHIFT) )
> +            {
> +                target_vcpu = v;
> +                break;
> +            }
> +        }
> +
> +        ASSERT(target_vcpu);

What guarantees the pointer to be non-NULL? The incoming value can be
arbitrary, afaict.

> +        if ( !(vaplic->regs.domaincfg & APLIC_DOMAINCFG_DM) )
> +        {
> +            vaplic_dm_update_target(cpuid_to_hartid(target_vcpu->processor),
> +                                    &value);
> +        }
> +        else
> +            vaplic_update_target(priv->imsic_cfg,
> +                                 vcpu_guest_file_id(target_vcpu),
> +                                 cpuid_to_hartid(target_vcpu->processor),
> +                                 &value);

I'm struggling with the naming here: When DM is clear, a function with "dm"
in the name is called.

For the latter one, unless other uses are intended speaking against that,
instead of the middle two arguments simply pass target_vcpu?

Also please omit the braces consistently from both branches.

> +        break;
> +
> +    case APLIC_SETIPNUM:
> +    case APLIC_SETIPNUM_LE:

What about APLIC_SETIPNUM_BE?

> +    case APLIC_CLRIPNUM:
> +    case APLIC_SETIENUM:
> +    case APLIC_CLRIENUM:
> +        if ( AUTH_IRQ_BIT(value) )
> +            break;

Aren't you easily overrunning auth_irq_bmp[] here?

> +        return 0;
> +
> +    case APLIC_DOMAINCFG:
> +        /*
> +         * TODO:
> +         * The domaincfg register has this format:
> +         * bits 31:24 read-only 0x80
> +         * bit 8      IE
> +         * bit 7      read-only 0
> +         * bit 2      DM (WARL)
> +         * bit 0      BE (WARL)
> +         *
> +         * The most interesting bit for us is IE(Interrupt Enable) bit.
> +         * At the moment, at least, Linux doesn't use domaincfg.IE bit to
> +         * disable interrupts globally, but if one day someone will use it
> +         * then extra actions should be done.
> +         */
> +
> +        printk_once("%s: Nothing to do, domaincfg is set by aplic during "
> +                    "initialization in Xen\n", __func__);

As per the comment it's not "nothing to do", but your choice to ignore writes
even if they may be relevant.

> +        return 0;
> +
> +    default:
> +        panic("%s: unsupported register offset: %#x\n", __func__, offset);

Crashing the host for the guest doing something odd? It's odd that the function
only ever returns 0 anyway - it could simply return an error here (if the
itention is to not ignore such writes).

As it's not clear what values other than zero such a function may return, I
also can't comment on its (and the hook's) return type (may want to be bool
instead of int).

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.