[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 13/30] ARM: GICv3: forward pending LPIs to guests



Hi Stefano,

thanks for spending your brain cells on this nasty code.

...

On 06/04/17 00:45, Stefano Stabellini wrote:
> On Thu, 6 Apr 2017, Andre Przywara wrote:
>> Upon receiving an LPI on the host, we need to find the right VCPU and
>> virtual IRQ number to get this IRQ injected.
>> Iterate our two-level LPI table to find this information quickly when
>> the host takes an LPI. Call the existing injection function to let the
>> GIC emulation deal with this interrupt.
>> Also we enhance struct pending_irq to cache the pending bit and the
>> priority information for LPIs, as we can't afford to walk the tables in
>> guest memory every time we handle an incoming LPI.
>> This introduces a do_LPI() as a hardware gic_ops and a function to
>> retrieve the (cached) priority value of an LPI and a vgic_ops.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>
>> ---
>>  xen/arch/arm/gic-v2.c            |  7 +++++
>>  xen/arch/arm/gic-v3-lpi.c        | 56 
>> ++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c            |  1 +
>>  xen/arch/arm/gic.c               |  8 +++++-
>>  xen/arch/arm/vgic-v2.c           |  7 +++++
>>  xen/arch/arm/vgic-v3.c           | 12 +++++++++
>>  xen/arch/arm/vgic.c              |  7 ++++-
>>  xen/include/asm-arm/gic.h        |  2 ++
>>  xen/include/asm-arm/gic_v3_its.h |  8 ++++++
>>  xen/include/asm-arm/vgic.h       |  3 +++
>>  10 files changed, 109 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
>> index 270a136..f4d7949 100644
>> --- a/xen/arch/arm/gic-v2.c
>> +++ b/xen/arch/arm/gic-v2.c
>> @@ -1217,6 +1217,12 @@ static int __init gicv2_init(void)
>>      return 0;
>>  }
>>  
>> +void gicv2_do_LPI(unsigned int lpi)
>> +{
>> +    /* No LPIs in a GICv2 */
>> +    BUG();
>> +}
>> +
>>  const static struct gic_hw_operations gicv2_ops = {
>>      .info                = &gicv2_info,
>>      .init                = gicv2_init,
>> @@ -1244,6 +1250,7 @@ const static struct gic_hw_operations gicv2_ops = {
>>      .make_hwdom_madt     = gicv2_make_hwdom_madt,
>>      .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
>>      .iomem_deny_access   = gicv2_iomem_deny_access,
>> +    .do_LPI              = gicv2_do_LPI,
>>  };
>>  
>>  /* Set up the GIC */
>> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
>> index 0785701..d8baebc 100644
>> --- a/xen/arch/arm/gic-v3-lpi.c
>> +++ b/xen/arch/arm/gic-v3-lpi.c
>> @@ -136,6 +136,62 @@ uint64_t gicv3_get_redist_address(unsigned int cpu, 
>> bool use_pta)
>>          return per_cpu(lpi_redist, cpu).redist_id << 16;
>>  }
>>  
>> +/*
>> + * Handle incoming LPIs, which are a bit special, because they are 
>> potentially
>> + * numerous and also only get injected into guests. Treat them specially 
>> here,
>> + * by just looking up their target vCPU and virtual LPI number and hand it
>> + * over to the injection function.
>> + * Please note that LPIs are edge-triggered only, also have no active state,
>> + * so spurious interrupts on the host side are no issue (we can just ignore
>> + * them).
>> + * Also a guest cannot expect that firing interrupts that haven't been
>> + * fully configured yet will reach the CPU, so we don't need to care about
>> + * this special case.
>> + */
>> +void gicv3_do_LPI(unsigned int lpi)
>> +{
>> +    struct domain *d;
>> +    union host_lpi *hlpip, hlpi;
>> +    struct vcpu *vcpu;
>> +
>> +    /* EOI the LPI already. */
>> +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
>> +
>> +    /* Find out if a guest mapped something to this physical LPI. */
>> +    hlpip = gic_get_host_lpi(lpi);
>> +    if ( !hlpip )
>> +        return;
>> +
>> +    hlpi.data = read_u64_atomic(&hlpip->data);
>> +
>> +    /*
>> +     * Unmapped events are marked with an invalid LPI ID. We can safely
>> +     * ignore them, as they have no further state and no-one can expect
>> +     * to see them if they have not been mapped.
>> +     */
>> +    if ( hlpi.virt_lpi == INVALID_LPI )
>> +        return;
>> +
>> +    d = rcu_lock_domain_by_id(hlpi.dom_id);
>> +    if ( !d )
>> +        return;
>> +
>> +    /* Make sure we don't step beyond the vcpu array. */
>> +    if ( hlpi.vcpu_id >= d->max_vcpus )
>> +    {
>> +        rcu_unlock_domain(d);
>> +        return;
>> +    }
>> +
>> +    vcpu = d->vcpu[hlpi.vcpu_id];
>> +
>> +    /* Check if the VCPU is ready to receive LPIs. */
>> +    if ( vcpu->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
>> +        vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
>> +
>> +    rcu_unlock_domain(d);
>> +}
>> +
>>  static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
>>  {
>>      uint64_t val;
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index a559e5e..63dbc21 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -1670,6 +1670,7 @@ static const struct gic_hw_operations gicv3_ops = {
>>      .make_hwdom_dt_node  = gicv3_make_hwdom_dt_node,
>>      .make_hwdom_madt     = gicv3_make_hwdom_madt,
>>      .iomem_deny_access   = gicv3_iomem_deny_access,
>> +    .do_LPI              = gicv3_do_LPI,
>>  };
>>  
>>  static int __init gicv3_dt_preinit(struct dt_device_node *node, const void 
>> *data)
>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> index 9522c6c..a56be34 100644
>> --- a/xen/arch/arm/gic.c
>> +++ b/xen/arch/arm/gic.c
>> @@ -697,7 +697,13 @@ void gic_interrupt(struct cpu_user_regs *regs, int 
>> is_fiq)
>>              do_IRQ(regs, irq, is_fiq);
>>              local_irq_disable();
>>          }
>> -        else if (unlikely(irq < 16))
>> +        else if ( is_lpi(irq) )
>> +        {
>> +            local_irq_enable();
>> +            gic_hw_ops->do_LPI(irq);
>> +            local_irq_disable();
>> +        }
>> +        else if ( unlikely(irq < 16) )
>>          {
>>              do_sgi(regs, irq);
>>          }
>> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
>> index 3cad68f..5f4c5ad 100644
>> --- a/xen/arch/arm/vgic-v2.c
>> +++ b/xen/arch/arm/vgic-v2.c
>> @@ -710,11 +710,18 @@ static struct pending_irq 
>> *vgic_v2_lpi_to_pending(struct domain *d,
>>      BUG();
>>  }
>>  
>> +static int vgic_v2_lpi_get_priority(struct domain *d, unsigned int vlpi)
>> +{
>> +    /* Dummy function, no LPIs on a VGICv2. */
>> +    BUG();
>> +}
>> +
>>  static const struct vgic_ops vgic_v2_ops = {
>>      .vcpu_init   = vgic_v2_vcpu_init,
>>      .domain_init = vgic_v2_domain_init,
>>      .domain_free = vgic_v2_domain_free,
>>      .lpi_to_pending = vgic_v2_lpi_to_pending,
>> +    .lpi_get_priority = vgic_v2_lpi_get_priority,
>>      .max_vcpus = 8,
>>  };
>>  
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index 5128f13..2a14305 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -1548,12 +1548,24 @@ struct pending_irq *vgic_v3_lpi_to_pending(struct 
>> domain *d, unsigned int lpi)
>>      return pirq;
>>  }
>>  
>> +/* Retrieve the priority of an LPI from its struct pending_irq. */
>> +int vgic_v3_lpi_get_priority(struct domain *d, uint32_t vlpi)
>> +{
>> +    struct pending_irq *p = vgic_v3_lpi_to_pending(d, vlpi);
>> +
>> +    if ( !p )
>> +        return GIC_PRI_IRQ;
>> +
>> +    return p->lpi_priority;
>> +}
>> +
>>  static const struct vgic_ops v3_ops = {
>>      .vcpu_init   = vgic_v3_vcpu_init,
>>      .domain_init = vgic_v3_domain_init,
>>      .domain_free = vgic_v3_domain_free,
>>      .emulate_reg  = vgic_v3_emulate_reg,
>>      .lpi_to_pending = vgic_v3_lpi_to_pending,
>> +    .lpi_get_priority = vgic_v3_lpi_get_priority,
>>      /*
>>       * We use both AFF1 and AFF0 in (v)MPIDR. Thus, the max number of CPU
>>       * that can be supported is up to 4096(==256*16) in theory.
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index d704d7c..cd9a2a5 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -226,10 +226,15 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, 
>> unsigned int virq)
>>  
>>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>>  {
>> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
>> +    struct vgic_irq_rank *rank;
>>      unsigned long flags;
>>      int priority;
>>  
>> +    /* LPIs don't have a rank, also store their priority separately. */
>> +    if ( is_lpi(virq) )
>> +        return v->domain->arch.vgic.handler->lpi_get_priority(v->domain, 
>> virq);
>> +
>> +    rank = vgic_rank_irq(v, virq);
>>      vgic_lock_rank(v, rank, flags);
>>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>>      vgic_unlock_rank(v, rank, flags);
>> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
>> index 836a103..42963c0 100644
>> --- a/xen/include/asm-arm/gic.h
>> +++ b/xen/include/asm-arm/gic.h
>> @@ -366,6 +366,8 @@ struct gic_hw_operations {
>>      int (*map_hwdom_extra_mappings)(struct domain *d);
>>      /* Deny access to GIC regions */
>>      int (*iomem_deny_access)(const struct domain *d);
>> +    /* Handle LPIs, which require special handling */
>> +    void (*do_LPI)(unsigned int lpi);
>>  };
>>  
>>  void register_gic_ops(const struct gic_hw_operations *ops);
>> diff --git a/xen/include/asm-arm/gic_v3_its.h 
>> b/xen/include/asm-arm/gic_v3_its.h
>> index 3b7c724..eb71fbd 100644
>> --- a/xen/include/asm-arm/gic_v3_its.h
>> +++ b/xen/include/asm-arm/gic_v3_its.h
>> @@ -139,6 +139,8 @@ void gicv3_its_dt_init(const struct dt_device_node 
>> *node);
>>  
>>  bool gicv3_its_host_has_its(void);
>>  
>> +void gicv3_do_LPI(unsigned int lpi);
>> +
>>  int gicv3_lpi_init_rdist(void __iomem * rdist_base);
>>  
>>  /* Initialize the host structures for LPIs and the host ITSes. */
>> @@ -182,6 +184,12 @@ static inline bool gicv3_its_host_has_its(void)
>>      return false;
>>  }
>>  
>> +static inline void gicv3_do_LPI(unsigned int lpi)
>> +{
>> +    /* We don't enable LPIs without an ITS. */
>> +    BUG();
>> +}
>> +
>>  static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base)
>>  {
>>      return -ENODEV;
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index 7c86f5b..08d6294 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -66,12 +66,14 @@ struct pending_irq
>>  #define GIC_IRQ_GUEST_VISIBLE  2
>>  #define GIC_IRQ_GUEST_ENABLED  3
>>  #define GIC_IRQ_GUEST_MIGRATING   4
>> +#define GIC_IRQ_GUEST_LPI_PENDING 5     /* Caches the pending bit of an 
>> LPI. */
> 
> I don't think we need GIC_IRQ_GUEST_LPI_PENDING, you can reuse
> GIC_IRQ_GUEST_QUEUED and list_empty(&n->inflight): if you call
> vgic_vcpu_inject_irq passing an irq as argument that is not enabled, it
> will get GIC_IRQ_GUEST_QUEUED and added to inflight, but not injected.

Mmh, that's an interesting suggestion.
The neat thing about piggy-backing on those status bits is that we can
use the atomic set_bit/clear_bit to set and clear the pending state,
independently from any code accessing the structure, even without taking
any lock. Especially since the existing VGIC code does not care about
this bit.
However looking at all users of GUEST_LPI_PENDING, I see that we either
already hold the VGIC lock and we are able to take it, so we might be
able to use the combination you suggested.

I tried to make some sense of what "inflight" *really* means, can you
shed some light on this?
My understanding is that it's about IRQs that should be injected, but
are not ready (because they are disabled). Are there other cases IRQs
are in the "inflight" list? Is that also the spillover in case all LRs
are used (in this case they would GIC_IRQ_GUEST_ENABLED)?

>>      unsigned long status;
>>      struct irq_desc *desc; /* only set it the irq corresponds to a physical 
>> irq */
>>      unsigned int irq;
>>  #define GIC_INVALID_LR         (uint8_t)~0
>>      uint8_t lr;
>>      uint8_t priority;
>> +    uint8_t lpi_priority;       /* Caches the priority if this is an LPI. */
> 
> The commit message says: "we enhance struct pending_irq to cache the
> pending bit and the priority information for LPIs, as we can't afford to
> walk the tables in guest memory every time we handle an incoming LPI." I
> thought it would be direct access, having the vlpi number in our hands?
> Why is it a problem?
> 
> If there has been a conversation about this that I am missing, please
> provide a link, I'll go back and read it.

Well, the property table is in guest memory as the other ITS tables and
we now access this in a new way (vgic_access_guest_memory()), which is
quite costly: we need to do the p2m lookup, map the page, access the
data, unmap the page and put the page back. Everything on itself is not
really too demanding (on arm64), but doing this in the interrupt
handling path to learn the priority value sounds a bit over the top.
For the *ITS* emulation (command handling) we can cope with this
overhead (because this is only needing during the virtual command
handling), but I wanted to avoid this in the generic GIC code, which is
also a hot path, as I understand.
Since the GICv3 architecture explicitly allows caching this information,
I wanted to use this opportunity.

Does that make sense?

Cheers,
Andre.

> 
>>      /* inflight is used to append instances of pending_irq to
>>       * vgic.inflight_irqs */
>>      struct list_head inflight;
>> @@ -136,6 +138,7 @@ struct vgic_ops {
>>      bool (*emulate_reg)(struct cpu_user_regs *regs, union hsr hsr);
>>      /* lookup the struct pending_irq for a given LPI interrupt */
>>      struct pending_irq *(*lpi_to_pending)(struct domain *d, unsigned int 
>> vlpi);
>> +    int (*lpi_get_priority)(struct domain *d, uint32_t vlpi);
>>      /* Maximum number of vCPU supported */
>>      const unsigned int max_vcpus;
>>  };
>> -- 
>> 2.8.2
>>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.