
Re: [Xen-devel] [PATCH] arm/acpi: Fix the deadlock in function vgic_lock_rank()



On Fri, 27 May 2016, Julien Grall wrote:
> Hello Shanker,
> 
> On 27/05/16 01:39, Shanker Donthineni wrote:
> > Commit 9d77b3c01d1261c (Configure SPI interrupt type and route to
> > Dom0 dynamically) causes a deadlock inside the spinlock function.
> > Note that spinlocks in Xen are not recursive: re-acquiring a spinlock
> > already held by the calling CPU deadlocks. This happens whenever
> > dom0 writes to the GICD registers ISENABLER/ICENABLER.
> 
> Thank you for spotting it. I did not notice it while reviewing; I only
> tested on a model without any SPIs.
> 
> > The following call trace explains the problem.
> > 
> > DOM0 writes GICD_ISENABLER/GICD_ICENABLER
> >    vgic_v3_distr_common_mmio_write()
> >      vgic_lock_rank()  -->  acquiring first time
> >        vgic_enable_irqs()
> >          route_irq_to_guest()
> >            gic_route_irq_to_guest()
> >              vgic_get_target_vcpu()
> >                vgic_lock_rank()  -->  attempting to acquire the held lock
> > 
> > The simple fix is to release the spinlock before calling
> > vgic_enable_irqs() and vgic_disable_irqs().
> 
> You should explain why you think it is valid to release the lock earlier.
> 
> In this case, I think the fix is not correct because the lock is protecting
> both the register value and the internal state in Xen (modified by
> vgic_enable_irqs). By releasing the lock earlier, they may become inconsistent
> if another vCPU is disabling the IRQs at the same time.

I agree, the vgic_enable_irqs call needs to stay within the
vgic_lock_rank/vgic_unlock_rank region.


> I cannot find an easy fix which does not involve releasing the lock.
> When I was reviewing this patch, I suggested splitting the IRQ
> configuration from the routing.

Yes, the routing doesn't need to be done from vgic_enable_irqs; doing it
there is not nice. That would be the ideal fix, but it is not trivial.

For 4.7 we could consider reverting 9d77b3c01d1261c. The only other
simple thing I can come up with is to make gic_route_irq_to_guest cope
with callers that already hold the vgic rank lock (see below, untested),
but it's pretty ugly.



diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 2bfe4de..57f3f3f 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -127,15 +127,12 @@ void gic_route_irq_to_xen(struct irq_desc *desc, const cpumask_t *cpu_mask,
 
 /* Program the GIC to route an interrupt to a guest
  *   - desc.lock must be held
+ *   - rank lock must be held
  */
-int gic_route_irq_to_guest(struct domain *d, unsigned int virq,
+static int __gic_route_irq_to_guest(struct domain *d, unsigned int virq,
                            struct irq_desc *desc, unsigned int priority)
 {
-    unsigned long flags;
-    /* Use vcpu0 to retrieve the pending_irq struct. Given that we only
-     * route SPIs to guests, it doesn't make any difference. */
-    struct vcpu *v_target = vgic_get_target_vcpu(d->vcpu[0], virq);
-    struct vgic_irq_rank *rank = vgic_rank_irq(v_target, virq);
+    struct vcpu *v_target = __vgic_get_target_vcpu(d->vcpu[0], virq);
     struct pending_irq *p = irq_to_pending(v_target, virq);
     int res = -EBUSY;
 
@@ -144,12 +141,10 @@ int gic_route_irq_to_guest(struct domain *d, unsigned int virq,
     ASSERT(virq >= 32);
     ASSERT(virq < vgic_num_irqs(d));
 
-    vgic_lock_rank(v_target, rank, flags);
-
     if ( p->desc ||
          /* The VIRQ should not be already enabled by the guest */
          test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
-        goto out;
+        return res;
 
     desc->handler = gic_hw_ops->gic_guest_irq_type;
     set_bit(_IRQ_GUEST, &desc->status);
@@ -159,12 +154,36 @@ int gic_route_irq_to_guest(struct domain *d, unsigned int virq,
     p->desc = desc;
     res = 0;
 
-out:
-    vgic_unlock_rank(v_target, rank, flags);
-
     return res;
 }
 
+int gic_route_irq_to_guest(struct domain *d, unsigned int virq,
+                           struct irq_desc *desc, unsigned int priority)
+{
+    unsigned long flags;
+    int lock = 0, retval;
+    struct vgic_irq_rank *rank;
+
+    /* Use vcpu0 to retrieve the pending_irq struct. Given that we only
+     * route SPIs to guests, it doesn't make any difference. */
+    rank = vgic_rank_irq(d->vcpu[0], virq);
+
+    /* Take the rank spinlock unless it has already been taken by the
+     * caller. */
+    if ( !spin_is_locked(&rank->lock) ) {
+        vgic_lock_rank(d->vcpu[0], rank, flags);
+        lock = 1;
+    }
+
+    retval = __gic_route_irq_to_guest(d, virq, desc, priority);
+
+    if ( lock )
+        vgic_unlock_rank(d->vcpu[0], rank, flags);
+
+    return retval;
+
+}
+
 /* This function only works with SPIs for now */
 int gic_remove_irq_from_guest(struct domain *d, unsigned int virq,
                               struct irq_desc *desc)
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index aa420bb..e45669f 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -215,7 +215,7 @@ int vcpu_vgic_free(struct vcpu *v)
 }
 
 /* The function should be called by rank lock taken. */
-static struct vcpu *__vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
+struct vcpu *__vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
 {
     struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
 
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index a2fccc0..726e690 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -343,6 +343,7 @@ void vgic_v3_setup_hw(paddr_t dbase,
                       const struct rdist_region *regions,
                       uint32_t rdist_stride);
 #endif
+struct vcpu *__vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
 
 #endif /* __ASM_ARM_VGIC_H__ */
 
