[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] arm/acpi: Fix the deadlock in function vgic_lock_rank()
(CC Wei Liu) Hi Stefano, On 30/05/2016 14:16, Stefano Stabellini wrote: On Fri, 27 May 2016, Julien Grall wrote:Hello Shanker, On 27/05/16 01:39, Shanker Donthineni wrote:Commit 9d77b3c01d1261c (Configure SPI interrupt type and route to Dom0 dynamically) causing dead loop inside the spinlock function. Note that spinlocks in XEN are not recursive. Re-acquiring a spinlock that has already held by calling CPU leads to deadlock. This happens whenever dom0 does writes to GICD regs ISENABLER/ICENABLER.Thank you for spotting it, I have not noticed it while I was reviewing, only tested on a model without any SPIs.The following call trace explains the problem. DOM0 writes GICD_ISENABLER/GICD_ICENABLER vgic_v3_distr_common_mmio_write() vgic_lock_rank() --> acquiring first time vgic_enable_irqs() route_irq_to_guest() gic_route_irq_to_guest() vgic_get_target_vcpu() vgic_lock_rank() --> attemping acquired lock The simple fix release spinlock before calling vgic_enable_irqs() and vgic_disable_irqs().You should explain why you think it is valid to release the lock earlier. In this case, I think the fix is not correct because the lock is protecting both the register value and the internal state in Xen (modified by vgic_enable_irqs). By releasing the lock earlier, they may become inconsistent if another vCPU is disabling the IRQs at the same time.I agree, the vgic_enable_irqs call need to stay within the vgic_lock_rank/vgic_unlock_rank region.I cannot find an easy fix which does not involve release the lock. When I was reviewing this patch, I suggested to split the IRQ configuration from the routing.Yes, the routing doesn't need to be done from vgic_enable_irqs. It is not nice. That would be the ideal fix, but it is not trivial. For 4.7 we could consider reverting 9d77b3c01d1261c. The only other thing that I can come up with which is simple would be improving gic_route_irq_to_guest to cope with callers that have the vgic rank lock already held (see below, untested) but it's pretty ugly. We are close to release Xen 4.7, so I think we should avoid to touch the common interrupt code (i.e not only used by ACPI). ACPI can only be enabled in expert mode and will be a tech-preview for Xen 4.7. So I would revert the patch. SPIs will not be routed, but it is better than a deadlock. I would also replace the patch with a warning until the issue will be fixed in Xen 4.8. Any opinions? +int gic_route_irq_to_guest(struct domain *d, unsigned int virq, + struct irq_desc *desc, unsigned int priority) +{ + unsigned long flags; + int lock = 0, retval; + struct vgic_irq_rank *rank; + + /* Use vcpu0 to retrieve the pending_irq struct. Given that we only + * route SPIs to guests, it doesn't make any difference. */ + rank = vgic_rank_irq(d->vcpu[0], virq); + + /* Take the rank spinlock unless it has already been taken by the + * caller. */ + if ( !spin_is_locked(&rank->lock) ) { AFAICT, spin_is_locked only tell us that someone has locked the rank. So this would be unsafe. Regards, -- Julien Grall _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |