
Re: [Xen-devel] [PATCH v3 1/2] IOMMU/spinlock: Fix a bug found in AMD IOMMU initialization



Hi Quan and Dario,

On Fri, Mar 11, 2016 at 5:35 AM, Dario Faggioli
<dario.faggioli@xxxxxxxxxx> wrote:
> On Fri, 2016-03-11 at 06:54 +0000, Xu, Quan wrote:
>> On March 11, 2016 11:25am, <mengxu@xxxxxxxxxxxxx> wrote:
>> >
>> > On Wed, Mar 9, 2016 at 8:17 AM, Quan Xu <quan.xu@xxxxxxxxx> wrote:
>> > >
>> > > pcidevs_lock should be held with interrupt enabled. However there
>> > > remains an exception in AMD IOMMU code, where the lock is
>> > > acquired
>> > > with interrupt disabled. This inconsistency might lead to
>> > > deadlock.
>> > Why will this inconsistency lead to deadlock?
>> > I understand the difference between spin_lock_irqsave(), which
>> > disable interrupt,
>> > and spin_lock(), which allows interrupt, however, I didn't get why
>> > misuse the
>> > spin_lock_irqsave() at the place of spin_lock() could potentially
>> > lead to a
>> > deadlock?
>>
>>  1). As Jan mentioned, the implication from disabling interrupts while
>> acquiring a lock is that the lock is also being acquired by some
>> interrupt handler.
>>  If you mix acquire types, the one not disabling interrupts is prone
>> to being interrupted, with the interrupt then trying to get hold of
>> the lock the same CPU already owns.
>>
> The key issue is whether or not a lock can be acquired from IRQ context
> (i.e., in an interrupt handler, or a function called by that, as a
> result of an interrupt occurring).
>
> For lock that can, IRQ disabling variants must be used *everywhere* the
> lock is taken (so, e.g., not only when it is actually taken from IRQ
> context, just *always*!).
>
> Even if that rule is not honored, we are still fine if the lock is
> taken on CPUA and the interrupt is handled on CPUB:
>
>   CPUA              CPUB
>   .                 .
>   .                 .
>   spin_lock(l)      .
>   .                 .
>   .                 . <-- Interrupt!
>   .                        .
>   .                       spin_lock(l); //spins on l
>   .                       x             //spins on l
>   .                       x             //spins on l
>   spin_unlock(l)          .             //takes l
>   .                       .
>   .                       spin_unlock(l);
>   .                 . <-- .
>   .                 .
>
> but the following can happen, if the interrupt is delivered to the CPU
> that has already taken the lock:
>
>      CPU
>      .
>      .
> [1]  spin_lock(l);       //takes l
>      .
>      . <-- Interrupt!
>            .
>            spin_lock(l) // CPU deadlocks
>
> If, in the latter case, spin_lock_irq(l) were used at [1], things would
> have been fine, as the interrupt wouldn't have been delivered until l
> was released:
>
>   CPU
>   .
>   .
>   spin_lock_irq(l)                        //takes l
>   .
>   . <---------------- Interrupt!
>   .                   -                   //IRQs are disabled
>   .                   -                   //IRQs are disabled
>   .                   -                   //IRQs are disabled
>   spin_unlock_irq(l)  .                   //IRQs re-enabled
>                       spin_lock_irq(l);   //takes l
>                       .
>                       .
>                       spin_unlock_irq(l);
>  . <----------------- .
>  .
>
> Note that whether spin_lock_irq() or spin_lock_irqsave() should be
> used, is completely independent from this, and it must be decided
> according to whether IRQs are disabled already or not when taking the
> lock. And (quite obviously, but since we're here) it is fine to mix
> spin_lock_irq() and spin_lock_irqsave(), as they both disable
> interrupts, which is what matters.
>
> So, now, if we know for sure that a lock is _never_ever_ever_ taken
> from interrupt context, can we mix spin_lock() and spin_lock_irq() on
> it (for whatever reason)? Well, as far as the above reasoning is
> concerned, yes.
>
> In fact, the deadlock arises because IRQs interrupt asynchronously what
> the CPU is doing, and that can happen when the CPU has taken the lock
> already. But if the 'asynchronous' part goes away, we really don't care
> whether a lock is taken at time t1 with IRQs enabled, and at time t2 with
> IRQ disabled, don't you think?
>
> Well, here it is where what the comment inside check_lock() describes
> comes into play. As quoted by Quan already:
>
>>  2). Comment inside check_lock(),
>> we partition locks into IRQ-safe (always held with IRQs disabled) and
>> IRQ-unsafe (always held with IRQs enabled) types. The convention for
>> every lock must be consistently observed else we can deadlock in
>> IRQ-context rendezvous functions (__a rendezvous which gets every CPU
>> into IRQ context before any CPU is released from the rendezvous__).
>> If we can mix IRQ-disabled and IRQ-enabled callers, the following can
>> happen:
>>  * Lock is held by CPU A, with IRQs enabled
>>  * CPU B is spinning on same lock, with IRQs disabled
>>  * Rendezvous starts -- CPU A takes interrupt and enters rendezvous
>> spin
>>  * DEADLOCK -- CPU B will never enter rendezvous, CPU A will never
>> exit
>>                the rendezvous, and will hence never release the lock.
>>
>> To guard against this subtle bug we latch the IRQ safety of every
>> spinlock in the system, on first use.
>>
> This is a very good safety measure, but it can be quite a restriction
> in some (luckily limited) cases. And that's why it is possible --
> although absolutely discouraged-- to relax it. See, for an example of
> this, the comment in start_secondary(), in xen/arch/x86/smpboot.c:
>
>     ...
>     /*
>      * Just as during early bootstrap, it is convenient here to disable
>      * spinlock checking while we have IRQs disabled. This allows us to
>      * acquire IRQ-unsafe locks when it would otherwise be disallowed.
>      *
>      * It is safe because the race we are usually trying to avoid involves
>      * a group of CPUs rendezvousing in an IPI handler, where one cannot
>      * join because it is spinning with IRQs disabled waiting to acquire a
>      * lock held by another in the rendezvous group (the lock must be an
>      * IRQ-unsafe lock since the CPU took the IPI after acquiring it, and
>      * hence had IRQs enabled). This is a deadlock scenario.
>      *
>      * However, no CPU can be involved in rendezvous until it is online,
>      * hence no such group can be waiting for this CPU until it is
>      * visible in cpu_online_map. Hence such a deadlock is not possible.
>      */
>     spin_debug_disable();
>     ...
>
> Just FTR, at least as far as I can remember, Linux does not enforce
> anything like that.
>
> Hope this clarifies things.

Thank you very much for your explanation and education! They are
really helpful! :-D

Let me summarize: ;-)

                                      | spin_lock | spin_lock_irq | spin_lock_irqsave
  Can run in IRQ context?             | No        | Yes           | Yes
  Can run in IRQ-disabled context?    | No        | No            | Yes

Why may deadlock occur if we mix spin_lock() and spin_lock_irq(save)?
If we mix spin_lock() and spin_lock_irq(save) on the same lock, and a
group of CPUs rendezvous in an IPI handler, we can deadlock. CPU A,
which took the lock with spin_lock(), takes the IPI and waits inside
the rendezvous for the other CPUs, while CPU B, which is spinning on
the lock with spin_lock_irq(save), never enters the rendezvous, since
it has interrupts disabled. Then:
CPU A waits for CPU B to enter the rendezvous (a kind of
synchronization point), which cannot happen until B acquires the lock;
CPU B waits for CPU A to release the lock, which cannot happen until A
gets out of the rendezvous.

Are my understanding and summary correct?

Thanks and Best Regards,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

