[Xen-devel] Re: [PATCH] fix pgd_lock deadlock
On Tue, Feb 15, 2011 at 08:26:51PM +0100, Thomas Gleixner wrote:
> On Tue, 15 Feb 2011, Andrea Arcangeli wrote:
>
> > Hello,
> >
> > Without this patch we can deadlock in the page_table_lock with NR_CPUS
> > < 4 or THP on, with this patch we hopefully won't deadlock in the
> > pgd_lock (if taken from irq). I can't see anything taking it from irq
> > (maybe aio? to check I also tried the libaio testsuite with no apparent
> > VM_BUG_ON triggering), so unless somebody sees it, I think we should
> > apply it. I've been running for a while with this patch applied
> > without apparent problems. Other archs may follow suit if it's proven
> > that there's nothing taking the pgd_lock from irq.
> >
> > ===
> > Subject: fix pgd_lock deadlock
> >
> > From: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> >
> > It's forbidden to take the page_table_lock with the irq disabled or if
> > there's contention the IPIs (for tlb flushes) sent with the
> > page_table_lock held will never run leading to a deadlock.
>
> I really read this thing 5 times and still cannot make any sense of it.
>
> You talk about page_table_lock and then fiddle with pgd_lock.
>
> -ENOSENSE

With NR_CPUS < 4, or with THP enabled, rmap.c will do
spin_lock(&mm->page_table_lock) (or pte_offset_map_lock, where the lock
is still mm->page_table_lock and not the PT lock). Then it will send
IPIs to flush the tlb of the other CPUs.

But the other CPU is running vmalloc_sync_all, and it is trying to take
the page_table_lock with irqs disabled. It will never take the lock,
because the CPU waiting for the IPI delivery holds it. And it will never
run the IPI, because it has irqs disabled.

Now the big question is whether anything is taking the pgd_lock from
irqs. Normal testing could never reveal it, because even if it happens,
it has only a slim chance of happening while the pgd_lock is already
held by normal kernel context. But the VM_BUG_ON(in_interrupt()) should
hopefully have revealed it already if it ever happened, I hope.
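
To make the cycle above concrete, here is a toy userspace model of it. It
is only a sketch and not kernel code: the function name, the fields and
the step limit are all made up for illustration, and "CPU0" stands in for
the rmap.c path while "CPU1" stands in for the vmalloc_sync_all path.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model (hypothetical, NOT kernel code) of the cycle described above.
 *
 * CPU0: holds the lock (standing in for mm->page_table_lock) and has sent
 *       a TLB-flush IPI to CPU1; it only drops the lock once CPU1 has
 *       handled that IPI.
 * CPU1: spins on the same lock, with irqs either disabled or enabled.
 *
 * Returns true if CPU1 eventually gets the lock, false if nothing can
 * make progress within the (arbitrary) step limit.
 */
static bool model_makes_progress(bool cpu1_irqs_disabled)
{
    bool lock_held_by_cpu0 = true;  /* CPU0 took the lock before the IPI */
    bool ipi_pending_on_cpu1 = true; /* CPU0 waits for this to be handled */
    bool cpu1_has_lock = false;

    for (int step = 0; step < 1000; step++) {
        /* CPU1 can run the IPI handler only with irqs enabled. */
        if (ipi_pending_on_cpu1 && !cpu1_irqs_disabled)
            ipi_pending_on_cpu1 = false;

        /* CPU0 finishes the flush and unlocks once the IPI has run. */
        if (lock_held_by_cpu0 && !ipi_pending_on_cpu1)
            lock_held_by_cpu0 = false;

        /* CPU1's spin loop succeeds only when the lock is free. */
        if (!lock_held_by_cpu0)
            cpu1_has_lock = true;

        /* Mutual exclusion must hold at every step of the model. */
        assert(!(cpu1_has_lock && lock_held_by_cpu0));

        if (cpu1_has_lock)
            return true;
    }
    return false; /* CPU0 waits for the IPI, CPU1 waits for the lock */
}
```

In this model, model_makes_progress(true) never gets past the step limit,
which is the deadlock; model_makes_progress(false) lets the IPI run first
and then completes. It says nothing about whether pgd_lock is ever taken
from irq context, which is the open question above.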

Clearly we could try to fix it in other ways, but if there's no reason
to do the _irqsave, it still sounds like a good idea to apply my fix
anyway.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel