
[Xen-devel] Re: [PATCH] fix pgd_lock deadlock

To: Andrea Arcangeli <aarcange@xxxxxxxxxx>
From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Date: Tue, 15 Feb 2011 21:26:35 +0100 (CET)
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, "Xen-devel@xxxxxxxxxxxxxxxxxxx" <Xen-devel@xxxxxxxxxxxxxxxxxxx>, Ian Campbell <Ian.Campbell@xxxxxxxxxx>, the arch/x86 maintainers <x86@xxxxxxxxxx>, Hugh Dickins <hughd@xxxxxxxxxx>, Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxxxx>, Andi Kleen <ak@xxxxxxx>, Johannes Weiner <jweiner@xxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Larry Woodman <lwoodman@xxxxxxxxxx>
Delivery-date: Tue, 15 Feb 2011 12:28:21 -0800
List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On Tue, 15 Feb 2011, Thomas Gleixner wrote:

> On Tue, 15 Feb 2011, Andrea Arcangeli wrote:
> > On Tue, Feb 15, 2011 at 08:26:51PM +0100, Thomas Gleixner wrote:
> > 
> > With NR_CPUs < 4, or with THP enabled, rmap.c will do
> > spin_lock(&mm->page_table_lock) (or pte_offset_map_lock where the lock
> > is still mm->page_table_lock and not the PT lock). Then it will send
> > IPIs to flush the tlb of the other CPUs.
> > 
> > But the other CPU is running vmalloc_sync_all, and it is trying to
> > take the page_table_lock with irqs disabled. It will never take the
> > lock because the CPU waiting for the IPI delivery holds it. And it
> > will never run the IPI because it has irqs disabled.
> 
> Ok, that makes sense :)
>  
> > Now the big question is if anything is taking the pgd_lock from
> > irqs. Normal testing could never reveal it as even if it happens it
> > has a slim chance to happen while the pgd_lock is already held by
> > normal kernel context. But the VM_BUG_ON(in_interrupt()) should
> > hopefully have revealed it already if it ever happened, I hope.
> > 
> > Clearly we could try to fix it in other ways, but still if there's no
> > reason to do the _irqsave it sounds like a good idea to apply my fix
> > anyway.
> 
> Did you try with DEBUG_PAGEALLOC, which is calling into cpa quite a
> lot?
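
The deadlock described in the quoted text can be illustrated with a small userspace model in C. This is only a sketch, not kernel code: struct cpu_model, scenario_completes() and every field name are hypothetical, and it assumes that the proposed fix amounts to the syncing CPU no longer disabling interrupts while it waits for the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU in the scenario; not kernel code. */
struct cpu_model {
    bool irqs_enabled;   /* can this CPU service an incoming IPI? */
    bool holds_lock;     /* owns the modelled mm->page_table_lock */
    bool wants_lock;     /* spinning on the modelled lock */
    bool waits_ipi_ack;  /* sent a TLB flush IPI, waiting for the peer */
    bool ipi_pending;    /* an IPI was sent to this CPU, not yet handled */
};

/*
 * Step the two-CPU model up to max_steps times.
 * Returns true if CPU B eventually gets the lock, false if no
 * progress is possible within max_steps (the modelled deadlock).
 */
bool scenario_completes(bool syncer_disables_irqs, int max_steps)
{
    /* CPU A: rmap path, holds the lock and has sent a TLB flush IPI. */
    struct cpu_model a = {
        .irqs_enabled = true,
        .holds_lock = true,
        .waits_ipi_ack = true,
    };
    /* CPU B: vmalloc_sync_all path, spinning on the same lock. */
    struct cpu_model b = {
        .irqs_enabled = !syncer_disables_irqs,
        .wants_lock = true,
        .ipi_pending = true,
    };

    for (int step = 0; step < max_steps; step++) {
        /* B can only run the IPI handler with interrupts enabled. */
        if (b.ipi_pending && b.irqs_enabled) {
            b.ipi_pending = false;
            a.waits_ipi_ack = false;
        }
        /* A drops the lock once its flush has been acknowledged. */
        if (a.holds_lock && !a.waits_ipi_ack)
            a.holds_lock = false;
        /* B takes the lock as soon as it is free. */
        if (b.wants_lock && !a.holds_lock) {
            b.wants_lock = false;
            b.holds_lock = true;
        }
        /* B finishes its sync and releases the lock. */
        if (b.holds_lock) {
            b.holds_lock = false;
            return true;
        }
    }
    return false;
}
```

Calling scenario_completes(true, 1000) models the reported case and returns false, because neither CPU can make progress; scenario_completes(false, 1000) returns true, since CPU B services the IPI while it spins.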

Another thing. You check for in_interrupt(), but what makes sure that
the code which takes pgd_lock is never called with interrupts disabled,
except during early boot?

Thanks,

        tglx

