
Re: [Xen-devel] [PATCH v6 04/11] qspinlock: Optimized code path for 2 contending tasks

To: Waiman Long <waiman.long@xxxxxx>
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Thu, 13 Mar 2014 14:57:01 +0100
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Raghavendra K T <raghavendra.kt@xxxxxxxxxxxxxxxxxx>, kvm@xxxxxxxxxxxxxxx, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx, Andi Kleen <andi@xxxxxxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, Michel Lespinasse <walken@xxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, linux-arch@xxxxxxxxxxxxxxx, Gleb Natapov <gleb@xxxxxxxxxx>, x86@xxxxxxxxxx, Ingo Molnar <mingo@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>, Arnd Bergmann <arnd@xxxxxxxx>, Scott J Norton <scott.norton@xxxxxx>, Rusty Russell <rusty@xxxxxxxxxxxxxxx>, Steven Rostedt <rostedt@xxxxxxxxxxx>, Chris Wright <chrisw@xxxxxxxxxxxx>, Oleg Nesterov <oleg@xxxxxxxxxx>, Alok Kataria <akataria@xxxxxxxxxx>, Aswin Chandramouleeswaran <aswin@xxxxxx>, Chegu Vinod <chegu_vinod@xxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, David Vrabel <david.vrabel@xxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Delivery-date: Thu, 13 Mar 2014 13:57:37 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Wed, Mar 12, 2014 at 03:08:24PM -0400, Waiman Long wrote:
> On 03/12/2014 02:54 PM, Waiman Long wrote:
> >+            /*
> >+             * Set the lock bit & clear the waiting bit simultaneously
> >+             * It is assumed that there is no lock stealing with this
> >+             * quick path active.
> >+             *
> >+             * A direct memory store of _QSPINLOCK_LOCKED into the
> >+             * lock_wait field causes problems with the lockref code, e.g.
> >+             *   ACCESS_ONCE(qlock->lock_wait) = _QSPINLOCK_LOCKED;
> >+             *
> >+             * It is not currently clear why this happens. A workaround
> >+             * is to use an atomic instruction to store the new value.
> >+             */
> >+            {
> >+                    u16 lw = xchg(&qlock->lock_wait, _QSPINLOCK_LOCKED);
> >+                    BUG_ON(lw != _QSPINLOCK_WAITING);
> >+            }

> It was found that when I used a direct memory store instead of an atomic op,
> the following kernel crash might happen at filesystem dismount time:
> 
> [ 1529.936714] Call Trace:
> [ 1529.936714]  [<ffffffff811c2d03>] d_walk+0xc3/0x260
> [ 1529.936714]  [<ffffffff811c1770>] ? check_and_collect+0x30/0x30
> [ 1529.936714]  [<ffffffff811c3985>] shrink_dcache_for_umount+0x75/0x120
> [ 1529.936714]  [<ffffffff811adf21>] generic_shutdown_super+0x21/0xf0
> [ 1529.936714]  [<ffffffff811ae207>] kill_block_super+0x27/0x70
> [ 1529.936714]  [<ffffffff811ae4ed>] deactivate_locked_super+0x3d/0x60
> [ 1529.936714]  [<ffffffff811aea96>] deactivate_super+0x46/0x60
> [ 1529.936714]  [<ffffffff811ca277>] mntput_no_expire+0xa7/0x140
> [ 1529.936714]  [<ffffffff811cb6ce>] SyS_umount+0x8e/0x100
> [ 1529.936714]  [<ffffffff815d2c29>] system_call_fastpath+0x16/0x1b

> It was more readily reproducible in a KVM guest. It was harder to reproduce
> on a bare-metal machine, but the kernel crash still happened after several
> tries.
> 
> I am not sure what exactly causes this crash, but it probably has something
> to do with the interaction between the lockref and the qspinlock code. I
> would like more eyes on that to find the root cause of it.

I cannot reproduce this with my series, which uses the one-word write.
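
For anyone following along, the two variants being compared can be sketched
in userspace C11 roughly as below. This is only an illustration, not the
kernel code: `struct qlock_sketch`, the two helper names, and the constant
values are hypothetical, a relaxed C11 atomic store stands in for the
kernel's `ACCESS_ONCE()` store, and `atomic_exchange()` stands in for
`xchg()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical values; the real constants are defined in the patch series. */
#define QSPINLOCK_LOCKED  0x0001u   /* lock is held */
#define QSPINLOCK_WAITING 0x0100u   /* exactly one task is waiting */

/* Hypothetical stand-in for the 16-bit lock_wait field of the qspinlock. */
struct qlock_sketch {
    _Atomic uint16_t lock_wait;
};

/*
 * Variant 1: plain one-word store. The old value is simply overwritten,
 * so nothing checks that the field still held QSPINLOCK_WAITING.
 */
static inline void take_lock_store(struct qlock_sketch *q)
{
    atomic_store_explicit(&q->lock_wait, QSPINLOCK_LOCKED,
                          memory_order_relaxed);
}

/*
 * Variant 2: atomic exchange, as in the quoted workaround. It is a full
 * barrier and returns the previous value, which the caller can verify.
 */
static inline uint16_t take_lock_xchg(struct qlock_sketch *q)
{
    return atomic_exchange(&q->lock_wait, QSPINLOCK_LOCKED);
}
```

In the quoted hunk the value returned by the exchange is what feeds the
BUG_ON() check; the plain store has no equivalent, so a stale or unexpected
old value would go unnoticed there.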

I turned my swap partition (who needs that anyway on a machine with
16G of memory) into an XFS partition.

Then I copied my linux.git onto it and unmounted.

I'll try a few more times; the above trace seems to suggest it happens
during dcache cleanup, so I suppose I should read the filesystem some
and unmount again.

Is there anything specific you did to make it go bang?
