
Re: [Xen-devel] [PATCH RFC v6 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support

To: Waiman Long <waiman.long@xxxxxx>
From: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Date: Fri, 14 Mar 2014 10:44:54 +0100
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Raghavendra K T <raghavendra.kt@xxxxxxxxxxxxxxxxxx>, Gleb Natapov <gleb@xxxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx, Andi Kleen <andi@xxxxxxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, Michel Lespinasse <walken@xxxxxxxxxx>, Alok Kataria <akataria@xxxxxxxxxx>, linux-arch@xxxxxxxxxxxxxxx, kvm@xxxxxxxxxxxxxxx, x86@xxxxxxxxxx, Ingo Molnar <mingo@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>, Arnd Bergmann <arnd@xxxxxxxx>, Scott J Norton <scott.norton@xxxxxx>, Steven Rostedt <rostedt@xxxxxxxxxxx>, Chris Wright <chrisw@xxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Oleg Nesterov <oleg@xxxxxxxxxx>, David Vrabel <david.vrabel@xxxxxxxxxx>
Delivery-date: Fri, 14 Mar 2014 09:45:25 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 13/03/2014 20:49, Waiman Long wrote:
> On 03/13/2014 09:57 AM, Paolo Bonzini wrote:
>> On 13/03/2014 12:21, David Vrabel wrote:
>>> On 12/03/14 18:54, Waiman Long wrote:
>>>> This patch adds para-virtualization support to the queue spinlock in
>>>> the same way as was done in the PV ticket lock code. In essence, a
>>>> lock waiter will spin a specified number of times (QSPIN_THRESHOLD
>>>> = 2^14) and then halt itself. The queue head waiter will spin
>>>> 2*QSPIN_THRESHOLD times before halting itself. When it has spun
>>>> QSPIN_THRESHOLD times, the queue head will assume that the lock
>>>> holder may be scheduled out and attempt to kick the lock holder CPU
>>>> if it has the CPU number on hand.
>>>
>>> I don't really understand the reasoning for kicking the lock holder.
>>
>> I agree.  If the lock holder isn't running, there's probably a good
>> reason for that and going to sleep will not necessarily convince the
>> scheduler to give more CPU to the lock holder.  I think there are two
>> choices:
>>
>> 1) use yield_to to donate part of the waiter's quantum to the lock
>> holder?  For this we probably need a new, separate hypercall
>> interface.  For KVM it would be the same as hlt in the guest but with
>> an additional yield_to in the host.
>>
>> 2) do nothing, just go to sleep.
>>
>> Could you get (or do you have) numbers for (2)?
> 
> I will take out the lock holder kick portion from the patch. I will also
> try to collect more test data.
> 
>>
>> More important, I think a barrier is missing:
>>
>>     Lock holder ---------------------------------------
>>
>>     // queue_spin_unlock
>>     barrier();
>>     ACCESS_ONCE(qlock->lock) = 0;
>>     barrier();
>>
> 
> This is not the unlock code that is used when PV spinlock is enabled.

It is __queue_spin_unlock.  But you're right:

>         if (static_key_false(&paravirt_spinlocks_enabled)) {
>                 /*
>                  * Need to atomically clear the lock byte to avoid racing with
>                  * queue head waiter trying to set _QSPINLOCK_LOCKED_SLOWPATH.
>                  */
>                 if (likely(cmpxchg(&qlock->lock, _QSPINLOCK_LOCKED, 0)
>                                 == _QSPINLOCK_LOCKED))
>                         return;
>                 else
>                         queue_spin_unlock_slowpath(lock);
> 
>         } else {
>                 __queue_spin_unlock(lock);
>         }

... indeed the __queue_spin_unlock/pv_kick_node pair is only done if the
waiter has already written _QSPINLOCK_LOCKED_SLOWPATH, and this means
that the lock holder must also observe PV_CPU_HALTED.

So this is correct:

>> Nothing protects from writing qlock->lock before pv->cpustate is read,

but this cannot happen:

>> leading to this:
>>
>>     Lock holder            Waiter
>>     ---------------------------------------------------------------
>>     read pv->cpustate
>>         (it is PV_CPU_ACTIVE)
>>                     pv->cpustate = PV_CPU_HALTED
>>                     lockval = cmpxchg(...)
>>                     hibernate()
>>     qlock->lock = 0
>>     if (pv->cpustate != PV_CPU_HALTED)
>>         return;
>>
> 
> The lock holder will read cpustate only if the lock byte has been
> changed to _QSPINLOCK_LOCKED_SLOWPATH. So the setting of the lock byte
> synchronizes the 2 threads.

Yes.
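
To make the ordering concrete, here is a minimal userspace sketch of the
handshake, using C11 atomics instead of the kernel primitives.  The names
(sketch_lock, waiter_prepare_halt, holder_unlock) and the numeric values
of the constants are assumptions for illustration only; they are not
taken from the patch.  The point it shows is that the holder reads
cpustate only after its cmpxchg on the lock byte has failed, and that can
only happen after the waiter has stored PV_CPU_HALTED and then flagged
the lock byte.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative values only; the patch's real constants may differ. */
enum { UNLOCKED = 0, LOCKED = 1, LOCKED_SLOWPATH = 3 };
enum { CPU_ACTIVE = 0, CPU_HALTED = 1 };

/* Hypothetical stand-in for the lock byte plus the waiter's PV state. */
struct sketch_lock {
    atomic_uchar lock;     /* models qlock->lock */
    atomic_int cpustate;   /* models pv->cpustate of the queue head */
    int kicks;             /* counts how often the holder would kick */
};

static void sketch_lock_init(struct sketch_lock *l)
{
    atomic_init(&l->lock, LOCKED);        /* start with the lock held */
    atomic_init(&l->cpustate, CPU_ACTIVE);
    l->kicks = 0;
}

/*
 * Waiter side: publish CPU_HALTED first, then try to flag the lock byte.
 * Returns true if the waiter should go on to halt.
 */
static bool waiter_prepare_halt(struct sketch_lock *l)
{
    unsigned char expected = LOCKED;

    atomic_store(&l->cpustate, CPU_HALTED);
    if (atomic_compare_exchange_strong(&l->lock, &expected, LOCKED_SLOWPATH))
        return true;

    /* The lock was released in the meantime; stay awake. */
    atomic_store(&l->cpustate, CPU_ACTIVE);
    return false;
}

/*
 * Holder side: the fast path never looks at cpustate.  The slow path is
 * entered only if the waiter's cmpxchg came first, so the waiter's
 * earlier store of CPU_HALTED is visible here.  Returns true on a kick.
 */
static bool holder_unlock(struct sketch_lock *l)
{
    unsigned char expected = LOCKED;

    if (atomic_compare_exchange_strong(&l->lock, &expected, UNLOCKED))
        return false;

    atomic_store(&l->lock, UNLOCKED);
    if (atomic_load(&l->cpustate) == CPU_HALTED) {
        l->kicks++;
        return true;
    }
    return false;
}
```

Whichever cmpxchg on the lock byte wins decides the outcome: if the
holder wins, the waiter sees the lock released and does not halt; if the
waiter wins, the holder takes the slow path and sees PV_CPU_HALTED.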

> The only thing that I am not certain about is when
> the waiter is trying to go to sleep while, at the same time, the lock
> holder is trying to kick it. Will there be a missed wakeup because of
> this timing issue?

This is okay.  The kick_cpu hypercall is sticky until the next halt, if
no halt is pending.  Otherwise, pv ticketlocks would have the same issue.
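
A small model of what "sticky" means here, as a sketch only: it assumes
the hypervisor keeps a single pending-kick flag per vCPU, and the names
sketch_vcpu, sketch_kick and sketch_halt_would_block are hypothetical.
It is not the KVM or Xen implementation.  A kick that arrives while the
target is still running is remembered, and the next halt consumes it and
returns at once instead of blocking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-vCPU state: one remembered kick at most. */
struct sketch_vcpu {
    atomic_bool kick_pending;
};

static void sketch_vcpu_init(struct sketch_vcpu *v)
{
    atomic_init(&v->kick_pending, false);
}

/* Kick: record the wakeup even if the target has not halted yet. */
static void sketch_kick(struct sketch_vcpu *v)
{
    atomic_store(&v->kick_pending, true);
}

/*
 * Halt: atomically consume a pending kick.  Returns true if the vCPU
 * would really block, false if a remembered kick wakes it immediately.
 */
static bool sketch_halt_would_block(struct sketch_vcpu *v)
{
    return !atomic_exchange(&v->kick_pending, false);
}
```

With this behaviour, a kick issued just before the waiter halts is not
lost: the halt returns immediately, and only a later halt with no kick
pending would actually sleep.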

Paolo

