
Re: [Xen-devel] Strange PVM spinlock case revisited (nailed down)



On 14.02.2013 12:43, Jan Beulich wrote:
>>>> On 14.02.13 at 12:04, Stefan Bader <stefan.bader@xxxxxxxxxxxxx> wrote:
>> There was a bit more on the stack but try_to_wake_up seemed a believable
>> current path. Even more so after looking into the function:
>>
>> #ifdef CONFIG_SMP
>>         /*
>>          * If the owning (remote) cpu is still in the middle of schedule()
>>          * with this task as prev, wait until its done referencing the task.
>>          */
>>         while (p->on_cpu) {
>> #ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
>>                 /*
>>                  * In case the architecture enables interrupts in
>>                  * context_switch(), we cannot busy wait, since that
>>                  * would lead to deadlocks when an interrupt hits and
>>                  * tries to wake up @prev. So bail and do a complete
>>                  * remote wakeup.
>>                  */
>>                 if (ttwu_activate_remote(p, wake_flags))
>>                         goto stat;
>> #else
>>                 cpu_relax();
>> #endif
>>
>> And prying out the task in question from the stack, it was one currently
>> being accounted for VCPU 6 and on_cpu. VCPU 6 is one of the other waiters
>> for the runq lock of VCPU 1. Which would get picked up as soon as this
>> callback is done. Unfortunately we "stay awhile... stay forever!".
> 
> When analyzing a similar problem with our old PV ticket lock
> implementation (and the interrupt re-enabling there when possible
> before going into polling mode), it was precisely this open coded
> lock construct that caused problems for us. Back then I didn't,
> however, realize that this would even affect the simplistic byte
> locks used by the pv-ops Xen code (or else I would have pointed
> this out).
> 
> Being relatively certain that with our new implementation we don't
> have any such problems, I can only recommend against dropping
> the re-enabling of interrupts - this needlessly introduces high
> interrupt latencies in a broader range of cases. Instead, the
> interactions ought to be fixed properly. And it's time for using
> ticket locks in the Xen code too...
> 
Not sure what *your new* implementation is. ;) I am more concerned about the
currently released/stable kernels which potentially have this problem. I agree
that a proper solution is preferable. Ticket locks likely have a much lower
chance of hitting this, since part of the current problem is that lower-numbered
VCPUs are preferred in unlock_slow.
In the end, though, we would need something that could go upstream (and from
there back into stable), so it should not be too complicated. A proper solution
can then follow on top of that.
Having more understanding now, I had a different idea. I am not sure it is
foolproof, and it surely causes some more active spinning, but it looks like I
can also keep the system from locking up (v3.2 kernel and the testcase) if I
change xen_spin_unlock_slow() to send the wakeup IPI to _all_ spinners instead
of only the first one found. Roughly the change sketched below.
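
A minimal sketch of what I mean, against arch/x86/xen/spinlock.c as it looks in
v3.2 (lock_spinners, ADD_STATS and xen_send_IPI_one are the helpers that file
already provides); the only change from the current code is dropping the break
in the spinner scan, so every CPU registered as spinning on this lock gets
kicked rather than just the lowest-numbered one found:

static noinline void xen_spin_unlock_slow(struct xen_spinlock *xl)
{
        int cpu;

        ADD_STATS(released_slow, 1);

        for_each_online_cpu(cpu) {
                /* XXX should mix up next cpu selection */
                if (per_cpu(lock_spinners, cpu) == xl) {
                        ADD_STATS(released_slow_kicked, 1);
                        xen_send_IPI_one(cpu, XEN_SPIN_UNLOCK_VECTOR);
                        /* no break: kick every spinner, not only the
                         * first (lowest-numbered) CPU we find */
                }
        }
}

The price is spurious wakeups (all but one of the kicked VCPUs will just go
back to polling), but none of them can be left waiting behind a VCPU that is
stuck in the on_cpu busy-wait shown above.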

-Stefan
