
Re: [Xen-devel] Strange PVM spinlock case revisited (nailed down)



On 14.02.2013 12:43, Jan Beulich wrote:
>>>> On 14.02.13 at 12:04, Stefan Bader <stefan.bader@xxxxxxxxxxxxx> wrote:
>> There was a bit more on the stack but try_to_wake_up seemed a believable
>> current path. Even more so after looking into the function:
>>
>> #ifdef CONFIG_SMP
>>         /*
>>          * If the owning (remote) cpu is still in the middle of schedule()
>>          * with this task as prev, wait until its done referencing the task.
>>          */
>>         while (p->on_cpu) {
>> #ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
>>                 /*
>>                  * In case the architecture enables interrupts in
>>                  * context_switch(), we cannot busy wait, since that
>>                  * would lead to deadlocks when an interrupt hits and
>>                  * tries to wake up @prev. So bail and do a complete
>>                  * remote wakeup.
>>                  */
>>                 if (ttwu_activate_remote(p, wake_flags))
>>                         goto stat;
>> #else
>>                 cpu_relax();
>> #endif
>>
>> And prying out the task in question from the stack, it was one currently
>> being accounted for VCPU 6 and on_cpu. VCPU 6 is one of the other waiters
>> for the runq lock of VCPU 1. Which would get picked up as soon as this
>> callback is done. Unfortunately we "stay awhile... stay forever!".
> 
> When analyzing a similar problem with our old PV ticket lock
> implementation (and the interrupt re-enabling there when possible
> before going into polling mode), it was precisely this open coded
> lock construct that caused problems for us. Back then I didn't,
> however, realize that this would even affect the simplistic byte
> locks used by the pv-ops Xen code (or else I would have pointed
> this out).
> 
> Being relatively certain that with our new implementation we don't
> have any such problems, I can only recommend against dropping
> the re-enabling of interrupts - this needlessly introduces high
> interrupt latencies in a broader range of cases. Instead, the
> interactions ought to be fixed properly. And it's time for using
> ticket locks in the Xen code too...
> 
Not sure what *your new* implementation is. ;) I am more concerned about the
currently released/stable kernels which potentially have this problem. I agree
that a proper solution is preferable. Ticket locks likely have a much lower
chance of hitting this, since part of the current problem is that lower-numbered
VCPUs are preferred in unlock_slow.
In the end, though, we would need something that could go upstream (and from
there back into stable), so it should not be too complicated. A proper solution
can then follow on top of that.
Having more understanding now, I had a different idea. I am not sure it is
foolproof, and it surely causes some more active spinning, but it looks like I
can also keep the system from locking up (v3.2 kernel and the testcase) if I
change xen_spin_unlock_slow() to send the wakeup IPI to _all_ spinners instead
of only the first one found. Roughly the change sketched below.
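
A minimal sketch of what I mean, against arch/x86/xen/spinlock.c as it looks in
v3.2 (lock_spinners, ADD_STATS and xen_send_IPI_one are the helpers that file
already provides); the only change from the current code is dropping the break
in the spinner scan, so every CPU registered as spinning on this lock gets
kicked rather than just the lowest-numbered one found:

static noinline void xen_spin_unlock_slow(struct xen_spinlock *xl)
{
        int cpu;

        ADD_STATS(released_slow, 1);

        for_each_online_cpu(cpu) {
                /* XXX should mix up next cpu selection */
                if (per_cpu(lock_spinners, cpu) == xl) {
                        ADD_STATS(released_slow_kicked, 1);
                        xen_send_IPI_one(cpu, XEN_SPIN_UNLOCK_VECTOR);
                        /* no break: kick every spinner, not only the
                         * first (lowest-numbered) CPU we find */
                }
        }
}

The price is spurious wakeups (all but one of the kicked VCPUs will just go
back to polling), but none of them can be left waiting behind a VCPU that is
stuck in the on_cpu busy-wait shown above.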

-Stefan
