
Re: [Xen-devel] Strange PVM spinlock case revisited



On Tue, 2013-02-12 at 14:50 +0000, Stefan Bader wrote:
> On 12.02.2013 15:07, Ian Campbell wrote:
> > On Tue, 2013-02-12 at 13:42 +0000, Stefan Bader wrote:
> >> On 11.02.2013 18:29, Ian Campbell wrote:
> >>> On Fri, 2013-02-08 at 15:33 +0000, Stefan Bader wrote:
> >>>> A while ago I had been reporting an issue which causes Xen PVM guests
> >>>> to hang (apparently related to spinlocks). Since then I was able to
> >>>> gather a few more facts, which I try to provide below. As for the
> >>>> real cause, I am still puzzled and would be thankful for any hints
> >>>> or directions.
> >>>>
> >>>> - Happens with Xen 4.1.2 and Xen 4.2 host-side.
> >>>> - PVM guest with 8 VCPUs is running 800 threads doing DB updates.
> >>>
> >>> This is on a host with >= 8 PCPUs, correct?
> >>
> >> Maybe >=4 but I cannot remember for sure anymore.
> > 
> > OK, so the important point I was getting at was whether the guest was
> > overcommitting the host or not; it seems like it is (2 VCPUs for each
> > PCPU).
> 
> Err not so much. I run on an 8-core host. ;)

Um, which is what I asked. What is >= 4 then?


> > 
> >>> I suppose xen_spin_unlock could also be instrumented to indicate who it
> >>> last kicked and for which lock and that might show us something?
> >>
> >> IIRC I had done this but it did not really show much. What I have
> >> done in the meantime was to instrument the IRQ service functions in
> >> arch/x86/kernel/entry_64.S to give me a history of callbacks. This
> >> shows (in another experiment where it is 1,2,5,6 in the same lock and
> >> 2,5,6 still polling) that the last events on the vcpu not polling are:
> >>
> >> xen_do_hypervisor_callback+127
> >> call_softirq+29
> >> call_softirq+125
> >> call_softirq+29
> >> call_softirq+125
> >> call_softirq+29
> >> call_softirq+125
> >> xen_do_hypervisor_callback+28
> >> call_softirq+29
> >> xen_do_hypervisor_callback+28
> >>
> >> The lower offsets are when irq_count gets incremented and the higher
> >> offsets are when irq_count gets decremented before ending the
> >> callback. This shows that, at least in this case, there was an
> >> upcall, a softirq and another upcall being triggered without
> >> finishing the previous one. There was another experiment where I saw
> >> softirq, upcall, upcall. But at least it seems that there is always a
> >> three-level stack.
> >> I believe that softirq, upcall would be ok, as softirq processing
> >> enables interrupts, but is it expected to have an upcall interrupted
> >> by another one?
> > 
> > I'm not sure. evtchns don't have a priority mechanism, so I expect not
> > in general, but individual interrupt handlers could well re-enable
> > interrupts, I think (it might be one of the flags you pass when
> > calling request_irq?).
> 
> Just a gut feeling, but it feels wrong to enable interrupts in any
> interrupt handler. I thought anything that needs interrupts enabled
> and/or takes longer should be pushed off to a workqueue...

Either that or some sort of interrupt prioritisation mechanism in the
hardware, so that only interrupts of higher priority than the current
one can occur.
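
For reference, the deferral pattern Stefan describes would look
something like the below (a minimal sketch of a hypothetical driver,
nothing Xen-specific; all the frob_* names are made up):

#include <linux/interrupt.h>
#include <linux/workqueue.h>

/* Runs in process context: may sleep, interrupts enabled. */
static void frob_work_fn(struct work_struct *work)
{
	/* ... the long-running / needs-interrupts part goes here ... */
}
static DECLARE_WORK(frob_work, frob_work_fn);

static irqreturn_t frob_irq_handler(int irq, void *dev_id)
{
	/* Hard IRQ context: just acknowledge and defer; no need to
	 * re-enable interrupts here. */
	schedule_work(&frob_work);
	return IRQ_HANDLED;
}

That keeps the hard handler short and moves anything that genuinely
needs interrupts enabled into process context.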

> 
> > 
> >>> Can someone explain why the non-locked update of lock_spinners in
> >>> spinning_lock() is safe against xen_spin_unlock reading the field from
> >>> another VCPU? I suspect it isn't, so what happens if the other VCPU
> >>> kicks the lock which is just about to become prev? maybe this is handled
> >>> in xen_spin_lock_slow somehow?
> >>
> >> Isn't that safe because lock_spinners is percpu?
> > 
> > xen_spin_unlock_slow() accesses the lock_spinners of remote CPUs, which
> > is what I think might be unsafe.
> 
> Ah right, that. Hm, the way those two play together has always been a
> bit brain-hurting.

Yes...
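
For anyone trying to page this back in, the two sides look roughly like
the below (paraphrased from arch/x86/xen/spinlock.c from memory, so the
details may well be off):

static inline struct xen_spinlock *spinning_lock(struct xen_spinlock *xl)
{
	struct xen_spinlock *prev;

	prev = __this_cpu_read(lock_spinners);	/* nested spin, if any */
	__this_cpu_write(lock_spinners, xl);	/* plain per-cpu write */

	wmb();		/* publish lock of interest before the count */

	asm(LOCK_PREFIX " incw %0"
	    : "+m" (xl->spinners) : : "memory");

	return prev;
}

static noinline void xen_spin_unlock_slow(struct xen_spinlock *xl)
{
	int cpu;

	for_each_online_cpu(cpu) {
		/* Unlocked read of a *remote* cpu's lock_spinners --
		 * the access in question. */
		if (per_cpu(lock_spinners, cpu) == xl)
			xen_send_IPI_one(cpu, XEN_SPIN_UNLOCK_VECTOR);
	}
}

So if the remote read races with a spinner swapping xl in or out, a
kick can presumably be sent for a lock the target is not (or no longer)
polling, which would then have to be absorbed as a spurious wakeup.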

>  Though somehow it feels as if the top-level poll_irq would just
> return and count things as a spurious wakeup. I think in that case I
> would expect the lock_spinners entry of one vcpu not to match the
> freed lock... Not a really scientific argument.

Adding a BUG_ON(__this_cpu_read(lock_spinners) != xl) in
unspinning_lock might be interesting?
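
I.e. something along these lines (again paraphrased from memory, so the
surrounding bookkeeping may not match the tree exactly):

static inline void unspinning_lock(struct xen_spinlock *xl,
				   struct xen_spinlock *prev)
{
	/* Proposed check: we should still be the innermost spinner
	 * for xl at this point. */
	BUG_ON(__this_cpu_read(lock_spinners) != xl);

	asm(LOCK_PREFIX " decw %0"
	    : "+m" (xl->spinners) : : "memory");

	wmb();		/* decrement count before restoring lock */

	__this_cpu_write(lock_spinners, prev);
}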

I wonder if this is something where the ftrace and related kernel
infrastructure might be useful?
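
Even just trace_printk() in the slow paths would land in the ftrace
ring buffer with per-cpu timestamps, which might be less invasive than
instrumenting entry_64.S by hand. A sketch (the placement is
hypothetical):

	/* e.g. in xen_spin_lock_slow(), after publishing the lock: */
	trace_printk("cpu %d polling lock %p (prev %p)\n",
		     smp_processor_id(), xl, prev);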

Ian.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

