Re: [Xen-devel] Strange PVM spinlock case revisited
On 11.02.2013 18:29, Ian Campbell wrote:
> An interesting hack^Wexperiment might be to make xen_poll_irq use a
> timeout and see if that unwedges things -- this would help confirm that
> the issue is on nested wakeup.
>

So I went ahead and replaced xen_poll_irq with xen_poll_irq_timeout, and that did get rid of the hang. Though I think there is a big taint there.

There was only one other user of poll_irq_timeout in the kernel code, and that uses "jiffies + <timeout>*HZ". But when I look at the Xen side in do_poll, it looks like it treats the timeout as an absolute "ns since boot" (of hv/dom0) value. Not sure how that can ever work: the ns since boot in the guest is clearly always behind the host (and jiffies isn't ns either).

Effectively I likely got rid of any wait time in the hypervisor and went back to mostly spinning. That matches the experience that the test run never gets stuck waiting for a timeout. It may prove that the stacking is an issue, but it is also likely a bit too aggressive in not having any... :/

I will try to think of some better way. Not sure the thinking is realistic, but maybe this could happen:

  xen_spin_lock_slow(a)
  ...
  enables irq and upcalls are pending
  upcall processing wants lock b
  xen_spin_lock_slow(b)
  --- just before replacing lock_spinners ---
  xen_spin_unlock_slow(a)
  finds other vcpu, triggers IRQ
  lock b is top spinner
  going into poll_irq
  poll_irq returns
  lock a gets restored
  so maybe no spinners on b
  dropping out to xen_spin_lock
  unlock of b not finding any spinners
  lock b acquired

That way the irq for lock a may get lost...
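To put numbers on the timeout unit mismatch mentioned above, here is a rough sketch of the arithmetic. This is not the Xen or Linux code: the function names, HZ=250 and the uptimes are all made up for illustration, and it simply assumes (as I read do_poll) that the deadline is compared against absolute ns since host boot.

```c
#include <stdint.h>

#define HZ           250ULL          /* assumed guest tick rate */
#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical: the value a caller passes when it uses
 * "jiffies + <timeout>*HZ". The unit is ticks, not ns. */
uint64_t jiffies_deadline(uint64_t jiffies, uint64_t timeout_sec)
{
    return jiffies + timeout_sec * HZ;
}

/* Hypothetical: a deadline in the unit the hypervisor side seems to
 * expect, i.e. absolute ns, built from a host-based "now". */
uint64_t ns_deadline(uint64_t host_now_ns, uint64_t timeout_sec)
{
    return host_now_ns + timeout_sec * NSEC_PER_SEC;
}

/* Assumed model of the check: the poll times out at once if the
 * deadline is not in the future relative to host time. */
int deadline_already_expired(uint64_t deadline, uint64_t host_now_ns)
{
    return deadline <= host_now_ns;
}
```

With a guest that has been up for an hour (jiffies = 3600 * HZ = 900000, ignoring any initial jiffies offset) and a 3 second timeout, jiffies_deadline() gives 900750. Read as nanoseconds since boot, that is less than a millisecond after host boot, so against any realistic host uptime the deadline is already in the past and the poll would return immediately. That would fit the "mostly spinning" behaviour I saw.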
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel