
Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)



On Fri, 2009-10-16 at 16:35 +0800, Zhang, Xiantao wrote:
> He, Qing wrote:
> > On Fri, 2009-10-16 at 16:22 +0800, Zhang, Xiantao wrote:
> >> He, Qing wrote:
> >>> On Fri, 2009-10-16 at 15:32 +0800, Zhang, Xiantao wrote:
> >>>> According to the description, the issue is most likely caused by a
> >>>> lost EOI write for the MSI interrupt, which leaves the interrupt
> >>>> permanently masked. There should be a race between the guest setting
> >>>> a new vector and EOIing the old vector for the interrupt. If the
> >>>> guest sets the new vector before it EOIs the old one, the hypervisor
> >>>> can't find the pirq that corresponds to the old vector (it has
> >>>> already changed to the new vector), so it can never EOI the old
> >>>> vector at the hardware level. Since the corresponding vector in the
> >>>> real processor is never EOIed, the system may lose all interrupts,
> >>>> which ultimately results in the reported issues.
> >>> 
> >>>> But I remember there should be a timer that handles this case with
> >>>> a forced EOI write to the real processor after a timeout; it seems
> >>>> it doesn't work as expected.
> >>> 
> >>> The EOI timer is supposed to deal with the irq sharing problem.
> >>> Since MSI interrupts are not shared, this timer is not started in
> >>> the case of MSI.
> >> 
> >> That may be a problem, if so. If a malicious/buggy guest never EOIs
> >> the MSI vector, might the host hang due to the lack of a timeout
> >> mechanism?
> > 
> > Why does host hang? Only the assigned interrupt will block, and that's
> > exactly what the guest wants :-)
> 
> The hypervisor shouldn't EOI the real vector until the guest EOIs the
> corresponding virtual vector, right? Not sure. :-)

Yes, it is the algorithm used today.

After reviewing the code: if the guest really does something like
changing affinity within the window between an irq firing and its EOI,
there is indeed a problem. A patch for it is attached. I kind of doubt
this happens, though: shouldn't desc->lock in the guest protect these
two operations and make them mutually exclusive?
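
To make the window concrete, here is a minimal toy model of the race
described above. It is only a sketch, not Xen code: the class, the method
names and the eoi_before_update flag are all made up for illustration, and
the "fix" path is a guess based on nothing more than the attachment's file
name.

```python
class ToyHypervisor:
    """Toy model (hypothetical, not Xen code) of one passthrough MSI per pirq."""

    def __init__(self, eoi_before_update=False):
        # Assumed fix, inferred only from the patch file name.
        self.eoi_before_update = eoi_before_update
        self.gvec_to_pirq = {}   # guest vector -> pirq
        self.in_service = set()  # pirqs whose physical vector still awaits EOI

    def bind(self, pirq, gvec):
        """Guest (re)programs the MSI vector, e.g. on an affinity change."""
        if self.eoi_before_update and pirq in self.in_service:
            # Complete the outstanding EOI before the old mapping disappears.
            self.in_service.discard(pirq)
        # Drop any previous vector bound to this pirq, then install the new one.
        for vec, bound in list(self.gvec_to_pirq.items()):
            if bound == pirq:
                del self.gvec_to_pirq[vec]
        self.gvec_to_pirq[gvec] = pirq

    def fire(self, pirq):
        """Physical interrupt arrives; return True if it can be delivered."""
        if pirq in self.in_service:
            return False  # previous instance was never EOIed: still blocked
        self.in_service.add(pirq)
        return True

    def guest_eoi(self, gvec):
        """Guest EOIs a virtual vector; return True if a pirq was found."""
        pirq = self.gvec_to_pirq.get(gvec)
        if pirq is None:
            return False  # lookup by the old vector fails: the EOI is lost
        self.in_service.discard(pirq)
        return True


def run_race(hv):
    """Replay: fire, change vector before EOI, EOI the old vector, fire again."""
    hv.bind(pirq=1, gvec=0x41)
    hv.fire(1)
    hv.bind(pirq=1, gvec=0x51)  # vector changes inside the fire/EOI window
    hv.guest_eoi(0x41)          # guest EOIs the old vector
    return hv.fire(1)           # can the next interrupt still be delivered?
```

Calling run_race(ToyHypervisor()) replays the problematic ordering, and
run_race(ToyHypervisor(eoi_before_update=True)) replays it with the assumed
ordering fix, so the two return values can be compared directly.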

Dante,
Can you see if this patch helps?

Thanks,
Qing

Attachment: msi-eoi-before-update.patch
Description: Text Data
