
Re: [Xen-devel] megasas stops I/O when running kernel as dom0 under xen4.1/4.2



On Wed, Aug 24, 2011 at 05:57:06PM +0100, Andrew Cooper wrote:
> On 24/08/11 13:06, Andrew Cooper wrote:
> > On 22/08/11 10:05, Andrew Cooper wrote:
> >> On 19/08/11 19:10, Andreas Olsowski wrote:
> >>> On 19.08.2011 18:49, Andrew Cooper wrote:
> >>>
> >>>> The only change you need to make is in megasas_probe_one() in
> >>>> megaraid_sas_base.c
> >>>>
> >>>> Add a call to pci_enable_msi(pdev) immediately after the current call to
> >>>> pci_set_master(pdev);
> >>>>
> >>>> ~Andrew
> >>>>
> >>> Yep, that works fine. Removed the module option as well.
> >>>
> >>> root@tarballerina:~# cat /proc/interrupts  |grep mega
> >>> 2236:      69010          0          0          0          0          0          0          0  xen-pirq-msi       megasas
> >>>
> >>> The same procedure that would have led to almost instant errors has
> >>> not caused them to reappear.
> >>>
> >> Good.  This is what we are seeing as well.  I am still awaiting a reply
> >> from LSI on this topic.
> >>
> >> Unfortunately, this does point to a regression in the way Xen deals with
> >> legacy interrupts.
> > Out of interest, on all 3 of your boxes with the megaraid_sas cards,
> > could you gather the io_apic information?
> >
> > It is the 'z' Xen debug key on the serial console (or alternatively put
> > apic_verbosity=debug on the Xen command line and the information gets
> > dumped into dmesg)
> 
> You can ignore this - it is not relevant.
> 
> I have narrowed the problem to a bug in the interrupt migration code.

Goodies!
> 
> The bug occurs when the move pending flag is set, and somehow another
> interrupt comes in on the old pcpu without triggering the move
> completion code.  This leaves the IO_APIC with an ack'd but not EOI'd
> interrupt from the megaraid_sas device.

Ah, so the interrupt is delivered to Dom0 on the old per_cpu
event, which is ignored. Ignored because we have rebound the event channel
to the other CPU, right?

Is there any option in the hypervisor to turn off interrupt migration?
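
To make the failure mode quoted above easier to picture, here is a toy
model in C. It is only a sketch of the description in this thread, not
Xen's code: struct toy_irq, toy_deliver(), toy_line_stuck() and the
completion_runs_on_old_cpu switch are all hypothetical names, and the real
mechanism by which move completion gets skipped is not known at this point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Xen's data structures. */
struct toy_irq {
    int  old_cpu;       /* CPU the interrupt is migrating away from */
    int  new_cpu;       /* CPU the interrupt is migrating to */
    bool move_pending;  /* migration started but not yet completed */
    bool in_service;    /* ack'd at the IO-APIC, EOI still outstanding */
};

/*
 * Deliver one interrupt on `cpu`.
 * completion_runs_on_old_cpu == false models the case described in the
 * thread: an arrival on the old CPU while a move is pending does not
 * reach the move completion code, so no EOI is issued.
 */
void toy_deliver(struct toy_irq *irq, int cpu, bool completion_runs_on_old_cpu)
{
    assert(irq != NULL);

    irq->in_service = true;  /* ack */

    if (irq->move_pending && cpu == irq->old_cpu && !completion_runs_on_old_cpu)
        return;              /* assumed failure path: ack'd, never EOI'd */

    irq->move_pending = false;  /* move completion */
    irq->in_service = false;    /* EOI */
}

/* In this model an interrupt left in service is the stuck state. */
bool toy_line_stuck(const struct toy_irq *irq)
{
    assert(irq != NULL);
    return irq->in_service;
}
```

With move_pending set, toy_deliver(&irq, irq.old_cpu, false) leaves
toy_line_stuck(&irq) returning true, which is how this sketch represents
the ack'd but not EOI'd state. The same call with true as the last
argument, or a delivery on new_cpu, clears it.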

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

