Re: [Xen-devel] domU and dom0 hung with Xen console interrupt binding showing in-flight=1, (---M)
>>> On 17.08.10 at 20:01, Keir Fraser <keir.fraser@xxxxxxxxxxxxx> wrote:
> On 17/08/2010 18:28, "Bruce Edge" <bruce.edge@xxxxxxxxx> wrote:
>
>> On Tue, Jun 29, 2010 at 1:42 AM, Jan Beulich <JBeulich@xxxxxxxxxx> wrote:
>>>>>> On 28.06.10 at 20:22, Dante Cinco <dantecinco@xxxxxxxxx> wrote:
>>>> I have an HP Proliant DL380-G6 (dual Xeon E5540 @ 2.53GHz) with Xen 4.0.0
>>>> and dom0 Linux 2.6.32.12 x86_64 pvops and domU Linux kernel 2.6.30.1
>>>> x86_64.
>>>> I'm using PCI passthrough (pci-stub) to pass my 4-port 8Gb PMC-Sierra Fibre
>>>> Channel HBA to domU. After running I/Os for several hours, both dom0 and
>>>> domU hangs and the Xen console shows the interrupt binding below where IRQ
>>>> 66 shows in-flight=1 and mask set (---M). What's the best way to debug this
>>>> problem?
>>>
>>> There are potentially two problems here: One is that the guest may
>>> fail to send the EOI notification. You would want to check whether
>>> pirq_guest_eoi() got run after that last occurrence of the interrupt.
>>>
>>> The more worrying part is that Xen should time out on a guest failing
>>> to send the EOI notification, and ack the interrupt nevertheless.
>>> Looking at the code I fail to see how the ack_APIC_irq() would get
>>> sent in this case: non-maskable MSIs get this issued from
>>> end_msi_irq(), but ->end doesn't get invoked from
>>> irq_guest_eoi_timer_fn() (only ->enable does). Keir, am I missing
>>> something?
>
> I don't think that timer logic is designed to handle non-maskable MSIs, only
> maskable ones. It ought to be not too hard to fix it up for non-maskable
> ones too by issuing the ->end() call from the timer handler?

Yes, that was what I was trying to hint at, but I wasn't sure whether
calling ->end() here has any unintended side effects and/or requires
any extra care (like preventing a subsequent guest-initiated EOI from
calling ->end() again).
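To make the "extra care" concern concrete, here is a minimal sketch of one way
to keep ->end() from running twice when both the timeout path and a late guest
EOI fire for the same interrupt. It is a toy model, not Xen code: struct
toy_irq, toy_end(), toy_finish(), toy_guest_eoi() and toy_eoi_timeout() are
hypothetical names invented for illustration, and the sketch ignores locking,
which real code would presumably need around the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for an IRQ descriptor; not Xen's struct. */
struct toy_irq {
    bool in_flight;  /* interrupt delivered to the guest, EOI not yet seen */
    int  end_calls;  /* how many times the ->end() stand-in actually ran */
};

/* Stand-in for the controller's ->end() hook. For a non-maskable MSI this
 * is where the thread says the APIC ack gets issued. */
static void toy_end(struct toy_irq *irq)
{
    assert(!irq->in_flight);  /* callers clear in_flight before acking */
    irq->end_calls++;
}

/* Run the ->end() stand-in at most once per delivered interrupt: whichever
 * path finds in_flight still set clears it and performs the ack. */
static bool toy_finish(struct toy_irq *irq)
{
    if (!irq->in_flight)
        return false;         /* already completed by the other path */
    irq->in_flight = false;
    toy_end(irq);
    return true;
}

/* Guest-initiated EOI path (hypothetical). */
static bool toy_guest_eoi(struct toy_irq *irq)
{
    return toy_finish(irq);
}

/* EOI timeout path (hypothetical), i.e. the guest never sent its EOI. */
static bool toy_eoi_timeout(struct toy_irq *irq)
{
    return toy_finish(irq);
}
```

With in_flight set, calling toy_eoi_timeout() and then toy_guest_eoi() leaves
end_calls at 1: the timeout performs the ack and the late guest EOI becomes a
no-op. Whether a single flag like this is sufficient in the real code depends
on details not covered in this thread.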
While looking at this I came across another thing I don't understand:
__pirq_guest_eoi(), for the ACKTYPE_EOI case, calls __set_eoi_ready()
in a cpu_test_and_clear() conditional, but __set_eoi_ready() bails out
if it finds !cpu_test_and_clear() on the same bitmap - what's the
point of calling __set_eoi_ready() here then (or what am I missing)?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
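On the __set_eoi_ready() question: the puzzle can be reproduced with a toy
model. The sketch below assumes, as the question does, that the outer
conditional and the inner helper test-and-clear the same bit of the very same
bitmap object. toy_mask, toy_test_and_clear(), toy_inner() and toy_outer() are
hypothetical names, not Xen's cpumask API, and the helper is deliberately
non-atomic.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a CPU bitmap; not Xen's cpumask_t. */
typedef unsigned long toy_mask;

/* Clear bit `cpu` and report whether it was set beforehand.
 * Non-atomic; enough for a single-threaded illustration. */
static bool toy_test_and_clear(unsigned int cpu, toy_mask *mask)
{
    toy_mask bit = 1UL << cpu;
    bool was_set = (*mask & bit) != 0;

    *mask &= ~bit;
    return was_set;
}

/* Inner helper shaped like the description: bail out unless the bit was
 * still set. Returns true only if it got past the early exit. */
static bool toy_inner(unsigned int cpu, toy_mask *mask)
{
    if (!toy_test_and_clear(cpu, mask))
        return false;  /* bit already clear: nothing to do */
    return true;       /* real work would happen here */
}

/* Outer caller: test-and-clear first, then call the inner helper on the
 * same bitmap. Returns whether the inner helper did any work. */
static bool toy_outer(unsigned int cpu, toy_mask *mask)
{
    if (toy_test_and_clear(cpu, mask))
        return toy_inner(cpu, mask);
    return false;
}
```

Under that assumption toy_outer() can never return true: the outer
test-and-clear has already cleared the bit, so the inner one always sees it
clear and bails out. Calling toy_inner() directly on a mask with the bit set
does get past the early exit. The sketch says nothing about whether the two
calls in the actual source really operate on the same object.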