Re: [Xen-devel] Re: Still struggling with HVM: tx timeouts on emulated nics
On 03/10/11 19:13, Stefano Stabellini wrote:
> CC'ing Jan, who is probably going to have an opinion on this.
>
> Let me add a bit of background: Stefan found out that PV on HVM guests
> could lose level interrupts coming from emulated devices. Looking
> through the code I realized that we need to add some logic to inject a
> pirq in the guest if a level interrupt has been raised while the guest
> is servicing the first one.
> While this is all very specific to interrupt remapping and emulated
> devices, I realized that something similar could happen even with dom0
> or other PV guests with PCI passthrough:
>
> 1) the device raises a level interrupt and Xen injects it into the
> guest;
>
> 2) the guest is temporarily stuck: it does not ack it or eoi it;
>
> 3) the Xen timer kicks in and eois the interrupt;
>
> 4) the device thinks it is all fine and sends a second interrupt;
>
> 5) Xen fails to inject the second interrupt into the guest because the
> guest still has the event channel pending bit set;
>
> at this point the guest loses the second interrupt notification, which
> is not supposed to happen with level interrupts and I think it might
> cause problems with some devices.
>
> Jan, do you think we should try to handle this case, or is it too
> unlikely?

I am not certain whether this is relevant, but the ICH10 IO-APIC
documentation indicates that EOI'ing a level interrupt early should not
have this effect. Specifically, it states that EOI'ing a level interrupt
whose line is still asserted will cause the interrupt to be "re-raised"
by the IO-APIC. It uses this to argue that it is fine to use multiple
IO-APIC entries with the same vector, with a broadcast of the vector
number alone to EOI the interrupt. In that case, while Xen sees two
interrupts, from the device's point of view only one has happened.

In the case where the device has dropped its level interrupt of its own
accord, I would agree that the current Xen behavior is wrong, but I
can't offhand think of a good reason why that would occur.

I know it is not helpful in this case, but as a rule of thumb, level
interrupts should not be used with Xen. The average response time on an
unloaded system is ~30ms, ranging from 5ms to 150ms (measured on a Dell
R710 running Xen 4.1.0 with a 2.6.32 dom0, over two weeks of debugging
another level interrupt bug).

~Andrew

> In any case we need to handle the PV on HVM remapping bug, which,
> because of the way interrupts are emulated, is much more likely to
> happen...
>
>
> On Mon, 3 Oct 2011, Stefano Stabellini wrote:
>> On Fri, 30 Sep 2011, Stefan Bader wrote:
>>> Also I did not completely remove the section that would return the
>>> status without setting needsEOI. I just changed the if condition to
>>> be <0 instead of <=0 (I knew from the tests that the mapping was
>>> always 0, and maybe the <0 check could be useful for something).
>>>
>>>     irq_status_query.flags = 0;
>>>     if ( is_hvm_domain(v->domain) &&
>>>          domain_pirq_to_irq(v->domain, irq) < 0 )
>>>     {
>>>         ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
>>>         break;
>>>     }
>>>
>> You need to remove the entire test because we want to receive
>> notifications in all cases.
>>
>>> With that a quick test shows both the re-sends done sometimes and the
>>> domU doing EOIs, and there is no stall apparent. I did the same quick
>>> test with the e1000 emulated NIC and that still seems ok. Those were
>>> not very thorough tests, but at least I would have observed a stall
>>> pretty quickly otherwise.
>> I am glad it fixes the problem for you.
>>
>> I am going to send a different patch upstream for Xen 4.2, because I
>> would also like it to cover the very unlikely scenario in which a PV
>> guest (like dom0 or a PV guest with PCI passthrough) loses level
>> interrupts because, when Xen tries to set the corresponding event
>> channel pending, the bit is already set. The codebase is different
>> enough that making the same change on 4.1 is non-trivial. I am
>> appending the new patch to this email; it would be great if you could
>> test it. You just need a 4.2 hypervisor, not the entire system, so you
>> should be able to perform the test by updating only xen.gz.
>> If you have trouble with xen-unstable.hg tip, try changeset 23843.
>>
>> ---
>>
>>
>> diff -r bf533533046c xen/arch/x86/hvm/irq.c
>> --- a/xen/arch/x86/hvm/irq.c	Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/arch/x86/hvm/irq.c	Mon Oct 03 16:54:51 2011 +0000
>> @@ -36,7 +36,8 @@ static void assert_gsi(struct domain *d,
>>
>>      if ( hvm_domain_use_pirq(d, pirq) )
>>      {
>> -        send_guest_pirq(d, pirq);
>> +        if ( send_guest_pirq(d, pirq) && ioapic_gsi >= NR_ISAIRQS )
>> +            pirq->lost++;
>>          return;
>>      }
>>      vioapic_irq_positive_edge(d, ioapic_gsi);
>> @@ -63,6 +64,7 @@ static void __hvm_pci_intx_assert(
>>  {
>>      struct hvm_irq *hvm_irq = &d->arch.hvm_domain.irq;
>>      unsigned int gsi, link, isa_irq;
>> +    struct pirq *pirq;
>>
>>      ASSERT((device <= 31) && (intx <= 3));
>>
>> @@ -72,6 +74,11 @@ static void __hvm_pci_intx_assert(
>>      gsi = hvm_pci_intx_gsi(device, intx);
>>      if ( hvm_irq->gsi_assert_count[gsi]++ == 0 )
>>          assert_gsi(d, gsi);
>> +    else {
>> +        pirq = pirq_info(d, domain_emuirq_to_pirq(d, gsi));
>> +        if ( hvm_domain_use_pirq(d, pirq) )
>> +            pirq->lost++;
>> +    }
>>
>>      link = hvm_pci_intx_link(device, intx);
>>      isa_irq = hvm_irq->pci_link.route[link];
>> diff -r bf533533046c xen/arch/x86/irq.c
>> --- a/xen/arch/x86/irq.c	Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/arch/x86/irq.c	Mon Oct 03 16:54:51 2011 +0000
>> @@ -965,7 +965,11 @@ static void __do_IRQ_guest(int irq)
>>               !test_and_set_bool(pirq->masked) )
>>              action->in_flight++;
>>          if ( !hvm_do_IRQ_dpci(d, pirq) )
>> -            send_guest_pirq(d, pirq);
>> +        {
>> +            if ( send_guest_pirq(d, pirq) &&
>> +                 action->ack_type == ACKTYPE_EOI )
>> +                pirq->lost++;
>> +        }
>>      }
>>
>>      if ( action->ack_type != ACKTYPE_NONE )
>> diff -r bf533533046c xen/arch/x86/physdev.c
>> --- a/xen/arch/x86/physdev.c	Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/arch/x86/physdev.c	Mon Oct 03 16:54:51 2011 +0000
>> @@ -11,6 +11,7 @@
>>  #include <asm/current.h>
>>  #include <asm/io_apic.h>
>>  #include <asm/msi.h>
>> +#include <asm/hvm/irq.h>
>>  #include <asm/hypercall.h>
>>  #include <public/xen.h>
>>  #include <public/physdev.h>
>> @@ -270,6 +271,10 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
>>          if ( !is_hvm_domain(v->domain) ||
>>               domain_pirq_to_irq(v->domain, eoi.irq) > 0 )
>>              pirq_guest_eoi(pirq);
>> +        if ( pirq->lost > 0 ) {
>> +            if ( !send_guest_pirq(v->domain, pirq) )
>> +                pirq->lost--;
>> +        }
>>          spin_unlock(&v->domain->event_lock);
>>          ret = 0;
>>          break;
>> @@ -328,9 +333,10 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
>>              break;
>>          irq_status_query.flags = 0;
>>          if ( is_hvm_domain(v->domain) &&
>> -             domain_pirq_to_irq(v->domain, irq) <= 0 )
>> +             domain_pirq_to_irq(v->domain, irq) <= 0 &&
>> +             domain_pirq_to_emuirq(v->domain, irq) == IRQ_UNBOUND )
>>          {
>> -            ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
>> +            ret = -EINVAL;
>>              break;
>>          }
>>
>> diff -r bf533533046c xen/include/xen/irq.h
>> --- a/xen/include/xen/irq.h	Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/include/xen/irq.h	Mon Oct 03 16:54:51 2011 +0000
>> @@ -146,6 +146,7 @@ struct pirq {
>>      int pirq;
>>      u16 evtchn;
>>      bool_t masked;
>> +    u32 lost;
>>      struct rcu_head rcu_head;
>>      struct arch_pirq arch;
>>  };
>>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
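
For anyone reading along, here is a minimal standalone sketch of the
bookkeeping the patch above introduces. The structure and helpers below
are simplified stand-ins, not the real Xen definitions of struct pirq or
send_guest_pirq(); the point is only to show why a notification that
cannot be delivered (because the event channel pending bit is already
set) gets counted in pirq->lost and retried when the guest EOIs.

/*
 * Standalone illustration (not Xen code): an event channel can only
 * record "pending", so a second level-triggered assert arriving while
 * the bit is already set would otherwise be silently dropped.
 */
#include <stdbool.h>
#include <stdio.h>

struct pirq {
    bool evtchn_pending;   /* stand-in for the event channel pending bit */
    unsigned int lost;     /* notifications that could not be delivered  */
};

/* Stand-in for send_guest_pirq(): 0 on success, nonzero if the pending
 * bit was already set and the guest sees no new notification. */
static int send_guest_pirq(struct pirq *pirq)
{
    if (pirq->evtchn_pending)
        return -1;          /* already pending: this event is invisible */
    pirq->evtchn_pending = true;
    return 0;
}

/* The (emulated) device asserts its level interrupt. */
static void assert_level_irq(struct pirq *pirq)
{
    if (send_guest_pirq(pirq))
        pirq->lost++;       /* remember that a notification was dropped */
}

/* Guest EOIs the interrupt: clear pending and re-inject if one is owed. */
static void guest_eoi(struct pirq *pirq)
{
    pirq->evtchn_pending = false;
    if (pirq->lost > 0 && send_guest_pirq(pirq) == 0)
        pirq->lost--;
}

int main(void)
{
    struct pirq pirq = { 0 };

    assert_level_irq(&pirq);   /* first interrupt: delivered              */
    assert_level_irq(&pirq);   /* second one while still pending: lost++  */
    guest_eoi(&pirq);          /* EOI re-injects the lost notification    */

    printf("pending=%d lost=%u\n", pirq.evtchn_pending, pirq.lost);
    return 0;
}

Running this prints pending=1 lost=0: the second assert is initially
dropped, but the EOI path re-injects it, which is the behavior the patch
aims for with level interrupts.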