[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] RE: [PATCH] [RFC] VT-d: always clean up dpci timers.
At 18:08 -0700 on 20 Jul (1311185311), Kay, Allen M wrote: > Hi Tim, > > Can you provide the code flow that can cause this failure? > > In pci_release_devices(), pci_clean_dpci_irqs() is called before > "d->need_iommu = 0" in deassign_device(). If this path is taken, then > it should not return from !need_iommu(d) in pci_clean_dpci_irqs(). The problem is that the xl toolstack has already deassigned the domain's devices, using a hypercall to invoke deassign_device(), so by the time the domain is destroyed, pci_release_devices() can't tell that it once had a PCI device passed through. It seems like the Right Thing[tm] would be for deassign_device() to find and undo the relevant IRQ plumbing but I couldn't see how. Is there a mapping from bdf to irq in the iommu code or are they handled entirely separately? Tim. > Allen > > -----Original Message----- > From: Tim Deegan [mailto:Tim.Deegan@xxxxxxxxxx] > Sent: Monday, July 18, 2011 9:39 AM > To: xen-devel@xxxxxxxxxxxxxxxxxxx > Cc: keir@xxxxxxx; Kay, Allen M > Subject: [PATCH] [RFC] VT-d: always clean up dpci timers. > > If a VM has all its PCI devices deassigned, need_iommu(d) becomes false > but it might still have DPCI EOI timers that were init_timer()d but not > yet kill_timer()d. That causes xen to crash later because the linked > list of inactive timers gets corrupted, e.g.: > > (XEN) Xen call trace: > (XEN) [<ffff82c480126256>] set_timer+0x1c2/0x24f > (XEN) [<ffff82c48011fbf8>] schedule+0x129/0x5dd > (XEN) [<ffff82c480122c1e>] __do_softirq+0x7e/0x89 > (XEN) [<ffff82c480122c9d>] do_softirq+0x26/0x28 > (XEN) [<ffff82c480153c85>] idle_loop+0x5a/0x5c > (XEN) > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 0: > (XEN) Assertion 'entry->next->prev == entry' failed at > /local/scratch/tdeegan/xen-unstable.hg/xen/include:172 > (XEN) **************************************** > > The following patch makes sure that the domain destruction path always > clears up the DPCI state even if !needs_iommu(d). > > Although it fixes the crash for me, I'm sufficiently confused by this > code that I don't know whether it's enough. If the dpci timer state > gets freed earlier than pci_clean_dpci_irqs() then there's still a race, > and some other function (reassign_device_ownership() ?) needs to sort > out the timers when the PCI card is deassigned. > > Allen, can you comment? > > Signed-off-by: Tim Deegan <Tim.Deegan@xxxxxxxxxx> > > diff -r ab6551e30841 xen/drivers/passthrough/pci.c > --- a/xen/drivers/passthrough/pci.c Mon Jul 18 10:59:44 2011 +0100 > +++ b/xen/drivers/passthrough/pci.c Mon Jul 18 17:22:48 2011 +0100 > @@ -269,7 +269,7 @@ static void pci_clean_dpci_irqs(struct d > if ( !iommu_enabled ) > return; > > - if ( !is_hvm_domain(d) || !need_iommu(d) ) > + if ( !is_hvm_domain(d) ) > return; > > spin_lock(&d->event_lock); > > _______________________________________________ > Xen-devel mailing list > Xen-devel@xxxxxxxxxxxxxxxxxxx > http://lists.xensource.com/xen-devel -- Tim Deegan <Tim.Deegan@xxxxxxxxxx> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |