[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 1/2] xen/pt: fix some pass-thru devices don't work across reboot
On Tue, Dec 18, 2018 at 08:53:46AM -0700, Jan Beulich wrote: >>>> On 18.12.18 at 15:43, <chao.gao@xxxxxxxxx> wrote: >> I find some pass-thru devices don't work any more across guest >> reboot. Assigning it to another domain also meets the same issue. And >> the only way to make it work again is un-binding and binding it to >> pciback. Someone reported this issue one year ago [1]. >> >> If the device's driver doesn't disable MSI-X during shutdown or qemu is >> killed/crashed before the domain shutdown, this domain's pirq won't be >> unmapped. Then xen takes over this work, unmapping all pirq-s, when >> destroying guest. But as pciback has already disabled meory decoding before >> xen unmapping pirq, Xen has to sets the host_maskall flag and maskall bit >> to mask a MSI rather than sets maskbit in MSI-x table. The call trace of >> this process is: >> >> ->arch_domain_destroy >> ->free_domain_pirqs >> ->unmap_domain_pirq (if pirq isn't unmapped by qemu) >> ->pirq_guest_force_unbind >> ->__pirq_guest_unbind >> ->mask_msi_irq(=desc->handler->disable()) >> ->the warning in msi_set_mask_bit() >> >> The host_maskall bit will prevent guests from clearing the maskall bit >> even the device is assigned to another guest later. Then guests cannot >> receive MSIs from this device. >> >> To fix this issue, a pirq is unmapped before memory decoding is disabled by >> pciback. Specifically, when a device is detached from a guest, all >> established >> mappings between pirq and msi are destroying before changing the ownership. >> >> [1]: >> https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg02520.html >> >> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx> >> --- >> Applied this patch, qemu would report the error below: >> [00:05.0] msi_msix_disable: Error: Unbinding of MSI-X failed. (err: 1, >> pirq: 302, gvec: 0xd5) >> [00:05.0] msi_msix_disable: Error: Unbinding of MSI-X failed. (err: 1, >> pirq: 301, gvec: 0xe5) >> [00:04.0] msi_msix_disable: Error: Unbinding of MSI-X failed. (err: 1, >> pirq: 359, gvec: 0x41) >> [00:04.0] msi_msix_disable: Error: Unbinding of MSI-X failed. (err: 1, >> pirq: 358, gvec: 0x51) >> >> Despite of the error, guest shutdown or device hotplug finishs smoothly. >> It seems to me that qemu tries to unbind a msi which is already unbound by >> the code added by this patch. I am not sure whether it is acceptable to >> leave this error there. > >Well, the errors mean that qemu is playing with a device that's no >longer owned by the guest controlled by this qemu instance. At >least with a de-privileged qemu (no idea whether this actually works >with pass-through) that's still a mistake, and hence would need >fixing. Whichever entity it is that invokes the de-assign of the >device, other involved parties should be informed so that they can >keep their hands off the device from that point onwards. > >The hypervisor change itself looks mostly fine, just a few minor >comments. > >> --- a/xen/drivers/passthrough/pci.c >> +++ b/xen/drivers/passthrough/pci.c >> @@ -368,6 +368,7 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, >> u8 bus, u8 devfn) >> return NULL; >> } >> spin_lock_init(&msix->table_lock); >> + msix->warned = DOMID_INVALID; > >This is an arch-specific field right now; in fact the entire structure >is arch-specific. Playing with any of its fields in common code is >undesirable, but I guess the use of ->table_lock can be taken as >an excuse until this code wants to eventually be used by Arm. >(The structure requiring a lock is sufficiently generic, whereas >the "warned" field may not be universally needed.) I will clean up this place. > >> @@ -1514,6 +1515,52 @@ static int assign_device(struct domain *d, u16 seg, >> u8 bus, u8 devfn, u32 flag) >> return rc; >> } >> >> +/* >> + * Unmap established mappings between domain's pirq and device's MSI. >> + * These mappings were set up by qemu/guest and are expected to be >> + * destroyed when changing the device's ownership. >> + */ >> +static void pci_unmap_msi(struct pci_dev *pdev) >> +{ >> + struct msi_desc *entry, *tmp; >> + >> + ASSERT(pcidevs_locked()); >> + >> + if ( !pdev->domain ) > >There are quite a few uses of pdev->domain - please consider >using a local variable. > >> + return; >> + >> + spin_lock(&pdev->domain->event_lock); >> + list_for_each_entry_safe( entry, tmp, &pdev->msi_list, list ) >> + { >> + struct pirq *info; >> + struct hvm_pirq_dpci *pirq_dpci; >> + int pirq = domain_irq_to_pirq(pdev->domain, entry->irq), pirq_orig; >> + >> + pirq_orig = pirq; >> + >> + if ( !pirq ) >> + continue; >> + >> + /* For forcibly unmapped pirq, lookup radix tree with absolute >> value */ >> + if ( pirq < 0) >> + pirq = -pirq; >> + >> + info = pirq_info(pdev->domain, pirq); > >Why not simply > > info = pirq_info(pdev->domain, ABS(pirq)); > >without any pirq_orig? Will do. Thanks Chao _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |