[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 1/2] xen/pt: fix some pass-thru devices don't work across reboot
>>> On 18.12.18 at 15:43, <chao.gao@xxxxxxxxx> wrote: > I find some pass-thru devices don't work any more across guest > reboot. Assigning it to another domain also meets the same issue. And > the only way to make it work again is un-binding and binding it to > pciback. Someone reported this issue one year ago [1]. > > If the device's driver doesn't disable MSI-X during shutdown or qemu is > killed/crashed before the domain shutdown, this domain's pirq won't be > unmapped. Then xen takes over this work, unmapping all pirq-s, when > destroying guest. But as pciback has already disabled meory decoding before > xen unmapping pirq, Xen has to sets the host_maskall flag and maskall bit > to mask a MSI rather than sets maskbit in MSI-x table. The call trace of > this process is: > > ->arch_domain_destroy > ->free_domain_pirqs > ->unmap_domain_pirq (if pirq isn't unmapped by qemu) > ->pirq_guest_force_unbind > ->__pirq_guest_unbind > ->mask_msi_irq(=desc->handler->disable()) > ->the warning in msi_set_mask_bit() > > The host_maskall bit will prevent guests from clearing the maskall bit > even the device is assigned to another guest later. Then guests cannot > receive MSIs from this device. > > To fix this issue, a pirq is unmapped before memory decoding is disabled by > pciback. Specifically, when a device is detached from a guest, all established > mappings between pirq and msi are destroying before changing the ownership. > > [1]: > https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg02520.html > > Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx> > --- > Applied this patch, qemu would report the error below: > [00:05.0] msi_msix_disable: Error: Unbinding of MSI-X failed. (err: 1, > pirq: 302, gvec: 0xd5) > [00:05.0] msi_msix_disable: Error: Unbinding of MSI-X failed. (err: 1, > pirq: 301, gvec: 0xe5) > [00:04.0] msi_msix_disable: Error: Unbinding of MSI-X failed. (err: 1, > pirq: 359, gvec: 0x41) > [00:04.0] msi_msix_disable: Error: Unbinding of MSI-X failed. (err: 1, > pirq: 358, gvec: 0x51) > > Despite of the error, guest shutdown or device hotplug finishs smoothly. > It seems to me that qemu tries to unbind a msi which is already unbound by > the code added by this patch. I am not sure whether it is acceptable to > leave this error there. Well, the errors mean that qemu is playing with a device that's no longer owned by the guest controlled by this qemu instance. At least with a de-privileged qemu (no idea whether this actually works with pass-through) that's still a mistake, and hence would need fixing. Whichever entity it is that invokes the de-assign of the device, other involved parties should be informed so that they can keep their hands off the device from that point onwards. The hypervisor change itself looks mostly fine, just a few minor comments. > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -368,6 +368,7 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, > u8 bus, u8 devfn) > return NULL; > } > spin_lock_init(&msix->table_lock); > + msix->warned = DOMID_INVALID; This is an arch-specific field right now; in fact the entire structure is arch-specific. Playing with any of its fields in common code is undesirable, but I guess the use of ->table_lock can be taken as an excuse until this code wants to eventually be used by Arm. (The structure requiring a lock is sufficiently generic, whereas the "warned" field may not be universally needed.) > @@ -1514,6 +1515,52 @@ static int assign_device(struct domain *d, u16 seg, u8 > bus, u8 devfn, u32 flag) > return rc; > } > > +/* > + * Unmap established mappings between domain's pirq and device's MSI. > + * These mappings were set up by qemu/guest and are expected to be > + * destroyed when changing the device's ownership. > + */ > +static void pci_unmap_msi(struct pci_dev *pdev) > +{ > + struct msi_desc *entry, *tmp; > + > + ASSERT(pcidevs_locked()); > + > + if ( !pdev->domain ) There are quite a few uses of pdev->domain - please consider using a local variable. > + return; > + > + spin_lock(&pdev->domain->event_lock); > + list_for_each_entry_safe( entry, tmp, &pdev->msi_list, list ) > + { > + struct pirq *info; > + struct hvm_pirq_dpci *pirq_dpci; > + int pirq = domain_irq_to_pirq(pdev->domain, entry->irq), pirq_orig; > + > + pirq_orig = pirq; > + > + if ( !pirq ) > + continue; > + > + /* For forcibly unmapped pirq, lookup radix tree with absolute value > */ > + if ( pirq < 0) > + pirq = -pirq; > + > + info = pirq_info(pdev->domain, pirq); Why not simply info = pirq_info(pdev->domain, ABS(pirq)); without any pirq_orig? Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |