[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v5 1/3] xen/pt: fix some pass-thru devices don't work across reboot
On Wed, Jan 16, 2019 at 04:17:30PM +0800, Chao Gao wrote: > I find some pass-thru devices don't work any more across guest > reboot. Assigning it to another domain also meets the same issue. And > the only way to make it work again is un-binding and binding it to > pciback. Someone reported this issue one year ago [1]. > > If the device's driver doesn't disable MSI-X during shutdown or qemu is > killed/crashed before the domain shutdown, this domain's pirq won't be > unmapped. Then xen takes over this work, unmapping all pirq-s, when > destroying guest. But as pciback has already disabled meory decoding before > xen unmapping pirq, Xen has to sets the host_maskall flag and maskall bit > to mask a MSI rather than sets maskbit in MSI-x table. The call trace of > this process is: > > ->arch_domain_destroy > ->free_domain_pirqs > ->unmap_domain_pirq (if pirq isn't unmapped by qemu) > ->pirq_guest_force_unbind > ->__pirq_guest_unbind > ->mask_msi_irq(=desc->handler->disable()) > ->the warning in msi_set_mask_bit() > > The host_maskall bit will prevent guests from clearing the maskall bit > even the device is assigned to another guest later. Then guests cannot > receive MSIs from this device. > > To fix this issue, a pirq is unmapped before memory decoding is disabled by > pciback. Specifically, when a device is detached from a guest, all established > mappings between pirq and msi are destroying before changing the ownership. > > [1]: > https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg02520.html Thanks, I think the approach is fine, just a couple of comments. > Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx> > --- > Changes in v5: > - fix the potential infinite loop > - assert that unmap_domain_pirq() won't fail > - assert msi_list is empty after the loop in pci_unmap_msi > - provide a stub for pt_irq_destroy_bind_msi() if !CONFIG_HVM to fix a > compilation error when building PVShim > > Changes in v4: > - split out change to 'msix->warned' field > - handle multiple msi cases > - use list_first_entry_or_null to traverse 'pdev->msi_list' > --- > xen/drivers/passthrough/io.c | 57 ++++++++++++++++++++++++++------------ > xen/drivers/passthrough/pci.c | 64 > +++++++++++++++++++++++++++++++++++++++++++ > xen/include/xen/iommu.h | 4 +++ > 3 files changed, 107 insertions(+), 18 deletions(-) > > diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c > index a6eb8a4..56ee1ef 100644 > --- a/xen/drivers/passthrough/io.c > +++ b/xen/drivers/passthrough/io.c > @@ -619,6 +619,42 @@ int pt_irq_create_bind( > return 0; > } > > +static void pt_irq_destroy_bind_common(struct domain *d, struct pirq *pirq) > +{ > + struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq); > + > + ASSERT(spin_is_locked(&d->event_lock)); > + > + if ( pirq_dpci && (pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) && > + list_empty(&pirq_dpci->digl_list) ) > + { > + pirq_guest_unbind(d, pirq); > + msixtbl_pt_unregister(d, pirq); > + if ( pt_irq_need_timer(pirq_dpci->flags) ) > + kill_timer(&pirq_dpci->timer); > + pirq_dpci->flags = 0; > + /* > + * See comment in pt_irq_create_bind's PT_IRQ_TYPE_MSI before the > + * call to pt_pirq_softirq_reset. > + */ > + pt_pirq_softirq_reset(pirq_dpci); > + > + pirq_cleanup_check(pirq, d); > + } > +} > + > +void pt_irq_destroy_bind_msi(struct domain *d, struct pirq *pirq) > +{ > + struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq); const > + > + ASSERT(spin_is_locked(&d->event_lock)); > + > + if ( pirq_dpci && pirq_dpci->gmsi.posted ) > + pi_update_irte(NULL, pirq, 0); > + > + pt_irq_destroy_bind_common(d, pirq); > +} > + > int pt_irq_destroy_bind( > struct domain *d, const struct xen_domctl_bind_pt_irq *pt_irq_bind) > { > @@ -727,26 +763,11 @@ int pt_irq_destroy_bind( > } > else > what = "bogus"; > - } > - else if ( pirq_dpci && pirq_dpci->gmsi.posted ) > - pi_update_irte(NULL, pirq, 0); > - > - if ( pirq_dpci && (pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) && > - list_empty(&pirq_dpci->digl_list) ) > - { > - pirq_guest_unbind(d, pirq); > - msixtbl_pt_unregister(d, pirq); > - if ( pt_irq_need_timer(pirq_dpci->flags) ) > - kill_timer(&pirq_dpci->timer); > - pirq_dpci->flags = 0; > - /* > - * See comment in pt_irq_create_bind's PT_IRQ_TYPE_MSI before the > - * call to pt_pirq_softirq_reset. > - */ > - pt_pirq_softirq_reset(pirq_dpci); > > - pirq_cleanup_check(pirq, d); > + pt_irq_destroy_bind_common(d, pirq); > } > + else > + pt_irq_destroy_bind_msi(d, pirq); > > spin_unlock(&d->event_lock); > > diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c > index 93c20b9..4f2be02 100644 > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -1514,6 +1514,68 @@ static int assign_device(struct domain *d, u16 seg, u8 > bus, u8 devfn, u32 flag) > return rc; > } > > +/* > + * Unmap established mappings between domain's pirq and device's MSI. > + * These mappings were set up by qemu/guest and are expected to be > + * destroyed when changing the device's ownership. > + */ > +static void pci_unmap_msi(struct pci_dev *pdev) > +{ > + struct msi_desc *entry, *tmp; > + struct domain *d = pdev->domain; > + > + ASSERT(pcidevs_locked()); > + ASSERT(d); > + > + spin_lock(&d->event_lock); > + list_for_each_entry_safe(entry, tmp, &pdev->msi_list, list) > + { > + struct pirq *info; > + int ret, pirq = 0; > + unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX > + ? entry->msi.nvec : 1; I think you should mask the entry, like it's done in pt_irq_destroy_bind, see the call to guest_mask_msi_irq. That gives a consistent state between bind and unbind. > + > + while ( nr-- ) > + { > + struct hvm_pirq_dpci *pirq_dpci; Nit: you could reduce the scope of info by declaring it here AFAICT. > + > + pirq = domain_irq_to_pirq(d, entry[nr].irq); > + WARN_ON(pirq < 0); > + if ( pirq <= 0 ) > + continue; > + > + info = pirq_info(d, pirq); > + if ( !info ) > + continue; > + > + pirq_dpci = pirq_dpci(info); > + if ( pirq_dpci && > + (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) && > + (pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI) ) > + pt_irq_destroy_bind_msi(d, info); > + } > + > + if ( pirq > 0 ) > + { > + /* > + * The pirq is derived from an entry in msi_list rather than an > + * arbitrary value passed down. There should be a irq (msi) > mapped > + * to this pirq. In this case, unmapping this pirq should > succeed. > + * Otherwise, something goes wrong. > + */ > + ret = unmap_domain_pirq(d, pirq); > + ASSERT(!ret); unmap_domain_pirq can fail, why not make pci_unmap_msi return an int and propagate the error to the caller? deassign_device returning an error should also be fine. > + } > + } > + /* > + * All pirq-s should have been unmapped and corresponding msi_desc > + * entries should have been removed in the above loop. > + */ > + ASSERT(list_empty(&pdev->msi_list)); > + > + spin_unlock(&d->event_lock); > +} > + > /* caller should hold the pcidevs_lock */ > int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn) > { > @@ -1529,6 +1591,8 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, > u8 devfn) > if ( !pdev ) > return -ENODEV; > > + pci_unmap_msi(pdev); Just want to make sure, since deassign_device will be called for both PV and HVM domains. AFAICT pci_unmap_msi is safe to call when the device is assigned to a PV guest, but would like your confirmation. Thanks, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |