[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2] xen/pt: fix some pass-thru devices don't work across reboot



On Fri, Nov 16, 2018 at 03:53:50PM +0800, Chao Gao wrote:
> On Thu, Nov 15, 2018 at 11:40:39AM +0100, Roger Pau Monné wrote:
> >On Thu, Nov 15, 2018 at 09:10:26AM +0800, Chao Gao wrote:
> >> I find some pass-thru devices don't work any more across guest
> >> reboot. Assigning it to another domain also meets the same issue. And
> >> the only way to make it work again is un-binding and binding it to
> >> pciback. Someone reported this issue one year ago [1].
> >
> >How does unbinding and binding to pciback fix the issue? Is this due
> >to Dom0 using some PV or Dom0 only hypercall that can reset the
> >host_maskall state?
> 
> I think it is msix_capability_init() that clear host_maskall. And this
> function is called by PHYSDEVOP_prepare_msix sub-hypercall.
> 
> >
> >> 
> >> If the device's driver doesn't disable MSI-X during shutdown or qemu is
> >> killed/crashed before the domain shutdown, this domain's pirq won't be
> >> unmapped. Then xen will unmap all pirq. But pciback has already disabled
> >> meory decoding before xen unmapping pirq. Then when Xen is disabling a
> >> MSI of the device, it has to sets the host_maskall flag and maskall bit
> >> to mask a MSI rather than sets maskbit in MSI-x table. The call trace of
> >> this process is:
> >> ->arch_domain_destroy
> >>     ->free_domain_pirqs
> >>         ->unmap_domain_pirq (if pirq isn't unmap by qemu)
> >>             ->pirq_guest_force_unbind
> >>                 ->__pirq_guest_unbind
> >>                     ->mask_msi_irq(=desc->handler->disable())
> >>                         ->the warning in msi_set_mask_bit()
> >> 
> >> The host_maskall bit will prevent guests from clearing the maskall bit
> >> even the device is assigned to another guest later. Guests cannot
> >> receive interrupts from this device.
> >>
> >> 
> >> To fix this, host_maskall flag is cleared when all MSIs of a device are 
> >> freed.
> >> It is definitely safely to clear it because no msi is actually set up
> >> for this device. Also, 'msix->warned' is initialized to DOMID_INVALID
> >> rather than 0 to avoid warnings missing for Dom0.
> >> 
> >> [1]: 
> >> https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg02520.html
> >> 
> >> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
> >> ---
> >>  xen/arch/x86/msi.c            | 18 ++++++++++++++++++
> >>  xen/drivers/passthrough/pci.c |  1 +
> >>  2 files changed, 19 insertions(+)
> >> 
> >> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> >> index 5567990..cd570bc 100644
> >> --- a/xen/arch/x86/msi.c
> >> +++ b/xen/arch/x86/msi.c
> >> @@ -637,6 +637,7 @@ int msi_free_irq(struct msi_desc *entry)
> >>  {
> >>      unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX
> >>                        ? entry->msi.nvec : 1;
> >> +    struct pci_dev *pdev = entry->dev;
> >>  
> >>      while ( nr-- )
> >>      {
> >> @@ -654,6 +655,23 @@ int msi_free_irq(struct msi_desc *entry)
> >>  
> >>      list_del(&entry->list);
> >>      xfree(entry);
> >> +
> >> +    /*
> >> +     * In some cases, the 'host_maskall' is set for safety. Here clear
> >> +     * 'host_maskall' if all MSIs are freed for a msi-x capable device.
> >> +     * Without it, the device's msix keeps disabled even been reassigned
> >
> >"... even after being reassigned ..."
> >
> >I think it's clearer.
> >
> >> +     * to another domain.
> >> +     */
> >> +    if ( pdev && list_empty(&pdev->msi_list) && pdev->msix )
> >> +    {
> >> +        if ( pdev->msix->host_maskall )
> >> +            printk(XENLOG_G_WARNING
> >> +                   "Resetting msix status of %04x:%02x:%02x.%u\n",
> >> +                   pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> >> +                   PCI_FUNC(pdev->devfn));
> >> +        pdev->msix->host_maskall = false;
> >> +        pdev->msix->warned = DOMID_INVALID;

AFAICT a guest could trigger this message multiple times by forcing a
PIRQ map/unmap of all the vectors in MSIX, thus likely flooding the
console since this is not rate limited. Since I think a guest can
manage to reach this code path while running, clearing warned is not
correct.

Also, if a guest can manage to trigger this path during it's runtime,
could it also hit the issue of getting host_maskall set and not being
able to clear it?

> >> +    }
> >>      return 0;
> >
> >IMO this looks quite similar to a msi_reset_state function or at least
> >the start of it.
> >
> >Maybe it should be in a separate helper that sets all the fields to
> >their initial values,
> 
> Will do.
> 
> >with the expectation that Dom0 will perform the
> >device reset?
> 
> Dom0 resets devices when (de-)assignment by echo 1 to
> /sys/bus/pci/devices/<sbdf>/reset.
> 
> >
> >The underlying problem here AFAICT is that the Xen internal device
> >state is not the same between device assignments.
> 
> Yes, it is accurate.
> 
> >
> >In any case there should be at least a note here pointing out that Xen
> >expects the hardware domain to perform a device reset, so the Xen
> >internal state actually matches the device state before trying to
> >assign the device to another guest.
> 
> Sounds good. This issue is that Xen tries to mask msi (when unmapping pirq)
> after memory decoding is disabled by pciback. If pciback can unmap all the
> pirq-s related a given device before disabling memory decoding, Xen won't meet
> this issue. Currently, pciback doesn't maintain the pirq information; it
> isn't able to do this.

I would like to hear Jan's opinion on this, but I think it might be
helpful to introduce a new hypercall Dom0 (ie: toolstack) can use to
signal Xen a PCI device has been reset, so that Xen can safely reset
the device state to the initial one. This would be simpler if Xen was
the one performing the device reset.

Once a device has been assigned it would become 'dirty' and would
require such reset before it could be assigned again to a domain.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.