[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v6 1/3] xen/pt: fix some pass-thru devices don't work across reboot
Thanks for the patch! On Fri, Jan 25, 2019 at 04:26:59PM +0800, Chao Gao wrote: > I find some pass-thru devices don't work any more across guest > reboot. Assigning it to another domain also meets the same issue. And > the only way to make it work again is un-binding and binding it to > pciback. Someone reported this issue one year ago [1]. > > If the device's driver doesn't disable MSI-X during shutdown or qemu is > killed/crashed before the domain shutdown, this domain's pirq won't be > unmapped. Then xen takes over this work, unmapping all pirq-s, when > destroying guest. But as pciback has already disabled meory decoding before > xen unmapping pirq, Xen has to sets the host_maskall flag and maskall bit ^ set > to mask a MSI rather than sets maskbit in MSI-x table. The call trace of ^ setting > this process is: > > ->arch_domain_destroy > ->free_domain_pirqs > ->unmap_domain_pirq (if pirq isn't unmapped by qemu) > ->pirq_guest_force_unbind > ->__pirq_guest_unbind > ->mask_msi_irq(=desc->handler->disable()) > ->the warning in msi_set_mask_bit() > > The host_maskall bit will prevent guests from clearing the maskall bit > even the device is assigned to another guest later. Then guests cannot > receive MSIs from this device. > > To fix this issue, a pirq is unmapped before memory decoding is disabled by > pciback. Specifically, when a device is detached from a guest, all established > mappings between pirq and msi are destroying before changing the ownership. ^ destroyed > > With this behavior, qemu and pciback are not aware of the forcibly unbindng ^ unbinding > and unmapping done by Xen. As a result, the state of pirq maintained by Xen > and > pciback/qemu becomes inconsistent. Particularly for hot-plug/hot-unplug case, > guests stay alive; such inconsistency may cause other issues. To resolve > this inconsistency and keep compatibility with current qemu and pciback, > two flags, force_unmapped and force_unbound are used to denote that a pirq is > forcibly unmapped or unbound. The flags are set when Xen unbinds or unmaps the > pirq behind qemu and pciback. And subsequent unbinding or unmapping requests > from qemu/pciback can clear these flags and free the pirq. What happens then if qemu/pciback doesn't unbind and/or unmap the pirqs, they would be left in a weird state that would prevent further mapping or binding? I think this is getting quite convoluted, and would like to make sure this is necessary. Last version triggered some error messages in Linux due to the unbind/unmap being performed by the hypervisor, but those where harmless? I've also suggested to return ESRCH in unmap_domain_pirq when the pirq is no longer mapped which would make Linux print a less scary message. Also Jan had some questions about where are the unbind/unmap requests actually coming from, he suggested to fix that in the toolstack if possible [0], have you checked whether this is possible? Thanks, Roger. [0] https://lists.xenproject.org/archives/html/xen-devel/2019-01/msg01753.html _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |