Re: [Xen-devel] [PATCH v6 1/3] xen/pt: fix some pass-thru devices don't work across reboot
On Fri, Jan 25, 2019 at 10:36:13AM +0100, Roger Pau Monné wrote:
>Thanks for the patch!
>
>On Fri, Jan 25, 2019 at 04:26:59PM +0800, Chao Gao wrote:
>> I find some pass-thru devices don't work any more across guest
>> reboot. Assigning it to another domain also meets the same issue. And
>> the only way to make it work again is un-binding and binding it to
>> pciback. Someone reported this issue one year ago [1].
>>
>> If the device's driver doesn't disable MSI-X during shutdown or qemu is
>> killed/crashed before the domain shutdown, this domain's pirq won't be
>> unmapped. Then xen takes over this work, unmapping all pirq-s, when
>> destroying guest. But as pciback has already disabled meory decoding before
>> xen unmapping pirq, Xen has to sets the host_maskall flag and maskall bit
>                                 ^ set
>> to mask a MSI rather than sets maskbit in MSI-x table. The call trace of
>                            ^ setting
>> this process is:
>>
>> ->arch_domain_destroy
>>   ->free_domain_pirqs
>>     ->unmap_domain_pirq (if pirq isn't unmapped by qemu)
>>       ->pirq_guest_force_unbind
>>         ->__pirq_guest_unbind
>>           ->mask_msi_irq(=desc->handler->disable())
>>             ->the warning in msi_set_mask_bit()
>>
>> The host_maskall bit will prevent guests from clearing the maskall bit
>> even the device is assigned to another guest later. Then guests cannot
>> receive MSIs from this device.
>>
>> To fix this issue, a pirq is unmapped before memory decoding is disabled by
>> pciback. Specifically, when a device is detached from a guest, all established
>> mappings between pirq and msi are destroying before changing the ownership.
>                                    ^ destroyed
>>
>> With this behavior, qemu and pciback are not aware of the forcibly unbindng
>                                                                     ^
>                                                                     unbinding
>> and unmapping done by Xen. As a result, the state of pirq maintained by Xen and
>> pciback/qemu becomes inconsistent. Particularly for hot-plug/hot-unplug case,
>> guests stay alive; such inconsistency may cause other issues. To resolve
>> this inconsistency and keep compatibility with current qemu and pciback,
>> two flags, force_unmapped and force_unbound are used to denote that a pirq is
>> forcibly unmapped or unbound. The flags are set when Xen unbinds or unmaps the
>> pirq behind qemu and pciback. And subsequent unbinding or unmapping requests
>> from qemu/pciback can clear these flags and free the pirq.
>
>What happens then if qemu/pciback doesn't unbind and/or unmap the
>pirqs, they would be left in a weird state that would prevent further
>mapping or binding?

Yes.

>
>I think this is getting quite convoluted, and would like to make sure
>this is necessary. Last version triggered some error messages in Linux
>due to the unbind/unmap being performed by the hypervisor, but those
>where harmless?

Yes. I didn't see anything harmful.

>
>I've also suggested to return ESRCH in unmap_domain_pirq when the pirq
>is no longer mapped which would make Linux print a less scary message.

But, with this version, Qemu/pciback won't complain about anything. The
idea is inspired by the way we handle "force_unbind" in current
unmap_domain_pirq.

Personally, I think this version is slightly better. But because it is
more complicated and involves more changes, I am less confident in this
version.

Thanks
Chao

>
>Also Jan had some questions about where are the unbind/unmap requests
>actually coming from, he suggested to fix that in the toolstack if
>possible [0], have you checked whether this is possible?
>
>Thanks, Roger.
>
>[0] https://lists.xenproject.org/archives/html/xen-devel/2019-01/msg01753.html
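
For readers trying to follow the proposed bookkeeping, here is a minimal,
self-contained sketch of the idea described in the commit message above. It
is not Xen code: every identifier in it (struct pirq_state,
PIRQ_FORCE_UNMAPPED, xen_force_teardown, toolstack_unmap_request) is an
illustrative stand-in, and the real paths (unmap_domain_pirq,
pirq_guest_force_unbind) do far more work. It only models the intended state
transitions: Xen tears the pirq down early, records that fact in per-pirq
flags, and a later unmap request from qemu/pciback finds the flags set,
clears them and frees the pirq instead of returning an error.

/*
 * Illustrative model of the force_unmapped/force_unbound idea described
 * in the commit message above.  This is NOT Xen code; every identifier
 * here is made up for the example.
 */
#include <stdbool.h>
#include <stdio.h>

#define PIRQ_FORCE_UNBOUND   (1u << 0)  /* Xen already unbound the pirq  */
#define PIRQ_FORCE_UNMAPPED  (1u << 1)  /* Xen already unmapped the pirq */

struct pirq_state {
    int pirq;
    unsigned int flags;
    bool mapped;            /* pirq <-> MSI mapping still established? */
};

/*
 * What Xen would do when a device is detached (or the domain destroyed)
 * before qemu/pciback has cleaned up: unbind and unmap right away, while
 * memory decoding is still enabled, but keep the bookkeeping entry and
 * mark it so a later request from qemu/pciback still succeeds.
 */
static void xen_force_teardown(struct pirq_state *p)
{
    if (!p->mapped)
        return;

    p->mapped = false;
    p->flags |= PIRQ_FORCE_UNBOUND | PIRQ_FORCE_UNMAPPED;
    printf("pirq %d forcibly torn down by Xen\n", p->pirq);
}

/*
 * The later unmap request coming from qemu/pciback (a PHYSDEVOP-style
 * hypercall in reality).  Returns 0 on success.
 */
static int toolstack_unmap_request(struct pirq_state *p)
{
    if (p->mapped) {
        /* Normal path: the mapping is still there, tear it down now. */
        p->mapped = false;
        printf("pirq %d unmapped on request\n", p->pirq);
        return 0;
    }

    if (p->flags & (PIRQ_FORCE_UNBOUND | PIRQ_FORCE_UNMAPPED)) {
        /* Xen already did the work; just clear the flags and free the
         * pirq so qemu/pciback sees a quiet, successful unmap. */
        p->flags = 0;
        printf("pirq %d was already torn down, freeing it\n", p->pirq);
        return 0;
    }

    return -1;  /* never mapped: would be an error in the hypervisor */
}

int main(void)
{
    struct pirq_state p = { .pirq = 42, .flags = 0, .mapped = true };

    xen_force_teardown(&p);             /* guest destroyed behind qemu  */
    return toolstack_unmap_request(&p); /* qemu/pciback cleans up later */
}

The pattern is the same one the reply above points to for force_unbind in
the current unmap_domain_pirq: the hypervisor remembers that it already did
the teardown, so the toolstack side's later, unaware cleanup request still
completes without errors.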