Re: [Xen-devel] [PATCH 1/2] libxl: Wait until QEMU removed the device before tearing it down
Friday, November 14, 2014, 10:09:04 PM, you wrote:

> On 11/14/2014 02:24 PM, Sander Eikelenboom wrote:
>> Friday, November 14, 2014, 7:07:46 PM, you wrote:
>>
>>> Sander Eikelenboom writes ("Re: [Xen-devel] [PATCH 1/2] libxl: Wait until
>>> QEMU removed the device before tearing it down"):
>>>> 1) xc_physdev_unmap_pirq does get called when destroying a HVM guest.
>>> Yes, but I think that is so only because of the block structure bug
>>> introduced in abfb006f.
>> OK, if this code is only for PV guests, this problem could also still be
>> there for PV guests then (they can also enable MSI AFAIK).
>>
>> What i saw was that when a guest enables MSI for a device, the value of
>> /sys/bus/pci/devices/<BDF>/irq in dom0 changes from the intx to the msi irq
>> value. This occurs after libxl has read the irq and has called:
>>
>> When shutting down the guest disables MSI for that device and the value of
>> /sys/bus/pci/devices/<BDF>/irq in dom0 is changed back to the intx irq before
>> libxl reads it again and calls:
>>
>> So there is no error.
>> However in the destroy case the guest doesn't disable MSI, so the value of
>> /sys/bus/pci/devices/<BDF>/irq in dom0 isn't changed, so libxl tries to
>> operate on the wrong value.
>>
>> I don't know if it is a kernel bug that /sys/bus/pci/devices/<BDF>/irq
>> changes when MSI are enabled, or that libxl shouldn't depend on that value.

> I am unable to reproduce irq value change. But I can see that if it
> changes you will see libxl complain since we'd be trying to unmap an irq
> that has not been mapped.

> What device are you passing through?

Will look at it again this weekend and report back the details of device,
used irq's, kernel version etc, so you can hopefully reproduce.

>>
>>>> 2) When setting and updating the device states in xenstore for
>>>> a[...]
>>> The rest of your message is difficult to read on my screen due to wrap
>>> damage. Can you wrap it to <75 characters ?
>> Hmm it could do with some restructuring and rephrasing as well, sorry for
>> that, hope it's a bit clearer now (combined note 2 and 3):
>>
>> 2) When setting and updating the device states in xenstore for
>> a passed through device, libxl doesn't mimic the states that
>> pciback and pcifront set and use in their "dance", very well.
>> (see f.e.
>> http://lists.xen.org/archives/html/xen-devel/2014-08/msg00970.html )
>>
>> When a HVM guest has multiple pci devices passed through,
>> pciback doesn't currently get a signal *what* pci-device has
>> been removed from the guest on a 'xl pci-detach'.
>>
>> Pciback currently only has a watch on the whole "pci-root"
>> of a guest in xenstore, so it only knows *something* has changed:
>> - in or under the "pci-root" entry
>> - In the PV guest case, the xenbus-state of an individual pci device
>>   node is set to a different state by pcifront, so pciback can take
>>   action on cleaning up / resetting this particular physical device.
>> - In the HVM guest case however, the xenbus-states don't mimic
>>   what pcifront does, so pciback doesn't do all the clean up and
>>   resetting on the physical device when doing a "xl pci-detach".
>>   It *does* cleanup when *all* devices and therefore the complete
>>   "pci-root" entry for a device is yanked out of xenstore. That's why
>>   you don't get into trouble when you shutdown a guest with multiple
>>   devices passed through. You also don't get into trouble when a
>>   single device is passed through and removed (the "pci-root" entry in
>>   xenstore is removed when the last pci device of a guest is removed
>>   from the guest).
>>   But you *do* get into trouble when hot-unplugging
>>   any but the last device from a guest which has multiple pci devices
>>   passed through to it. You also won't notice it while it is still
>>   owned by pciback. You do notice it when you try to do a
>>   "xl pci-assignable-remove".
> I don't know about detach but I apparently can't even properly attach a
> second device --- I get complaints about it already being in xenstore.
> But device does show up in the guest.

> And then I can't remove it because, well, it's not in xenstore.

> Maybe it has something to do with the fact that both devices are virtual
> functions of the same physical device. I might look at this sometime
> next week.

Ah .. virtual function as in virtual functions of a SR-IOV NIC ?
Or the functions of a multi-function device (the latter ends up as separate
devices in a HVM guest (assuming you use qemu-xen and not qemu-traditional)).

Ok, will keep an eye on the mailing list to see if i can test or help in any
other way then !

Thx !

Sander

> -boris

>>
>> I tried to fix this myself .. but ran into trouble because when you
>> signal pciback via xenstore of the intent of removing a device from the
>> guest, you need a callback but you can also run into "timeout" issues.
>> Another issue was that on shutdown you will remove multiple devices in
>> short succession and i ran into locking/race issues, so it clearly
>> became something beyond my skills at that point.
>>
>>> Thanks,
>>> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
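
For anyone trying to follow the irq part of this thread, here is a minimal
sketch of the sequence described in point 1, under stated assumptions. It
does not call libxl or Xen at all: a temporary file stands in for
/sys/bus/pci/devices/<BDF>/irq, the irq numbers are made up, and the helper
names (read_irq, teardown_irq, simulate) are hypothetical. It only shows why
re-reading the value at teardown matches in the shutdown case but not in the
destroy case, if the value really does change when MSI is enabled.

```python
import os
import tempfile


def write_irq(path, irq):
    """Overwrite the stand-in irq file with a decimal irq number."""
    with open(path, "w") as f:
        f.write(f"{irq}\n")


def read_irq(path):
    """Read a single decimal irq number from the stand-in file."""
    with open(path) as f:
        return int(f.read().strip())


def teardown_irq(path, mapped_irqs):
    """Re-read the irq file and try to unmap whatever value it holds now.

    Returns True if that value is one that was mapped at attach time,
    False if it was never mapped (the case where a complaint would show up).
    """
    irq = read_irq(path)
    if irq in mapped_irqs:
        mapped_irqs.remove(irq)
        return True
    return False


def simulate(guest_disables_msi):
    """Replay attach, MSI enable, and teardown against a temporary file."""
    intx_irq, msi_irq = 16, 95  # made-up numbers, for illustration only
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "irq")
        write_irq(path, intx_irq)
        mapped = {read_irq(path)}      # attach: value read and mapped
        write_irq(path, msi_irq)       # guest enables MSI, value changes
        if guest_disables_msi:
            write_irq(path, intx_irq)  # clean shutdown: value changes back
        return teardown_irq(path, mapped)


if __name__ == "__main__":
    print("shutdown case matches:", simulate(guest_disables_msi=True))
    print("destroy case matches:", simulate(guest_disables_msi=False))
```

Remembering the value read at attach time instead of re-reading it would
sidestep the mismatch in this toy model. Whether that is the right fix for
libxl, or whether the value should not change in the first place, is exactly
the open question above.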