
Re: [Xen-devel] [PATCH 1/2] libxl: Wait until QEMU removed the device before tearing it down

Friday, November 14, 2014, 7:07:46 PM, you wrote:

> Sander Eikelenboom writes ("Re: [Xen-devel] [PATCH 1/2] libxl: Wait until 
> QEMU removed the device before tearing it down"):
>> 1) xc_physdev_unmap_pirq does get called when destroying a HVM guest.

> Yes, but I think that is so only because of the block structure bug
> introduced in abfb006f.

OK, if this code is only run for PV guests, then this problem could still be
there for PV guests as well (they can also enable MSI AFAIK).

What I saw was that when a guest enables MSI for a device, the value of
/sys/bus/pci/devices/<BDF>/irq in dom0 changes from the INTx IRQ to the MSI
IRQ value. This occurs after libxl has already read the IRQ and made its
mapping call.

When shutting down, the guest disables MSI for that device, and the value of
/sys/bus/pci/devices/<BDF>/irq in dom0 changes back to the INTx IRQ before
libxl reads it again and calls xc_physdev_unmap_pirq. So there is no error
there.
However in the destroy case the guest doesn't disable MSI, so the value of
/sys/bus/pci/devices/<BDF>/irq in dom0 isn't changed, and libxl ends up
operating on the wrong value.

I don't know whether it is a kernel bug that /sys/bus/pci/devices/<BDF>/irq
changes when MSI is enabled, or whether libxl simply shouldn't depend on that
value.

>> 2) When setting and updating the device states in xenstore for
>> a[...]

> The rest of your message is difficult to read on my screen due to wrap
> damage.  Can you wrap it to <75 characters ?

Hmm, it could do with some restructuring and rephrasing as well, sorry for
that; hope it's a bit clearer now (combined notes 2 and 3):

2) When setting and updating the device states in xenstore for
   a passed-through device, libxl doesn't mimic very well the states
   that pciback and pcifront set and use in their "dance".
   (See e.g. http://lists.xen.org/archives/html/xen-devel/2014-08/msg00970.html)

   When an HVM guest has multiple pci devices passed through,
   pciback currently doesn't get a signal about *which* pci device
   has been removed from the guest on an 'xl pci-detach'.
   Pciback currently only has a watch on the whole "pci-root"
   of a guest in xenstore, so it only knows *something* has changed
   in or under the "pci-root" entry:
   - In the PV guest case, the xenbus state of an individual pci
     device node is set to a different state by pcifront, so pciback
     can take action on cleaning up / resetting that particular
     physical device.
   - In the HVM guest case however, the xenbus states don't mimic
     what pcifront does, so pciback doesn't do all the cleanup and
     resetting on the physical device when doing an "xl pci-detach".
     It *does* clean up when *all* devices, and therefore the
     complete "pci-root" entry for the guest, are yanked out of
     xenstore. That's why you don't get into trouble when you shut
     down a guest with multiple devices passed through. You also
     don't get into trouble when a single device is passed through
     and removed (the "pci-root" entry in xenstore is removed when
     the last pci device of a guest is removed from the guest).
     But you *do* get into trouble when hot-unplugging any but the
     last device from a guest which has multiple pci devices passed
     through to it. You also won't notice it while the device is
     still owned by pciback; you do notice it when you try to do an
     "xl pci-assignable-remove".

     I tried to fix this myself, but ran into trouble: when you
     signal pciback via xenstore of the intent to remove a device
     from the guest, you need a callback, but you can also run into
     "timeout" issues. Another issue was that on shutdown you remove
     multiple devices in short succession, and I ran into
     locking/race issues, so it clearly became something beyond my
     skills at that point.

> Thanks,
> Ian.
