[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V13 3/5] libxl: add pvusb API



On Tue, Jan 19, 2016 at 3:48 PM, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> wrote:
> Chunyan Liu writes ("[PATCH V13 3/5] libxl: add pvusb API"):
>> +static int usbback_dev_unassign(libxl__gc *gc, const char *busid)
>> +{
> ...
>> +    /* Till here, USB device has been unbound from USBBACK and
>> +     * removed from xenstore, usb list couldn't show it anymore,
>> +     * so no matter removimg driver path successfully or not,
>> +     * we will report operation success.
>> +     */
>
> I'm still unconvinced by this and this may mean that the code in this
> function is in the wrong order.  Earlier we had this exchange:
>
>> > Ought this function to really report success if these calls fail ?
>>
>> I think so. Till here, the USB device has already been unbound from
>> usbback and removed from xenstore. usb-list cannot list it any more.
>
> The problem is that I think that if this function fails, it can leave
>  - debris in xenstore (the usbback path)
>  - the interface bound to the wrong driver
> And then there is no way for the user to get libxl to re-attempt the
> operation, or clean up.  Am I right ?
>
> One way to avoid this kind of problem is to deal with the xenstore
> path last.  That way the device will still appear as attached to the
> domain.
>
> To do this properly I think bind_usbintf may need to become idempotent
> even in the face of other callers racing to assign the device.  I
> think that would mean bind_usbif it would have to know what driver to
> expect to find the device bound to.
>
> George, am I right here ?

There are effectively four states a device can be in, from the
'assignment' point of view:

1. Assigned to the normal Linux device driver(s) for the USB device

2. Assigned to no driver

3. Assigned to usbback, but not yet assigned to any guest

4. Assigned to a guest

Regardless of the state, the USB device is visible to lsusb, and
always exists in the usb controller's device heirarchy, regardless of
what driver it's assigned to (if any).

As far as sysfs and lspci goes, pci is an exact analog.

For usb, libxl really only has two states: assigned (state 4) and
unassigned (states 1 or 2).  libxl__device_usbdev_add() will take
devices from states 1 or 2 and move them into state 3, then state 4.
libxl__device_usbdev_remove() will take devices from state 4, through
state 3, and then to state 2 (and then possibly state 1, if it has the
appropriate information).  usbdev_list will only list things in state
4.

Additionally, each USB "device" has one or more "interfaces".  To
assign a "device" to usbback in the sysfs case means assigning all of
the "interfaces".  The code seems to assume that different interfaces
from the same device can be assigned to different drivers.

For comparison, the libxl pci pass-through code has three states: not
assignable (states 1 or 2), assignable (state 3), and assigned (state
4).  libxl__device_pci_assignable_add moves it from 1 or 2 to state 3;
libxl_device_pci_add moves it from state 3 to 4 (unless the "seize"
flag is set, in which case it will call
libxl__device_pci_assignable_add if necessary).
libxl_device_pci_assignable_list lists things in state 3;
libxl_device_pci_list list things in state 4.

pci devices also have analogs called "functions"; but libxl treats
each of these separately (i.e., you actually assign a "function" to a
VM, not a "device").  If a device has several functions which all need
to be assigned, it's up to the user to assign each one individually.

So for removal, we have to consider what to do with a failure at each
of the steps.

* 4 -> 3.  This is just removing it from xenstore.  The xenstore
removal is transactional, so (I think) a failure here should mean that
it's still in state 4; it should still show up in usb_list, and the
user could try removing again if she wants.

* 3 -> 2. At this point, what the code actually does is try to unbind
each interface from usbback; if any of the calls to unbind_usbintf()
fail, it will give up (leaving any remaining functions bound to
usbback).

* 2 -> 1.  If we stored a driver path in xenstore (under the /libxl
node, not under the pvusb node), we then try to rebind each interface
to that driver.  If this binding fails, we simply print a warning and
continue (i.e., we attempt to re-bind with all interfaces, even if one
of them fails).

Regarding Ian's comments:

Since "assigned to the guest" and "listed under the pvusb xenstore
node" are the same thing, is it even *possible* to (safely) unassign
the device from usbback before removing the xenstore nodes?

It's not clear to me under what conditions 3->2 might fail, or what
could be done about it if it did fail.  Perhaps removing the usbback
module?  Failing that, rebooting the system is the only recovery
option I can think of.  Certainly trying to assign it back to the
guest (i.e., move it back to the previous legal state of '4') isn't a
good idea.

I do agree, however, that stopping in the middle here (leaving some
interfaces assigned to usbback and some potentially already
re-assigned to the original driver) is a bad; as is leaving the original
driver for the interface in xenstore.

Regarding 2->1, again it's not clear that there's anything libxl can
do.  Reloading the driver's module might reset it enough to pick up
the (now unassigned) USB devices again; other than that, rebooting is
probably the best option.

It's also not clear to me, if some functions succeed in being
reassigned and others fail, whether it's best to just try to assign as
many as we can, or better to go back and un-assign all the ones that
succeeded.

Perhaps the best approach code-wise is to change the "goto out" on
failure of unbind_usbintf() into a "continue".  That way:

1. All interfaces which can be re-assigned are re-assigned (and work
as much as possible)

2. All interfaces which can be unbound but not re-assigned are at
least unbound (so that reloading the original driver might pick them
up)

3. The xenstore paths storing the original driver information are
cleaned up.

And we should of course include some useful error messages to give the
user a hint as to what they can do to fix things; perhaps:

"ERROR: Unexpected failure, device now in wedged state (partially
assigned to usbback), don't know how to recover.  Try unloading and
reloading the usbback module, or rebooting the system."

"ERROR: Unexpected failure re-assigning device, don't know how to
recover.  Try unloading and reloading $MODULE, or rebooting the
system."

Thoughts?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.