
Re: [Xen-devel] [PATCH v2 05/10] libxl: remove the Qemu bodge for driver domain devices

On 15/11/13 18:09, Ian Jackson wrote:
> Roger Pau Monne writes ("[PATCH v2 05/10] libxl: remove the Qemu bodge for 
> driver domain devices"):
>> When Qemu is launched from a driver domain to act as a PV disk
>> backend we can make sure that Qemu is running before detaching
>> devices, so there's no need for the bodge there.
> I'm confused.  I don't see why this change is safe.
> You say "we can make sure", but how ?  Do we actually make sure ?
> What part of the code is "we" ?

What I was trying to say here is that on a normal domain destruction on
Dom0, libxl first signals Qemu to exit and then removes the devices, so
when libxl starts device removal we cannot be sure whether Qemu is still
running or not (that's why the bodge is there, to give Qemu enough time
to finish its pending work and exit).

On the other hand, when running Qemu (Qdisk) on a driver domain, libxl
removes the devices first, and then signals Qemu to exit, so Qemu is (or
should be) always running during the device removal phase.
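
To make the ordering argument concrete, here is a toy model of the two
teardown sequences. It is only a sketch, not libxl code: the enum and the
function name are invented for illustration, and it assumes that the only
thing that matters is whether Qemu was asked to exit before device removal
starts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the two teardown steps; not libxl identifiers. */
enum toy_step { TOY_SIGNAL_QEMU_EXIT, TOY_REMOVE_DEVICES };

/*
 * Return 1 if Qemu has not yet been asked to exit when device removal
 * starts, 0 otherwise (including when the sequence never removes devices).
 */
static int toy_qemu_running_at_removal(const enum toy_step *seq, size_t n)
{
    assert(seq != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (seq[i] == TOY_SIGNAL_QEMU_EXIT)
            return 0;   /* exit was requested before removal began */
        if (seq[i] == TOY_REMOVE_DEVICES)
            return 1;   /* removal starts with Qemu still untouched */
    }
    return 0;
}
```

With the Dom0 order (signal, then remove) the function returns 0, which is
the case the bodge exists for; with the driver domain order (remove, then
signal) it returns 1.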

>> @@ -879,17 +886,43 @@ static void device_qemu_timeout(libxl__egc *egc, 
>> libxl__ev_time *ev,
> ...
> And I don't understand how this next change relates to the above:
>> -    rc = libxl__xs_write_checked(gc, XBT_NULL, state_path, "6");
>> -    if (rc) goto out;
>> +    for (;;) {
>> +        rc = libxl__xs_transaction_start(gc, &t);
>> +        if (rc) {
>> +            LOG(ERROR, "unable to start transaction");
>> +            goto out;
>> +        }
>> +
>> +        /*
>> +         * Check that the state path exists and is actually different than
>> +         * 6 before unconditionally setting it. If Qemu runs on a driver
>> +         * domain it is possible that the driver domain has already cleaned
>> +         * the backend path if the device has reached state 6.
>> +         */
>> +        rc = libxl__xs_read_checked(gc, XBT_NULL, state_path, &xs_state);
>> +        if (rc) goto out;
> This is on the "we hope qemu is dying and that this 2.0s wait is
> enough" path from the diagram in libxl_internal.h (near line 1989) ?


> By "the driver domain" you mean "the driver domain's libxl device
> backend daemon" ?


> So AFAICT this is an unrelated bugfix, relating to the fact that if
> the state is set to 6 the backend daemon will remove the backend path,
> and setting it back to 6 is wrong.

Yes, but this was not possible before this series: libxl could set
the backend path state to 6 unconditionally, because Qemu doesn't remove
the backend path on exit, so it's arguable whether this was a bug before
this series.

This is not true any more, because we have the libxl driver backend
daemon that cleans the backend path on device removal, so setting state
to 6 from Dom0 after the libxl driver domain daemon has already removed
it is wrong.
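
To illustrate the check, here is a minimal self-contained sketch. It does
not use the libxl or xenstore API: toy_node and toy_force_closed are
made-up names standing in for the backend state node and for the check the
patch performs inside a transaction. The transaction retry loop from the
patch is deliberately left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for one xenstore node; not the libxl or xenstore API. */
struct toy_node {
    int exists;        /* 0 once the driver domain has removed the path */
    char value[8];     /* backend state as a decimal string, e.g. "4" */
};

enum toy_close_result { TOY_ALREADY_GONE, TOY_ALREADY_CLOSED, TOY_SET_CLOSED };

/*
 * Write state "6" only if the node still exists and is not already "6".
 * In libxl the read and the write would sit inside one xenstore
 * transaction, retried on conflict; that loop is omitted here.
 */
static enum toy_close_result toy_force_closed(struct toy_node *node)
{
    assert(node != NULL);

    if (!node->exists)
        return TOY_ALREADY_GONE;     /* do not recreate a removed path */
    if (strcmp(node->value, "6") == 0)
        return TOY_ALREADY_CLOSED;   /* nothing to do */

    snprintf(node->value, sizeof(node->value), "%s", "6");
    return TOY_SET_CLOSED;
}
```

The important case is the first one: if the driver domain daemon has
already removed the path, the function leaves it alone instead of writing
a fresh "6" and recreating the node.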

> And in this case we aren't going to run the hotplug script because
> this only happens in the toolstack domain's code when there is a
> driver domain ?  Perhaps a more explicit check would be clearer.

No, libxl on Dom0 is not going to run the hotplug scripts, because this
backend is handled by another domain.
