[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 06/13] libxl: convert libxl_device_disk_add to an async op



Roger Pau Monne writes ("[PATCH v6 06/13] libxl: convert libxl_device_disk_add 
to an async op"):
> This patch converts libxl_device_disk_add to an ao operation that
> waits for device backend to reach state XenbusStateInitWait and then
> marks the operation as completed. This is not really useful now, but
> will be used by latter patches that will launch hotplug scripts after
                  ^^^^^^
later

> we reached the desired xenbus state.

Thanks, here are my review comments:


> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 85c21b4..937ab08 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
...
> +/* AO operation to connect a disk device, called by
> + * libxl_device_disk_add and libxl__add_disks. This function calls
> + * libxl__initiate_device_connection to wait for the device to
> + * finish the connection (might involve executing hotplug scripts).
> + */
> +_hidden void libxl__device_disk_add(libxl__egc *egc, uint32_t domid,
> +                                    xs_transaction_t t,
> +                                    libxl_device_disk *disk,
> +                                    libxl__ao_device *aodev);
> +

This is a confusing doc comment.  The reader doesn't want to know how
the function is implemented; we need to know what it does.
Particularly, we need to know what happens when it completes.  I infer
that it calls aodev->callback but this should be stated.

> +/* Waits for the passed device to reach state XenbusStateInitWait.
> + * This is not really useful by itself, but is important when executing
> + * hotplug scripts, since we need to be sure the device is in the correct
> + * state before executing them.
> + *
> + * Once finished, aodev->callback will be executed.
> + */
> +_hidden void libxl__initiate_device_connection(libxl__egc*,
> +                                               libxl__ao_device *aodev);

This function's name and its functionality seem not to correspond.  It
doesn't actually initiate anything, as far as I can tell.

>  /* First layer; wraps libxl__spawn_spawn. */
> @@ -2033,6 +2068,8 @@ typedef struct {
>      libxl__domain_build_state dm_state;
>      libxl__dm_spawn_state pvqemu;
>      libxl__destroy_domid_state dis;
> +    /* used to store the state of devices being connected */
> +    libxl__ao_devices aodevs;
>  } libxl__stub_dm_spawn_state;

I'm not sure how valuable these comments are.  libxl__ao_devices are
always used to store the state of devices being "<something>'d".
That's what a libxl__ao_devices is for.  And in the context of a spawn
or create that would obviously be "connected".

In general I'm a fan of comments which say things which aren't clear
from the code, but I'm not keen on ones which restate what the code
says.


> diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
> index 9243806..5d34ed5 100644
> --- a/tools/libxl/libxl_device.c
> +++ b/tools/libxl/libxl_device.c
> @@ -442,6 +442,45 @@ void libxl__ao_devices_callback(libxl__egc *egc, 
> libxl__ao_device *aodev)
>      return;
>  }
> 
> +/******************************************************************************/
> +
> +/* Macro for defining the functions that will add a bunch of disks when
> + * inside an async op.
> + *
> + * This macro is added to prevent repetition of code.
> + */
> +
> +/* Capatibility macro, since disk_add now takes a xs transaction parameter */
> +#define libxl__device_disk_add(egc, domid, disk, aodev)                      
>  \
> +        libxl__device_disk_add(egc, domid, XBT_NULL, disk, aodev)

Is it really safe to call libxl__device_disk_add without a real
transaction ?  I see that the current code does it but it seems wrong
to me.  We end up writing all of the individual settings in the
backend one at a time.

I think this applies to all the other device types too.  So I think
the right answer would be to make them take a transaction parameter
too.  Ie, insert a patch before this one which adds a transaction
parameter (ignored for now and always passed 0 if you don't want to
actually make it work properly) to libxl__add_nic etc.  Then you don't
need this unpleasant macro.

> +void libxl__initiate_device_connection(libxl__egc *egc, libxl__ao_device 
> *aodev)
> +{
> +    STATE_AO_GC(aodev->ao);
> +    char *be_path = libxl__device_backend_path(gc, aodev->dev);
> +    char *state_path = libxl__sprintf(gc, "%s/state", be_path);
> +    int rc = 0;
...
> +out_fail:
> +    assert(rc);
> +    aodev->rc = rc;
> +    device_xsentries_remove(egc, aodev);
> +    return;
> +}

Firstly, I'm not convinced it's really proper for
libxl__initiate_device_connection to call device_xsentries_remove.
After all device_xsentries_remove is part of the implementation of
libxl__initiate_device_remove.

And, secondly, does device_xsentries_remove really do what you want ?
It has a test in it which makes it only do its work if action is
DISCONNECT.


> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 43affd1..5f09740 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c


> @@ -2027,15 +2046,17 @@ static char * libxl__alloc_vdev(libxl__gc *gc, const 
> char *blkdev_start,

> +                dls->t = xs_transaction_start(ctx->xsh);
> +                if (dls->t == XBT_NULL) {
> +                    LOG(ERROR, "failed to start a xenstore transaction");
> +                    rc = ERROR_FAIL;
> +                    goto out;
> +                }
> +                disk->vdev = libxl__alloc_vdev(gc, blkdev_start, dls->t);
...
> +                libxl__device_disk_add(egc, LIBXL_TOOLSTACK_DOMID,
> +                                       dls->t, disk, &dls->aodev);
...
> +    xs_ret = xs_transaction_end(ctx->xsh, dls->t, 0);
> +    if (xs_ret == 0 && errno == EAGAIN) {
> +        libxl__device_disk_local_attach(egc, dls);
> +        return;

Isn't the effect of this that if the xs transaction gets a conflict,
we'll rerun the hotplug scripts, etc. ?  I think I may be confused
here, but I don't understand how this transaction loop is supposed to
work and how it is supposed to relate to the interaction with other
domains, scripts, etc.

Indeed it seems to me like your arrangements in libxl__device_disk_add
couldn't work if a non-null t was supplied.  libxl__device_disk_add
would do all the writes in the transaction, but not commit it, so that
none of them are visible to anyone, and then wait for the deivde state
to change, which will never happen.


>  static void libxl__device_disk_from_xs_be(libxl__gc *gc,
> @@ -1982,11 +1999,13 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t 
> domid, libxl_device_disk *disk)
>      ret = 0;
> 
>      libxl_device_disk_remove(ctx, domid, disks + i, 0);
> -    libxl_device_disk_add(ctx, domid, disk);
> +    /* fixme-ao */
> +    libxl_device_disk_add(ctx, domid, disk, 0);
>      stubdomid = libxl_get_stubdom_id(ctx, domid);
>      if (stubdomid) {
>          libxl_device_disk_remove(ctx, stubdomid, disks + i, 0);
> -        libxl_device_disk_add(ctx, stubdomid, disk);
> +        /* fixme-ao */
> +        libxl_device_disk_add(ctx, stubdomid, disk, 0);

I think this code is a symptom of a problem elsewhere.  When adding a
disk to a domain, it should not be necessary to explicitly ask to add
it to the stubdom too.  But this is not your fault, or your problem
right now.


Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.