[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 16/18] xen: automatically create XenBlockDevice-s



> -----Original Message-----
> From: Kevin Wolf [mailto:kwolf@xxxxxxxxxx]
> Sent: 13 December 2018 11:52
> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Cc: qemu-devel@xxxxxxxxxx; qemu-block@xxxxxxxxxx; xen-
> devel@xxxxxxxxxxxxxxxxxxxx; Max Reitz <mreitz@xxxxxxxxxx>; Stefano
> Stabellini <sstabellini@xxxxxxxxxx>
> Subject: Re: [PATCH v4 16/18] xen: automatically create XenBlockDevice-s
> 
> Am 11.12.2018 um 16:57 hat Paul Durrant geschrieben:
> > This patch adds a creator function for XenBlockDevice-s so that they can
> > be created automatically when the Xen toolstack instantiates a new
> > PV backend. When the XenBlockDevice is created this way it is also
> > necessary to create a drive which matches the configuration that the Xen
> > toolstack has written into xenstore. This drive is marked 'auto_del' so
> > that it will be removed when the XenBlockDevice is destroyed. Also, for
> > compatibility with the legacy 'xen_disk' implementation, an iothread
> > is automatically created for the new XenBlockDevice. This will also be
> > removed when the XenBlockDevice is destroyed.
> >
> > Correspondingly the legacy backend scan for 'qdisk' is removed.
> >
> > After this patch is applied the legacy 'xen_disk' code is redundant. It
> > will be removed by a subsequent patch.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
> > Reviewed-by: Anthony Perard <anthony.perard@xxxxxxxxxx>
> 
> So I have two points for this patch.
> 
> The first is that devices creating their own backends feels so wrong.

Indeed it does feel wrong, but this is the only way to deal with existing Xen 
toolstacks. Once toolstacks have been ported to using QMP to instantiate 
backends then this code can go away.

> I
> know that the old xen_disk did the same, and fixing it might neither be
> easy nor directly related to the qdevification, so requiring that from
> you would probably be unfair. But I still have to make the note, and
> hopefully we can get to it eventually (or maybe it is even easy enough
> that we can indeed address it in this series).
> 

No, it's not easy but once this series is committed the work can be started.

> My problem here is that I don't really understand the Xen mechanisms.
> Could you give me a very high-level overview of how adding a disk works
> and which component communicates with which other component to get the
> information down to QEMU and eventually the newly added
> xen_block_device_create()?

Xen toolstacks instantiate PV backends by just writing values into xenstore. It 
is up to the entity implementing the backend to set 'watches' so that it gets 
notified when these values appear. Currently that entity may be QEMU, or it may 
be a kernel driver such as Linux xen-blkback.ko.

> 
> Essentially, what I'm wondering is whether we have anything that could
> be treated more or less like another monitor besides QMP and HMP, which
> would internally work similar to HMP, i.e. map (almost) everything to
> QMP commands.

Yes, it would be possible to have a separate 'compatibility' daemon to watch 
xenstore and then formulate the correct sequence of QMP commands to instantiate 
the backend, but that is more complicated and the right answer of course is to 
have the toolstack send the QMP commands in the first place.

> I see that there is this XenWatch infrastructure to get
> notified about changes (which would be treated like monitor commands),
> but I'm not sure if everything would be covered by this mechanism or
> whether some things must be fetched explicitly.
> 
> Anyway, this is probably for later.
> 
> > +static void xen_block_drive_create(const char *id, const char
> *device_type,
> > +                                   QDict *opts, Error **errp)
> > +{
> > +    const char *params = qdict_get_try_str(opts, "params");
> > +    const char *mode = qdict_get_try_str(opts, "mode");
> > +    const char *direct_io_safe = qdict_get_try_str(opts, "direct-io-
> safe");
> > +    const char *discard_enable = qdict_get_try_str(opts, "discard-
> enable");
> > +    char *format = NULL;
> > +    char *file = NULL;
> > +    char *drive_optstr = NULL;
> > +    QemuOpts *drive_opts;
> > +    Error *local_err = NULL;
> > +
> > +    if (params) {
> > +        char **v = g_strsplit(params, ":", 2);
> > +
> > +        if (v[1] == NULL) {
> > +            file = g_strdup(v[0]);
> > +        } else {
> > +            if (strcmp(v[0], "aio") == 0) {
> > +                format = g_strdup("raw");
> > +            } else if (strcmp(v[0], "vhd") == 0) {
> > +                format = g_strdup("vpc");
> > +            } else {
> > +                format = g_strdup(v[0]);
> > +            }
> > +            file = g_strdup(v[1]);
> > +        }
> > +
> > +        g_strfreev(v);
> > +    }
> > +
> > +    if (!file) {
> > +        error_setg(errp, "no file parameter");
> > +        return;
> > +    }
> > +
> > +    drive_optstr = g_strdup_printf("id=%s", id);
> > +    drive_opts = drive_def(drive_optstr);
> > +    if (!drive_opts) {
> > +        error_setg(errp, "failed to create drive options");
> > +        goto done;
> > +    }
> > +
> > +    qemu_opt_set(drive_opts, "file", file, &local_err);
> > +    if (local_err) {
> > +        error_propagate_prepend(errp, local_err, "failed to set 'file':
> ");
> > +        goto done;
> > +    }
> > +
> > +    qemu_opt_set(drive_opts, "media", device_type, &local_err);
> > +    if (local_err) {
> > +        error_propagate_prepend(errp, local_err,
> > +                                "failed to set 'media': ");
> > +        goto done;
> > +    }
> > +
> > +    if (format) {
> > +        qemu_opt_set(drive_opts, "format", format, &local_err);
> > +        if (local_err) {
> > +            error_propagate_prepend(errp, local_err,
> > +                                    "failed to set 'format': ");
> > +            goto done;
> > +        }
> > +    }
> > +
> > +    if (mode && *mode != 'w') {
> > +        qemu_opt_set_bool(drive_opts, BDRV_OPT_READ_ONLY, true,
> &local_err);
> > +        if (local_err) {
> > +            error_propagate_prepend(errp, local_err, "failed to set
> '%s': ",
> > +                                    BDRV_OPT_READ_ONLY);
> > +            goto done;
> > +        }
> > +    }
> > +
> > +    /*
> > +     * It is necessary to turn file locking off as an emulated device
> > +     * my have already opened the same image file.
> > +     */
> > +    qemu_opt_set(drive_opts, "file.locking", "off", &local_err);
> > +    if (local_err) {
> > +        error_propagate_prepend(errp, local_err,
> > +                                "failed to set 'file.locking': ");
> > +        goto done;
> > +    }
> > +
> > +    qemu_opt_set_bool(drive_opts, BDRV_OPT_CACHE_WB, true, &local_err);
> > +    if (local_err) {
> > +        error_propagate_prepend(errp, local_err, "failed to set '%s':
> ",
> > +                                BDRV_OPT_CACHE_WB);
> > +        goto done;
> > +    }
> > +
> > +    if (direct_io_safe) {
> > +        qemu_opt_set_bool(drive_opts, BDRV_OPT_CACHE_DIRECT, true,
> > +                          &local_err);
> > +        if (local_err) {
> > +            error_propagate_prepend(errp, local_err, "failed to set
> '%s': ",
> > +                                    BDRV_OPT_CACHE_DIRECT);
> > +            goto done;
> > +        }
> > +
> > +        qemu_opt_set(drive_opts, "aio", "native", &local_err);
> > +        if (local_err) {
> > +            error_propagate_prepend(errp, local_err,
> > +                                    "failed to set 'aio': ");
> > +            goto done;
> > +        }
> > +    }
> > +
> > +    if (discard_enable) {
> > +        unsigned long value;
> > +
> > +        if (!qemu_strtoul(discard_enable, NULL, 2, &value)) {
> > +            qemu_opt_set_bool(drive_opts, BDRV_OPT_DISCARD, !!value,
> > +                              &local_err);
> > +            if (local_err) {
> > +                error_propagate_prepend(errp, local_err,
> > +                                        "failed to set '%s': ",
> > +                                        BDRV_OPT_DISCARD);
> > +                goto done;
> > +            }
> > +        }
> > +    }
> > +
> > +    drive_new(drive_opts, IF_NONE, &local_err);
> > +    if (local_err) {
> > +        error_propagate_prepend(errp, local_err,
> > +                                "failed to create drive: ");
> > +        goto done;
> > +    }
> 
> The other major point is that you're using the legacy drive_*()
> infrastructure, which should not only go away as soon as we can, but
> which is also full of magic and nasty surprises.
> 
> I think the best way would be to create only a block node
> (BlockDriverState) here, and get an automatically created anonymous
> BlockBackend from the qdev drive property.
> 
> There are two ways to achieve this: qmp_blockdev_add() would be optimal
> because that's a stable external interface. It would require you to
> specify a node-name (you already have the id parameter), and you'd use
> this node-name for the qdev drive property.
> 
> qmp_blockdev_add() requires a BlockdevOptions object, which you can
> either construct manually in C or use a visitor to convert from an
> options QDict. Maybe in this case, converting from a QDict is better
> because otherwise you need special code for each block driver.
> 

I was using the legacy interfaces because this code is, as I said above, 
supposed to be a mechanism only required for compatibility with the way 
toolstacks currently operate (and so is essentially 'legacy') but using the 
top-level QMP entry point to construct the does sound do-able as long as the 
underlying file locking can still be avoided with that mechanism. Since 
BlockdevOptions seems to be an auto-generated structure, figuring out how to 
fill it in manually is somewhat tricky so the QDict approach is preferable but 
I'll have to figure out how to use a visitor to do the translation.


> The other way would be calling bdrv_open() directly, which gives you a
> BlockDriverState, but it risks using legacy functionality that will be
> deprecated soon. Again, you'd take the node-name and pass it to the qdev
> drive option below.

Yes, xen_disk does things this way but then we end up with legacy block device 
and still fall foul of the assertions buried in the code.

> 
> > +
> > +done:
> > +    g_free(drive_optstr);
> > +    g_free(format);
> > +    g_free(file);
> > +}
> > +
> > +static void xen_block_device_create(BusState *bus, const char *name,
> > +                                    QDict *opts, Error **errp)
> > +{
> > +    unsigned long number;
> > +    const char *vdev, *device_type;
> > +    BlockBackend *blk = NULL;
> > +    IOThread *iothread = NULL;
> > +    DeviceState *dev = NULL;
> > +    Error *local_err = NULL;
> > +    const char *type;
> > +    XenBlockDevice *blockdev;
> > +
> > +    trace_xen_block_device_create(name);
> > +
> > +    if (qemu_strtoul(name, NULL, 10, &number)) {
> > +        error_setg(errp, "failed to parse name '%s'", name);
> > +        return;
> > +    }
> > +
> > +    vdev = qdict_get_try_str(opts, "dev");
> > +    if (!vdev) {
> > +        error_setg(errp, "no dev parameter");
> > +        return;
> > +    }
> > +
> > +    device_type = qdict_get_try_str(opts, "device-type");
> > +    if (!device_type) {
> > +        error_setg(errp, "no device-type parameter");
> > +        return;
> > +    }
> > +
> > +    if (!strcmp(device_type, "disk")) {
> > +        type = TYPE_XEN_DISK_DEVICE;
> > +    } else if (!strcmp(device_type, "cdrom")) {
> > +        type = TYPE_XEN_CDROM_DEVICE;
> > +    } else {
> > +        error_setg(errp, "invalid device-type parameter '%s'",
> device_type);
> > +        return;
> > +    }
> > +
> > +    xen_block_drive_create(vdev, device_type, opts, &local_err);
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +        return;
> > +    }
> > +
> > +    blk = blk_by_name(vdev);
> > +    g_assert(blk);
> > +
> > +    iothread = iothread_create(vdev, &local_err);
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +        goto unref;
> > +    }
> > +
> > +    dev = qdev_create(bus, type);
> > +    blockdev = XEN_BLOCK_DEVICE(dev);
> > +
> > +    qdev_prop_set_string(dev, "vdev", vdev);
> > +    if (blockdev->vdev.number != number) {
> > +        error_setg(errp, "invalid dev parameter '%s'", vdev);
> > +        goto unref;
> > +    }
> > +
> > +    qdev_prop_set_drive(dev, "drive", blk, &local_err);
> > +    if (local_err) {
> > +        error_propagate_prepend(errp, local_err, "failed to set
> 'drive': ");
> > +        goto unref;
> > +    }
> 
> So here you would need to use something like this:
> 
> object_property_set_str(OBJECT(dev), vdev, "driver", &local_err);
> 
> > +
> > +    blockdev->auto_iothread = iothread;
> > +
> > +    object_property_set_bool(OBJECT(dev), true, "realized",
> &local_err);
> > +    if (local_err) {
> > +        error_propagate_prepend(errp, local_err,
> > +                                "initialization of device %s failed: ",
> > +                                type);
> > +        goto unref;
> > +    }
> > +
> > +    blockdev_mark_auto_del(blk);
> 
> You don't need this one any more then (if you look into the details,
> it's one of the more confusing parts of the drive_*() magic, so it's
> good to get rid of it). When you use the anonymous BlockBackend created
> by the qdev drive property (because you passed it a node-name rather
> than a BlockBackend name) means that the BlockBackend disappears
> together with the drive.
> 

Ok.

> Note that explicitly created block nodes must also be unreferenced
> explicitly (bdrv_open() should be paired with bdrv_unref() and
> qmp_blockdev_add() with qmp_blockdev_del()). Maybe XenBackendInfo needs
> a .destroy callback so we can do destruction symmetrically to device
> creation?
> 

Yes, I'd probably just add a callback function pointer into XenDevice which 
only gets set for devices instantiated via this mechanism.

  Paul

> > +    return;
> > +
> > +unref:
> > +    if (dev) {
> > +        object_unparent(OBJECT(dev));
> > +    }
> > +
> > +    if (iothread) {
> > +        iothread_destroy(iothread);
> > +    }
> > +
> > +    if (blk) {
> > +        monitor_remove_blk(blk);
> > +        blk_unref(blk);
> > +    }
> > +}
> 
> Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.