[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v12 3/7] remus: introduce remus device



Hi Ian,

On 06/28/2014 01:38 AM, Ian Jackson wrote:
Yang Hongyang writes ("[PATCH v12 3/7] remus: introduce remus device"):
introduce remus device, an abstract layer of remus devices(nic, disk,
etc).It provides the following APIs for libxl:

Thanks for this.  Things are much clearer for me now.  I have some
more detailed comments.

+struct libxl__remus_device_ops {
+    /*
+     * init() and destroy() APIs are produced by a device type and
+     * consumed by the main remus code, a device type must implement
+     * these two APIs.
+     */
+    /* init device ops private data, etc. must implement */
+    int (*init)(libxl__remus_device_ops *self,
+                libxl__remus_state *rs);
+    /* free device ops private data, etc. must implement */
+    void (*destroy)(libxl__remus_device_ops *self);
+    /*
+     * This is device ops's private data, for different device types,
+     * the data structs are different
+     */
+    void *data;

I see that in 4/7 you use this to store global state; it's not const.

libxl is not supposed to to have any global state that's not hung off
the libxl_ctx (or some similar) structure.  So I think things like the
NIC data need to be in libxl_ctx.

You have two reasonable options, I think: either, include the
variables you need directly in the libxl_ctx.  Or, make libxl_ctx
contain a pointer to a struct which is declared in libxl_internal.h
but only defined in (say) libxl_netbuffer.c.  The latter is probably
better because it more easily allows different
platorms'/implementations' versions of the same device kind to have
different variables.

It is IMO fine for the different devices to each have their own
member in libxl_ctx.


All the libxl__remus_device_ops structs need to be const.


Taking as an example:

+    if (rds->num_devices == rds->num_setuped)

Can you please write "num_set_up" ?  I'm afraid that "setuped" isn't a
word in English and it reads very oddly.  Thanks.


+    /*
+     * This is for matching, we must go through all device ops until we
+     * find a matched op for the device. The ops_index record which ops
+     * we are matching.
+     */
+    int ops_index;
+    libxl__remus_device_ops *ops;
+    libxl__remus_device_callback *callback;
+    libxl__remus_device_state *rds;

I think this is confusing.  Are all of these private for the abstract
layer ?  I think the remus concrete device ought to be allowed to use
libxl__remus_device_ops (which needs to be const).

It is more important to say who owns a struct field (who is
allowed to read it; who is responsible for setting it, etc.) than to
say precisely what it is for.

If you're going to have both kinds of comments in the same struct
(that is, both sections which have particular ownership and individual
usage comments) it would be helpful to make ownership section comments
more obvious.

Perhaps something like:

   struct libxl__remus_device {
       /*----- shared between abstract and concrete layers -----*/
       /* set by remus device abstruct layer */
       int devid;
       /* libxl__device_* which this remus device related to */
       const void *backend_dev;
       libxl__remus_device_kind kind;

       /*----- private for abstract layer only -----*/
       /*
        * This is for matching, we must go through all device ops until we
        * find a matched op for the device. The ops_index record which ops
        * we are matching.
        */
       int ops_index;
       libxl__remus_device_ops *ops;
       libxl__remus_device_callback *callback;
       libxl__remus_device_state *rds;

       /*----- private for concrete (device-specific) layer -----*/
       /* *kind* of device's private data */
       void *data;
       /* for calling scripts, eg. setup|teardown|match scripts */
       libxl__async_exec_state aes;
       /*
        * for async func calls, in the implementation of device ops, we
        * may use fork to do async ops. this is owned by device-specific
        * ops methods
        */
       libxl__ev_child child;
   };

Writing it like that shows another anomaly: why do we have fields in
the libxl__remus_device struct that are private for the device
specific layer ?

"aes" and "child" are common fields for device specific layer,
it's common for an async device. We define them here
in order to not define it in every device specific implementation.


After all the device-specific layer has "data", which could contain
anything it likes.



Why is the libxl__remus_device_state a separate structure ?
Can it be combined with libxl__remus_device_state ?


+static libxl__remus_device_ops *dev_ops[] = {
+};

As I say, this needs to be const.


+static void device_common_cb(libxl__egc *egc,
+                             libxl__remus_device *dev,
+                             int rc)

When we write this kind of callback threaded code, we write it in
chronological order.  That is:
  - banner comment at the top
  - necessary forward declarations (and data types)
  - entrypoint
  - callbacks, in order in which they are called
  - banner comment for next section

See for example the utilities in libxl_aoutils.c.

Or for a bigger example see libxl_bootloader.c, in particular the
section after the comment "main flow of control" near line 304.

+static __attribute__((unused)) void libxl__remus_device_init(libxl__egc *egc,
+                                     libxl__remus_device_state *rds,
+                                     libxl__remus_device_kind kind,
+                                     void *libxl_dev)
...
+    switch (kind) {
+        case LIBXL__REMUS_DEVICE_NIC:
+            nic = libxl_dev;
+            dev->devid = nic->devid;
+            break;
+        case LIBXL__REMUS_DEVICE_DISK:
+            disk = libxl_dev;
+            /* there are no dev id for disk devices */
+            dev->devid = -1;
+            break;

Wouldn't it be better to move this into the concrete (device type
specific) code ?  After all you're switching on the device type
already, and the device type specific code has all the same
information ?

Or am I missing something ?

+void libxl__remus_device_teardown(libxl__egc *egc, libxl__remus_state *rs)
+{
+    int i;
+    libxl__remus_device *dev;
+    libxl__remus_device_ops *ops;
+
+    STATE_AO_GC(rs->ao);
+
+    /* Convenience aliases */
+    libxl__remus_device_state *const rds = &rs->dev_state;
+
+    if (rds->num_setuped == 0) {
+        /* clean device ops */
+        for (i = 0; i < ARRAY_SIZE(dev_ops); i++) {
+            ops = dev_ops[i];
+            ops->destroy(ops);
+        }
+        goto out;
+    }

I was a bit confused by the nonlinear flow of control in this file,
but it is definitely clear to me that you have two copies of this
loop.

Why don't you just make the callback to your own callback function
yourself (under appropriate conditions) ?

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 1018142..cc5d390 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -43,6 +43,7 @@ libxl_error = Enumeration("error", [
      (-12, "OSEVENT_REG_FAIL"),
      (-13, "BUFFERFULL"),
      (-14, "UNKNOWN_CHILD"),
+    (-15, "NOT_MATCH"),

I still think this probably wants to be a better error message.

REMUS_UNSUPPORTED_DEVICE ?

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
.


--
Thanks,
Yang.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.