[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2 of 4 V5] tools/libxl: Remus network buffering - hotplug scripts and setup code



On Mon, Nov 25, 2013 at 9:32 AM, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> wrote:
Shriram Rajagopalan writes ("[PATCH 2 of 4 V5] tools/libxl: Remus network buffering - hotplug scripts and setup code"):
> tools/libxl: Remus network buffering - hotplug scripts and setup code

Thanks.  This is going in the right direction, but I think there's
still some way to go.


One thing that makes reviewing it particularly difficult is the
physical ordering of the new code in your patch.

The callback programming model makes it easy to write spaghetti logic;
to counter this we try to have a firm structure to each set of event
handling code:

 * Firstly, function forward declarations, types, structs, etc.

 * Then helper functions which have a simple "call this and it
   does something and returns" control flow.

 * Then the main body of the event callbacks, in the order in
   which they will normally be called.  Obviously there is some
   judgement needed because the call flow won't always be entirely
   linear.  But keeping the layout as linear as possible means that
   one is never hunting for the next function.

 * Call flow of a single operation does not bounce back and forth
   between files.  If this is necessary a formal sub-operation is
   defined (see how it's done with the bootloader, for example).

I think you would be able to structure the code this way.  Would you
do that please ?

I think you may need to move several of the entrypoints into a new
"remus" file.  The "netbuf" file may need to become a more formally
separated part of the suspend/resume function.

We also try to break out sub-operations, giving them their own types
and callbacks.


> diff -r 4b471809a5bb -r 5b4e029bb79b tools/libxl/Makefile
> --- a/tools/libxl/Makefile    Mon Nov 18 10:17:35 2013 -0800
> +++ b/tools/libxl/Makefile    Mon Nov 18 11:09:34 2013 -0800
> @@ -42,6 +42,13 @@ LIBXL_OBJS-y += libxl_blktap2.o
> -static void remus_failover_cb(libxl__egc *egc,
> -                              libxl__domain_suspend_state *dss, int rc);
> +static void remus_replication_failure_cb(libxl__egc *egc,
> +                                         libxl__domain_suspend_state *dss,
> +                                         int rc);
...
> -    dss->remus = info;

This change from dss->remus to dss->remus_ctx needs a mention in the
commit message.  To be honest, I'm not sure why you don't just change
the type of the remus member.

> diff -r 4b471809a5bb -r 5b4e029bb79b tools/libxl/libxl_internal.h
> --- a/tools/libxl/libxl_internal.h    Mon Nov 18 10:17:35 2013 -0800
> +++ b/tools/libxl/libxl_internal.h    Mon Nov 18 11:09:34 2013 -0800
> @@ -2291,6 +2291,50 @@ typedef struct libxl__logdirty_switch {
>      libxl__ev_time timeout;
>  } libxl__logdirty_switch;
>
> +typedef struct libxl__remus_ctx {
> +    /* checkpoint interval */
> +    int interval;
> +    int blackhole;
> +    int compression;
> +    int saved_rc;
> +    /* Script to setup/teardown network buffers */
> +    const char *netbufscript;
> +    /* Opaque context containing network buffer related stuff */
> +    void *netbuf_ctx;
> +    libxl__domain_suspend_state *dss;
> +    libxl__ev_time timeout;
> +    libxl__ev_child child;
> +    int num_exec;
> +} libxl__remus_ctx;

I think you should probably call this a "libxl__remus_state", to
correspond to all of the other "libxl__foo_state" structs.

You also need to specify in comments who fills in what parts of it, as
is done in the other libxl__foo_state's.


> diff -r 4b471809a5bb -r 5b4e029bb79b tools/libxl/libxl_netbuffer.c
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/tools/libxl/libxl_netbuffer.c   Mon Nov 18 11:09:34 2013 -0800
> @@ -0,0 +1,516 @@
...
> +typedef struct libxl__remus_netbuf_ctx {
> +    struct rtnl_qdisc **netbuf_qdisc_list;
> +    struct nl_sock *nlsock;
> +    struct nl_cache *qdisc_cache;
> +    char **vif_list;
> +    char **ifb_list;
> +    uint32_t num_netbufs;
> +    uint32_t unused;
> +} libxl__remus_netbuf_ctx;

Presumably this wants to be a libxl__remus_netbuf_state, then.

The callbacks from the remus and netbuf code to the suspend code
should not be hardwired.  Instead, function pointers should be passed,
the way the domain creation stuff does with a bootloader, and the way
the bootloader does with openpty.

This may seem like extra complication if there is only one caller and
it will always pass the same function, but it makes the logical
separation much clearer.  It's analogous to the separation between an
ordinary function and its caller, even if there is in fact only one
call site.

That is, functions like this

+_hidden void libxl__remus_setup_done(libxl__egc *egc,
+                                     libxl__domain_suspend_state *dss,
+                                     int rc);

should be private functions in the remus file which make the
appropriate callback.


> +/* If the device has a vifname, then use that instead of
> + * the vifX.Y format.
> + */
> +static char *get_vifname(libxl__gc *gc, uint32_t domid,
> +                               libxl_device_nic *nic)
> +{
> +    const char *vifname = NULL;
> +    vifname = libxl__xs_read(gc, XBT_NULL,
> +                             libxl__sprintf(gc,
> +                                            "%s/backend/vif/%d/%d/vifname",
> +                                            libxl__xs_get_dompath(gc, 0),
> +                                            domid, nic->devid));

Why not libxl__xs_read_checked ?

> +    if (!vifname) {

Surely this error handling is wrong.  NULL here might mean "failed" or
ENOENT.  If it means failed then get_vifname needs to return the error
to its caller, by returning NULL.

> +        vifname = libxl__device_nic_devname(gc, domid,
> +                                            nic->devid,
> +                                            nic->nictype);
> +    }
> +
> +    return libxl__strdup(gc, vifname);

Why this strdup ?

> +static char **get_guest_vif_list(libxl__gc *gc, uint32_t domid,
> +                                 int *num_vifs)
> +{
> +    libxl_device_nic *nics = NULL;
> +    int nb, i = 0;
> +    char **vif_list = NULL;
> +
> +    *num_vifs = 0;
> +    nics = libxl_device_nic_list(CTX, domid, &nb);
> +    if (!nics) return NULL;
> +
> +    /* Ensure that none of the vifs are backed by driver domains */
> +    for (i = 0; i < nb; i++) {
> +        if (nics[i].backend_domid != LIBXL_TOOLSTACK_DOMID) {
> +            LOG(ERROR, "vif %s has driver domain (%u) as its backend.\n"
> +                "Network buffering is not supported with driver domains",

The LOG macros should not normally be passed messages with newlines
in.

> +                get_vifname(gc, domid, &nics[i]), nics[i].backend_domid);
> +            *num_vifs = -1;

This calling convention is strange.  I think get_guest_vif_list should
return a libxl rc value.

> +    vif_list = libxl__calloc(gc, nb, sizeof(char *));
> +    for (i = 0; i < nb; ++i)
> +        vif_list[i] = get_vifname(gc, domid, &nics[i]);

What if get_vifname fails ?



The idea here is that if the interface has a name, then we use it. Otherwise,
we simply revert to the name that Xen assigns it - which is in vifX.Y format.
But I agree with you that the vifname reading process from xenstore may not
be successful always. 
 
...

> +
> +static int init_qdiscs(libxl__gc *gc,
> +                       libxl__remus_ctx *remus_ctx)
> +{
> +    int i, ret, ifindex;
> +    struct rtnl_link *ifb = NULL;
> +    struct rtnl_qdisc *qdisc = NULL;
...
> +    /* convenience vars */

We use the term "Convenience aliases".  It's best to use the exact
same terminology as elsewhere.

> +    libxl__remus_netbuf_ctx *netbuf_ctx = remus_ctx->netbuf_ctx;
> +    int num_netbufs = netbuf_ctx->num_netbufs;
> +    char **ifb_list = netbuf_ctx->ifb_list;

These should all be declared const.

> +    /* list of handles to plug qdiscs */
> +    netbuf_ctx->netbuf_qdisc_list =
> +        libxl__calloc(gc, num_netbufs,
> +                      sizeof(struct rtnl_qdisc *));

Please use GCNEW_ARRAY.

> +        if (qdisc) {
> +            const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
> +            /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
> +            if (!tc_kind || strcmp(tc_kind, "plug")) {
> +                LOG(ERROR, "plug qdisc is not installed on %s", ifb_list[i]);

You seem to assume that all the failures from rtnl_tc_get_kind are the
due to the expected cause.  You need to check errno, I think.

Not according to the documentation. Its return value is either the name of the 
qdisc or NULL if its not set. There is no mention of errno.  
I can only go by whats documented :).


> +    rtnl_link_put(ifb);
> +    return 0;
> +
> + end:

We call these "out", not "end".  This should be changed everywhere.

> +    if (ifb) rtnl_link_put(ifb);
> +    return ERROR_FAIL;

Can't this leak netbuf_ctx->nlsock and netbuf_ctx->qdisc_cache ?


You asked this question before (in a previous version). There is a cleanup
code that gets called that releases these resources (the teardown function).

> +}
> +
> +/* the script needs the following env & args
> + * $vifname
> + * $XENBUS_PATH (/libxl/<domid>/remus/netbuf/<devid>/)
> + * $IFB (for teardown)
> + * setup/teardown as command line arg.
> + * In return, the script writes the name of IFB device (during setup) to be
> + * used for output buffering into XENBUS_PATH/ifb
> + */

This information should be somewhere else, not buried in the libxl
source code.  For now, could it be in the one script which you
actually supply ?


It is also documented in the script.
 

> +static int exec_netbuf_script(libxl__gc *gc, libxl__remus_ctx *remus_ctx,
> +                              char *op, libxl__ev_child_callback *death)
> +{
> +    int arraysize, nr = 0;
> +    char **env = NULL, **args = NULL;
> +    pid_t pid;

This code has many important similarities to the device_hotplug
functions in libxl_device.c.

For example your timeout callback is pretty much identical to
device_hotplug_timeout_cb.

I think you should consider whether you can use libxl__ao_device, and
perhaps share other parts of that code.


I looked at them before deciding to roll out my own.  Please correct me if I am
wrong about this:
The ao_device interface while sharing some similarities, does not really fit in here. 
It attempts bring up/down a device (that has been created already, for e.g., a vif), 
add it to the guest and represent it internally using libxl__device or something.

The netbuf code invokes the script, which in turn sets up an IFB device that is not
part of the guest's configuration.  The information about the device returned by the
script is collected in an internal data structure and new handles (plug qdisc handle)
are created, and added to remus' state.

The control flows are different.  The only commonality is invoke external script, wait
until it finishes and invoke callback.

On alternative is to hack into the device_hotplug code, and add special cases. For e.g.,
if device_ptr is null, then this is a remus op, so do Y instead of X.

This kind of special casing would percolate throughout the device hotplug code,
polluting a what was otherwise cleanly defined interface to handle generic device hotplug.


If you have any other alternatives in mind, I would be happy to check it out.
 

> +    libxl__ev_child_init(child);

This should be done where the remus state is first populated, not
here.


> +    char *script = libxl__strdup(gc, remus_ctx->netbufscript);

These next lot are all convenience aliases AFIACT.  This should be
mentioned, and they should all be const:

> +    libxl__ev_child *child = &remus_ctx->child;
> +    libxl__ev_time *timeout = &remus_ctx->timeout;
> +    uint32_t domid = remus_ctx->dss->domid;
> +    int devid = remus_ctx->num_exec;
> +    libxl__remus_netbuf_ctx *netbuf_ctx = remus_ctx->netbuf_ctx;
> +    char *vif = netbuf_ctx->vif_list[devid];
> +    char *ifb = netbuf_ctx->ifb_list[devid];

So instead, something like:

  +    libxl__ev_child *const child = &remus_ctx->child;
...
  +    const uint32_t domid = remus_ctx->dss->domid;
...
  +    char *const ifb = netbuf_ctx->ifb_list[devid];

There are other places in the code with a similar pattern which also
need to be changed.


> +    if (!strcmp(op, "setup"))
> +        arraysize = 5;
> +    else
> +        arraysize = 7;

Perhaps we could do away with this if() and just always allocate 7 ?

> +    GCNEW_ARRAY(env, arraysize);
> +    env[nr++] = "vifname";
> +    env[nr++] = vif;
> +    env[nr++] = "XENBUS_PATH";
> +    env[nr++] = GCSPRINTF("%s/remus/netbuf/%d",
> +                          libxl__xs_libxl_path(gc, domid), devid);
> +    if (!strcmp(op, "teardown")) {

I think you should pass op as an enum, and convert it to a string
right at the end.  That would do away with these strcmps.

> +    remus_ctx->num_exec++;

num_exec is a misleading variable name here.  In libxl_linux.c and
libxl_device.c it refers to the number of times we have run the
hotplug script _for each device_, not the iteration over the number of
devices.

> +static void netbuf_setup_script_cb(libxl__egc *egc,
> +                                   libxl__ev_child *child,
> +                                   pid_t pid, int status)
> +{
...
> +    hotplug_error = libxl__xs_read(gc, XBT_NULL,
> +                                   GCSPRINTF("%s/hotplug-error",
> +                                             out_path_base));

Again, libxl__xs_read_checked (several times) ?

> +static void netbuf_setup_timeout_cb(libxl__egc *egc,
> +                                    libxl__ev_time *ev,
> +                                    const struct timeval *requested_abs)
> +{

This seems to be yet another copy of device_hotplug_timeout_cb.


Yep. See my answer above. 

> +void libxl__remus_netbuf_setup(libxl__egc *egc,
> +                               libxl__domain_suspend_state *dss)
> +{
> +    libxl__remus_netbuf_ctx *netbuf_ctx = NULL;
> +    uint32_t domid = dss->domid;
> +    libxl__remus_ctx *remus_ctx = dss->remus_ctx;
> +    int num_netbufs = 0;
> +    int rc = ERROR_FAIL;
...
> +    netbuf_ctx = libxl__zalloc(gc, sizeof(libxl__remus_netbuf_ctx));

Please use GCNEW, not libxl__zalloc, when you can.

> +/* Note: This function will be called in the same gc context as
> + * libxl__remus_netbuf_setup, created during the libxl_domain_remus_start
> + * API call.
> + */
> +void libxl__remus_netbuf_teardown(libxl__egc *egc,
> +                                  libxl__domain_suspend_state *dss)
> +{
...
> +    remus_ctx->num_exec = 0; //start at devid == 0
> +    if (exec_netbuf_script(gc, remus_ctx, "teardown",
> +                           netbuf_teardown_script_cb))
> +        libxl__remus_teardown_done(egc, dss);

I think the return value from exec_netbuf_script should be put into an
rc variable.  And we don't correctly propagate the rc here AFAICT.

> +#define TC_BUFFER_START 1
> +#define TC_BUFFER_RELEASE 2

Are these #defines not supposed to come from some Linux header file ?
I don't think having them as constants here can be right.


They are not constants taken from any library and stuck in here.  They are merely
for readability.  The related code is remus_netbuffer_op, where you would find
the buffer/release code blocks controlled through the value of op variable
(which is either BUFFER_START or BUFFER_RELEASE)

 thanks
shriram
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.