Xen project Mailing List

Re: [Xen-devel] [PATCH v13 4/7] remus netbuffer: implement remus network buffering for nic devices

Yang Hongyang writes ("[PATCH v13 4/7] remus netbuffer: implement remus network buffering for nic devices"): > 1.Add two members in libxl_domain_remus_info: Thanks. I'm going to make mostly general comments on this. > /* > + * LIBXL_HAVE_REMUS_NETBUF 1 > + * > + * If this is defined, then the libxl_domain_remus_info structure will > + * have a boolean field (netbuf) and a string field (netbufscript). > + * > + * netbuf, if true, indicates that network buffering should be enabled. > + * > + * netbufscript, if set, indicates the path to the hotplug script to > + * setup or teardown network buffers. > + */ > +#define LIBXL_HAVE_REMUS_NETBUF 1 This seems to imply that the default at the API level is for network buffering to be turned off. I don't think that's right, is it ? > +/* If the device has a vifname, then use that instead of > + * the vifX.Y format. I don't think it is right in general to look in the backend directory for the vifname. If we are using driver domains, the backend xenstore directory is writeable by the driver domain. But I think Remus is not at all compatible with driver domains right now - am I right ? If so then at the very least this function should have something in the comment saying that it must ONLY be used for remus because if driver domains were in use it would constitute a security vulnerability. > + ret = nl_cache_refill(netbuf_state->nlsock, netbuf_state->qdisc_cache); > + if (ret < 0) { > + LOG(ERROR, "cannot refill qdisc cache"); Do these functions return error codes or set errno or something ? I think you should print the actual error that was encountered. That might be as simple as using LOGE instead but it depends on what they do. > + rc = libxl__xs_read_checked(gc, XBT_NULL, > + GCSPRINTF("%s/hotplug-error", out_path_base), > + &hotplug_error); > + if (rc) { > + rc = ERROR_FAIL; I think you can just "goto out" without setting rc again. > +static void netbuf_teardown_script_cb(libxl__egc *egc, > + libxl__async_exec_state *aes, > + int status) As I said in my other mail, can you reorganise this so that the functions are the file in the same order as the control flow, with appropriate comments dividing the different flows up ? > +static void async_call_done(libxl__egc *egc, > + libxl__ev_child *child, > + pid_t pid, int status) > +{ > + libxl__remus_device *dev = CONTAINER_OF(child, *dev, child); > + libxl__remus_device_state *rds = dev->rds; > + STATE_AO_GC(rds->ao); > + > + if (WIFEXITED(status)) { > + dev->callback(egc, dev, -WEXITSTATUS(status)); > + } else { > + dev->callback(egc, dev, ERROR_FAIL); > + } This is really quite odd. These nonzero statuses are surely always errors ? Why leave it to the next layer to interpret ? Returning either a positive ERROR_* or a -WEXITSTATUS is particularly odd. And if you are going to do this you need to make a crystal-clear doc comment about it. > +static void nic_match_async(const libxl__remus_device_ops *self, > + libxl__remus_device *dev) > +{ > + if (dev->kind == LIBXL__REMUS_DEVICE_NIC) > + _exit(0); > + > + _exit(-ERROR_NOT_MATCH); > +} > + > +static void nic_match(libxl__remus_device_ops *self, > + libxl__remus_device *dev) > +{ > + int pid = -1; > + STATE_AO_GC(dev->rds->ao); > + > + /* Fork and call */ > + pid = libxl__ev_child_fork(gc, &dev->child, async_call_done); > + if (pid == -1) { > + LOG(ERROR, "unable to fork"); > + goto out; > + } > + > + if (!pid) { > + /* child */ > + nic_match_async(self, dev); > + /* notreached */ > + abort(); This is very odd. Why are you spawning a subprocess to test dev->kind ? > +static void nic_setup(libxl__remus_device *dev) > +{ > + libxl__remus_device_nic *remus_nic; > + libxl__remus_netbuf_state *ns = dev->ops->data; > + const libxl_device_nic *nic = dev->backend_dev; > + > + STATE_AO_GC(ns->ao); > + > + GCNEW(remus_nic); > + dev->data = remus_nic; > + remus_nic->vif = get_vifname(dev, nic); > + if (!remus_nic->vif) > + goto out; I think you should set rc here rather than (implicitly) in out. > + setup_async_exec(&dev->aes, "setup", dev); > + if (libxl__async_exec_start(gc, &dev->aes)) { And you should propagate the rc from libxl__async_exec_start. > +/* > + * Note: This function will be called in the same gc context as > + * libxl__remus_netbuf_setup, created during the libxl_domain_remus_start > + * API call. > + */ > +static void nic_teardown(libxl__remus_device *dev) This comment is interesting but I don't understand the implications. Also isn't this a feature of the abstract/concrete API which should be documented there ? > +static void nic_match_async(const libxl__remus_device_ops *self, > + libxl__remus_device *dev) > +{ > + if (dev->kind == LIBXL__REMUS_DEVICE_NIC) > + _exit(-ERROR_FAIL); > + > + _exit(-ERROR_NOT_MATCH); Perhaps each libxl__remus_device_ops could mention the kind it supports, so that the abstract code could check that ? That would avoid duplicating this check in every concrete kind's code. > +extern libxl__remus_device_ops remus_device_nic; This needs to be const. I'm afraid I'm going to leave the drbd patch for another day... Thanks, Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.