[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] libbxl: add support for pvscsi, iteration 1
On Fri, May 02, Ian Campbell wrote: > > The backend driver uses a single_host:many_devices notation to manage > > domU devices. The xenstore layout looks like this: > > Please can you add ~/device/vscsi/$DEVID/* and > ~/backend/vif/$DOMID/$DEVID/* to docs/misc/xenstore-paths.markdown. > > If you were inclined to improve xen/include/public/io/vscsiif.h by using > some of the content from here then that would be awesome too. I think its a good idea to document what the current backend/frontend code does, as this will also aid with review of the libxl changes. > > This is a challenge for libxl because libxl currently handles only > > single_host:single_device in its device helper functions. > > This is a bit like pci I suppose? Maybe. But today Ondrej found a way to attach additional devices, detaching them will most likely work as well. Basically one creates the new vscsi-devs/dev-N/ subdir with state==1, then write state==7 to the backend. Similar with detach, state==5 to the dev-N, then state==7 to the backend. > > Due to this > > limitation in libxl, scsi-detach will remove the whole virtual scsi host > > with all its virtual devices, instead of just the requested single > > virtual device. > > That's rather unfortunate! > > Are multiple <vhost> allowed/supported? e.g. could I (or better, libxl > by default) setup a device-per-host situation to alleviate this? > > For pcifront/back we support bus reconfiguration (via > XenbusStateReconfigur{ing,ed}) which is how this works there. Does vscsi > not support anything like that? Many vhosts are supported, I just did not realize that reconfigure was used here as well. So maybe libxl does not need further changes. I will hack some code to show how attach/detach works in theory. > > To support the pdev notation '/dev/$device' it is required to parse > > sysfs to translate the path to HST:CHN:TGT:LUN notation. In its current > > variant /sys/dev/ is used, which is available since at least Linux 3.0. > > Support for dom0 kernels which lack this interface will be added later. > > Are there any such kernels which anyone cares about any more? I have to check when /sys/dev was introduced. In the end it would be nice to support also older dom0 kernels. > > - xl scsi-detach will remove entire scsi hosts, unlike its xm counter part. > > What did xm do instead? Fail? No, it will use the xenstore reconfigure state to indicate a change. > > Note that the latter format is unreliable because > > +the HOST value can change across dom0 reboots. > > /dev/sgN would be unreliable too I think? As might block devices not > using the /dev/disk/by-* links you give in the example. You might be > best saying that only /dev/disk/by-* block device references are > reliable because all of the others might change over a reboot. I will reword this part. One customer reported that his SCSI device was not working with xend because this device was accessible only via a char device. And I think udev was able to create some by-id path for the sgN node. Have to dig into that old report. > > + > > +=item C<vdev> > > + > > +Specifies how the SCSI device is mapped into the guest. The notation is in > > +SCSI notation HOST:CHANNEL:TARGET:LUN. HOST in this case means a virthal > > Typo "virtual" > > > +SCSI host within the guest. > > All of HOST:CHANNEL:TARGET:LUN are numbers I think, are they > hex/dec/oct? Are there limits? I have to find out, its most likely all decimal. > > +Right now only one option is recognized: feature-host. > What does it do? I have to find out ;-) Last time I checked the command was passed as is from domU to dom0, but I dont really know as it is not documented. > > +=item B<scsi-attach> I<domain-id> I<pdev> I<vdev> I<[,feature-host]> > > Is the comma in I<[,feature-host]> literally required, as in > xl scsi-attach mydom 0:0:0:0 0:0:0:0 ,feature-host > ? This is a typo I noticed after submission. Not sure how to specify the option. Either a '0:0:0:0,feature-host' or as two strings. > > +Removes the vscsi device from domain specified by I<domain-id>. > > +Note that the whole virtual SCSI host with all its devices is removed. > > +This is a BUG! > > I have a feeling it would be preferable in the first instance to either > just not provide the detach portion of this interface, or to require > that vdev is only a host and not a specific device (... unless the host > only has a single device, but maybe that's getting too far) Attach/detach can probably be implemented without much work. > > +++ b/tools/libxl/libxl.c > > + > > +void libxl__device_vscsi_add(libxl__egc *egc, uint32_t domid, > > + libxl_device_vscsi *vscsi, > > + libxl__ao_device *aodev) > > +{ > > + STATE_AO_GC(aodev->ao); > > + flexarray_t *front; > > + flexarray_t *back; > > + libxl__device *device; > > + unsigned int rc, i; > > + > > + i = 2 * (4 + (3 * vscsi->num_vscsi_devs)); > > That's quite exciting ;-) > > A flexarray will grow as neccessary so some sort of static best guest of > a starting point would be fine here I think (or a comment about 2 4 and > 3)! Yes, that was just an attempt to avoid realloc. As usual, optimization of non-critical code paths... > I suppose the num_vscsi_devs here is an artefact of the short coming you > mentioned before. The way PCI handles this is to create the "bus" (aka > vhost here) either when the first device is attached or in the higher > level code before it calls the individual device adds (i.e. this > function), or something (I must confess my memory is a bit fuzzy, > libxl__create_pci_backend might be a place to look). Having done that > then libxl_device_pci is just a single device. > > I think you should look to do something similar here, so that each > libxl_device_vscsi is actually a single device. The alternative is some > sort of libxl_bus_vscsi thing, which would be unconventional compared > with everything else The thing with host:dev[,dev[,dev]] is that within the guest something may rely on the way SCSI devices are presented (ordering, naming, no idea). Surely the vscsi= and scsi-attach/detach parsing code could do some sort of translation from a group of devices-per-vhost into a single vhost:dev. But now that we have found that reconfiguration is used in xenstore to attach/detach devices it may be possible to reuse existing libxl helpers while preserving existing domU.cfg files and backend drivers. > > +libxl_device_vscsi *libxl_device_vscsi_list(libxl_ctx *ctx, uint32_t > > domid, int *num) > > +{ > > + GC_INIT(ctx); > > + > > + libxl_device_vscsi *vscsi_hosts = NULL; > > + char *fe_path; > > + char **dir; > > + unsigned int ndirs = 0; > > + > > + fe_path = libxl__sprintf(gc, "%s/device/vscsi", > > libxl__xs_get_dompath(gc, domid)); > > Need to take care trusting this too much... Not sure what you mean with this sentence? > > @@ -3520,6 +3713,10 @@ DEFINE_DEVICE_REMOVE(vfb, destroy, 1) > > DEFINE_DEVICE_REMOVE(vtpm, remove, 0) > > DEFINE_DEVICE_REMOVE(vtpm, destroy, 1) > > > > +/* vscsi */ > > +DEFINE_DEVICE_REMOVE(vscsi, remove, 0) > > +DEFINE_DEVICE_REMOVE(vscsi, destroy, 0) > > Should the second one not be (vscsi, destroy, 1) ? Right. > > +libxl_device_vscsi = Struct("device_vscsi", [ > > + ("backend_domid", libxl_domid), > > + ("devid", libxl_devid), > > + ("v_hst", uint32), > > + ("vscsi_devs", Array(libxl_vscsi_dev, "num_vscsi_devs")), > > + ("feature_host", bool), > > Not defbool (I don't know what this does...) This is just flag for "feature-host" yes/no. Have to find out what this backend option actually does. > > @@ -508,6 +528,28 @@ libxl_vtpminfo = Struct("vtpminfo", [ > > ("uuid", libxl_uuid), > > ], dir=DIR_OUT) > > > > +libxl_vscsiinfo = Struct("vscsiinfo", [ > > + ("backend", string), > > + ("backend_id", uint32), > > + ("frontend", string), > > + ("frontend_id", uint32), > > + ("devid", libxl_devid), > > + ("p_hst", uint16), > > + ("p_chn", uint16), > > + ("p_tgt", uint16), > > + ("p_lun", uint16), > > Type, also the widths here differ from the device... > > > + ("vscsi_dev_id", libxl_devid), > > + ("v_hst", uint16), > > + ("v_chn", uint16), > > + ("v_tgt", uint16), > > + ("v_lun", uint16), > > and again Some format code was unhappy with u16, cant remember. I was just trying to save space. Again, optimization of non-critical code paths. > > +static char *vscsi_trim_string(char *s) > > +{ > > + unsigned int len; > > + > > + while (isspace(*s)) > > + s++; > > + len = strlen(s); > > + while (len-- > 1 && isspace(s[len])) > > + s[len] = '\0'; > > + return s; > > Doesn't seem to be anything vscsi specific here (and it seems like a > generally useful thing. I expect at least thevif parsing would benefit > from deploying this too (in so much as all this wouldn't be better with > a proper parser using flex). I figured that could be a generic helper. Have to check existing code if it can make use of it. > > +} > > + > > +static void parse_vscsi_config(libxl_device_vscsi *vscsi_host, > > + libxl_vscsi_dev *vscsi_dev, > > + char *buf) > > +{ > > + char *pdev, *vdev, *fhost; > > + unsigned int hst, chn, tgt, lun; > > + > > + libxl_device_vscsi_init(vscsi_host); > > + pdev = strtok(buf, ","); > > + vdev = strtok(NULL, ","); > > + fhost = strtok(NULL, ","); > > I guess we are single threaded here so this is ok. > > I wonder if this might be useful to put in libxlu alongside the disk > parser? The question is whether anything else wants to parse xm style > vscsi strings, like libvirt perhaps. (arguably a bunch of these parsers > here have that property and belong in libxlu). I was wondering how libvirt support for vscsi would look like, looks like it has to reimplement all that parsing to fill struct libxl_domain_config. Maybe that parser could be generic, depends how external callers deal with such parsing today. > > + pdev = vscsi_trim_string(pdev); > > + vdev = vscsi_trim_string(vdev); > > + > > + if (strncmp(pdev, "/dev/", 5) == 0) { > > +#ifdef __linux__ > > Urk. Normally we do this by splitting by files, e.g. into a linux > version and a nop version and compiling the appropriate one. > > This seems like a candidate for libxlu too. We probably need to have a > think about how much of this stuff is done in libxl too -- e.g. for > disks we pass the pdev in as a string (path) and interpret it there. > That allows us to choose between alternative backend providers etc. Also > libxl is a slightly more palatable place to deal with Linux vs non-Linux > bits. I think that might be the right answer here too. There is a libxl_linux.c file, but thats for libxl not xl. So we went with the ifdef for the time being. > > @@ -1242,6 +1360,56 @@ static void parse_config_data(const char > > *config_source, > > } > > } > > > > + if (!xlu_cfg_get_list(config, "vscsi", &vscsis, 0, 0)) { > > + int cnt_vscsi_devs = 0; > > + d_config->num_vscsis = 0; > > + d_config->vscsis = NULL; > > + while ((buf = xlu_cfg_get_listitem (vscsis, cnt_vscsi_devs)) != > > NULL) { > > + libxl_vscsi_dev vscsi_dev = { }; > > + libxl_device_vscsi vscsi_host = { }; > > + libxl_device_vscsi *host; > > + char *tmp_buf; > > + int num_vscsis, host_found = 0; > > + > > + /* > > + * #1: parse the devspec and place it in temporary host+dev > > part > > + * #2: find existing vscsi_host with number v_hst > > + * if found, append the vscsi_dev to this vscsi_host > > + * #3: otherwise, create new vscsi_host and append vscsi_dev > > + * Note: v_hst does not represent the index named "num_vscsis", > > + * it is a private index used just in the config file > > Would all be a lot simpler if a libxl_device_vscsi was a single device > and all the worrying about which bus was which was in libxl As said above, domU.cfg code like this may serve a purpose. Devices are grouped per vhost (0: and 1: in this example): ### vscsi = [ ### '/dev/sdm, 0:0:0:0', ### '/dev/sdn, 0:0:0:1', ### '/dev/sdo, 0:0:1:0', ### '/dev/sdp, 0:2:0:1 ', ### '/dev/sdq, 1:0:0:0, feature-host ', ### '/dev/sdr, 1:1:1:0, feature-host', ### ] Thanks for the quick review! Olaf _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |