
Re: [Xen-devel] [PATCH] libxl: add support for pvscsi, iteration 1



On Fri, 2014-05-02 at 17:54 +0200, Olaf Hering wrote:
> On Fri, May 02, Ian Campbell wrote:
> 
> > > The backend driver uses a single_host:many_devices notation to manage
> > > domU devices. The xenstore layout looks like this:
> > 
> > Please can you add ~/device/vscsi/$DEVID/* and
> > ~/backend/vscsi/$DOMID/$DEVID/* to docs/misc/xenstore-paths.markdown.
> > 
> > If you were inclined to improve xen/include/public/io/vscsiif.h by using
> > some of the content from here then that would be awesome too.
> 
> I think it's a good idea to document what the current backend/frontend
> code does, as this will also aid with review of the libxl changes.
> 
> > > This is a challenge for libxl because libxl currently handles only
> > > single_host:single_device in its device helper functions.
> > 
> > This is a bit like pci I suppose?
> 
> Maybe. But today Ondrej found a way to attach additional devices;
> detaching them will most likely work as well. Basically one creates the
> new vscsi-devs/dev-N/ subdir with state==1, then writes state==7 to the
> backend. Similarly for detach: state==5 to the dev-N, then state==7 to
> the backend.

This sounds remarkably similar to how the pci stuff works. Glad there is
at least some sort of consistency.
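For illustration, the attach sequence described above could be expressed with
plain libxenstore calls roughly like this; the vscsi-devs/dev-N layout and the
p-dev/v-dev key names are assumptions taken from this thread rather than
confirmed against vscsiif.h, and waiting for the backend to acknowledge the
state change is omitted:

#include <stdio.h>
#include <string.h>
#include <xenstore.h>

/* Hypothetical helper: hot-add device N under an existing vscsi backend
 * directory "be" (e.g. /local/domain/0/backend/vscsi/DOMID/0).
 * State 1 is XenbusStateInitialising, 7 is XenbusStateReconfiguring. */
static int vscsi_hotadd(struct xs_handle *xsh, const char *be, int devid,
                        const char *pdev, const char *vdev)
{
    char path[256];

    /* create the new vscsi-devs/dev-N/ entries with state == 1 ... */
    snprintf(path, sizeof(path), "%s/vscsi-devs/dev-%d/p-dev", be, devid);
    if (!xs_write(xsh, XBT_NULL, path, pdev, strlen(pdev))) return -1;
    snprintf(path, sizeof(path), "%s/vscsi-devs/dev-%d/v-dev", be, devid);
    if (!xs_write(xsh, XBT_NULL, path, vdev, strlen(vdev))) return -1;
    snprintf(path, sizeof(path), "%s/vscsi-devs/dev-%d/state", be, devid);
    if (!xs_write(xsh, XBT_NULL, path, "1", 1)) return -1;

    /* ... then kick the backend by writing state == 7 to its own node */
    snprintf(path, sizeof(path), "%s/state", be);
    return xs_write(xsh, XBT_NULL, path, "7", 1) ? 0 : -1;
}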

> 
> > > Due to this
> > > limitation in libxl, scsi-detach will remove the whole virtual scsi host
> > > with all its virtual devices, instead of just the requested single
> > > virtual device.
> > 
> > That's rather unfortunate!
> > 
> > Are multiple <vhost> allowed/supported? e.g. could I (or better, libxl
> > by default) set up a device-per-host situation to alleviate this?
> > 
> > For pcifront/back we support bus reconfiguration (via
> > XenbusStateReconfigur{ing,ed}) which is how this works there. Does vscsi
> > not support anything like that?
> 
> Many vhosts are supported; I just did not realize that reconfigure was
> used here as well. So maybe libxl does not need further changes. I will
> hack some code to show how attach/detach works in theory.

Brilliant!
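For reference, the numeric states mentioned in this exchange are the standard
XenbusState values from xen/include/public/io/xenbus.h:

enum xenbus_state {
    XenbusStateUnknown       = 0,
    XenbusStateInitialising  = 1,   /* "state==1" on the new dev-N node */
    XenbusStateInitWait      = 2,
    XenbusStateInitialised   = 3,
    XenbusStateConnected     = 4,
    XenbusStateClosing       = 5,   /* "state==5" to detach a dev-N node */
    XenbusStateClosed        = 6,
    XenbusStateReconfiguring = 7,   /* "state==7" kicks the backend */
    XenbusStateReconfigured  = 8
};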

> > > To support the pdev notation '/dev/$device' it is required to parse
> > > sysfs to translate the path to HST:CHN:TGT:LUN notation. In its current
> > > variant /sys/dev/ is used, which is available since at least Linux 3.0.
> > > Support for dom0 kernels which lack this interface will be added later.
> > 
> > Are there any such kernels which anyone cares about any more?
> 
> I have to check when /sys/dev was introduced. In the end it would be
> nice to support older dom0 kernels as well.

Certainly it would be nice and it's up to you how nice you are
feeling ;-)
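As an aside, a minimal sketch of the /sys/dev based translation described in
the patch text above, assuming the usual Linux layout where a SCSI disk's
device symlink ends in its h:c:t:l name; the patch's actual code may well
differ:

#include <stdio.h>
#include <libgen.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major()/minor() */

/* Hypothetical helper: translate /dev/$device into hst:chn:tgt:lun by
 * following /sys/dev/block/<major>:<minor>/device. */
static int pdev_to_hctl(const char *pdev, unsigned int *hst, unsigned int *chn,
                        unsigned int *tgt, unsigned int *lun)
{
    struct stat st;
    char link[256], dest[256];
    ssize_t n;

    if (stat(pdev, &st) < 0 || !S_ISBLK(st.st_mode))
        return -1;

    snprintf(link, sizeof(link), "/sys/dev/block/%u:%u/device",
             major(st.st_rdev), minor(st.st_rdev));
    n = readlink(link, dest, sizeof(dest) - 1);
    if (n < 0)
        return -1;
    dest[n] = '\0';

    /* for a SCSI disk the link target ends in the nexus, e.g. ".../2:0:0:0" */
    if (sscanf(basename(dest), "%u:%u:%u:%u", hst, chn, tgt, lun) != 4)
        return -1;
    return 0;
}

Older dom0 kernels without /sys/dev would presumably have to go via
/sys/block/<name>/device instead.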

> 
> > >  - xl scsi-detach will remove entire scsi hosts, unlike its xm counterpart.
> > 
> > What did xm do instead? Fail?
> 
> No, it will use the xenstore reconfigure state to indicate a change.

Let's do that ;-)


> > > +Right now only one option is recognized: feature-host.
> > What does it do?
> 
> I have to find out ;-) 

:-D

> Last time I checked the command was passed as is from domU to dom0, but
> I don't really know, as it is not documented.

You have my sympathies.

> > > +=item B<scsi-attach> I<domain-id> I<pdev> I<vdev> I<[,feature-host]>
> > 
> > Is the comma in I<[,feature-host]> literally required, as in
> >     xl scsi-attach mydom 0:0:0:0 0:0:0:0 ,feature-host
> > ?
> 
> This is a typo I noticed after submission.
> Not sure how to specify the option. Either as '0:0:0:0,feature-host' or
> as two separate strings.

The right answer probably depends on the meaning of feature-host and how
closely bound to the concept of a vdev it is, but by default I'd say a
feature ought to be separate from the vdev.
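For instance, keeping the option separate would give something like (purely
illustrative, the final syntax was not settled in this thread):

    xl scsi-attach mydom /dev/sdm 0:0:0:0 feature-host

rather than gluing it onto the vdev with a comma.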

> > I suppose the num_vscsi_devs here is an artefact of the shortcoming you
> > mentioned before. The way PCI handles this is to create the "bus" (aka
> > vhost here) either when the first device is attached or in the higher
> > level code before it calls the individual device adds (i.e. this
> > function), or something (I must confess my memory is a bit fuzzy,
> > libxl__create_pci_backend might be a place to look). Having done that
> > then libxl_device_pci is just a single device.
> > 
> > I think you should look to do something similar here, so that each
> > libxl_device_vscsi is actually a single device. The alternative is some
> > sort of libxl_bus_vscsi thing, which would be unconventional compared
> > with everything else.
> 
> The thing with host:dev[,dev[,dev]] is that within the guest something
> may rely on the way SCSI devices are presented (ordering, naming, no
> idea). Surely the vscsi= and scsi-attach/detach parsing code could do
> some sort of translation from a group of devices-per-vhost into a single
> vhost:dev.

Sorry, I didn't mean to suggest not obeying the user's stated preference for
host:dev etc.

>  But now that we have found that reconfiguration is used in
> xenstore to attach/detach devices it may be possible to reuse existing
> libxl helpers while preserving existing domU.cfg files and backend
> drivers.

Great.


> > > +libxl_device_vscsi *libxl_device_vscsi_list(libxl_ctx *ctx, uint32_t domid, int *num)
> > > +{
> > > +    GC_INIT(ctx);
> > > +
> > > +    libxl_device_vscsi *vscsi_hosts = NULL;
> > > +    char *fe_path;
> > > +    char **dir;
> > > +    unsigned int ndirs = 0;
> > > +
> > > +    fe_path = libxl__sprintf(gc, "%s/device/vscsi", libxl__xs_get_dompath(gc, domid));
> > 
> > Need to take care trusting this too much...
> 
> Not sure what you mean by this sentence?

This is a frontend path, a malicious guest might play games with the
content to try and get the toolstack to do something (e.g. redirecting
the backend path to a directory which the guest controls instead of the
toolstack).
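By way of illustration only (not libxl's actual helpers), the kind of sanity
check this implies when following a frontend-provided backend path might look
like this; the path layout is assumed:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <xenstore.h>

static char *trusted_backend_path(struct xs_handle *xsh, const char *fe_path,
                                  uint32_t backend_domid)
{
    char node[256], expect[64];
    unsigned int len;
    char *be;

    snprintf(node, sizeof(node), "%s/backend", fe_path);
    be = xs_read(xsh, XBT_NULL, node, &len);
    if (!be)
        return NULL;

    /* The guest can write anything into its frontend directory; only
     * accept a path that points into the backend domain's vscsi area. */
    snprintf(expect, sizeof(expect), "/local/domain/%u/backend/vscsi/",
             (unsigned int)backend_domid);
    if (strncmp(be, expect, strlen(expect)) != 0) {
        free(be);
        return NULL;
    }
    return be;   /* caller frees */
}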

> > > +}
> > > +
> > > +static void parse_vscsi_config(libxl_device_vscsi *vscsi_host,
> > > +                              libxl_vscsi_dev *vscsi_dev,
> > > +                              char *buf)
> > > +{
> > > +    char *pdev, *vdev, *fhost;
> > > +    unsigned int hst, chn, tgt, lun;
> > > +
> > > +    libxl_device_vscsi_init(vscsi_host);
> > > +    pdev = strtok(buf, ",");
> > > +    vdev = strtok(NULL, ",");
> > > +    fhost = strtok(NULL, ",");
> > 
> > I guess we are single threaded here so this is ok.
> > 
> > I wonder if this might be useful to put in libxlu alongside the disk
> > parser? The question is whether anything else wants to parse xm style
> > vscsi strings, like libvirt perhaps. (arguably a bunch of these parsers
> > here have that property and belong in libxlu).
> 
> I was wondering what libvirt support for vscsi would look like; it looks
> like it has to reimplement all that parsing to fill struct
> libxl_domain_config. Maybe that parser could be generic, depending on how
> external callers deal with such parsing today.

FWIW the reason for libxlu_disk was precisely libvirt, although I don't
know if it actually uses it or not. 
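If this parsing does migrate into libxlu, the single-threaded assumption goes
away; a reentrant version of the same split is trivial (sketch only, field
meanings as in the quoted hunk):

#include <string.h>

/* Same tokenising as the quoted parse_vscsi_config(), but with strtok_r
 * so it would be safe in library code called from threaded programs.
 * Fields are "pdev, vdev[, feature-host]". */
static void split_vscsi_spec(char *buf, char **pdev, char **vdev, char **fhost)
{
    char *saveptr = NULL;

    *pdev  = strtok_r(buf,  ",", &saveptr);
    *vdev  = strtok_r(NULL, ",", &saveptr);
    *fhost = strtok_r(NULL, ",", &saveptr);
}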

> 
> > > +    pdev = vscsi_trim_string(pdev);
> > > +    vdev = vscsi_trim_string(vdev);
> > > +
> > > +    if (strncmp(pdev, "/dev/", 5) == 0) {
> > > +#ifdef __linux__
> > 
> > Urk. Normally we do this by splitting by files, e.g. into a linux
> > version and a nop version and compiling the appropriate one.
> > 
> > This seems like a candidate for libxlu too. We probably need to have a
> > think about how much of this stuff is done in libxl too -- e.g. for
> > disks we pass the pdev in as a string (path) and interpret it there.
> > That allows us to choose between alternative backend providers etc. Also
> > libxl is a slightly more palatable place to deal with Linux vs non-Linux
> > bits. I think that might be the right answer here too.
> 
> There is a libxl_linux.c file, but that's for libxl, not xl. So we went
> with the ifdef for the time being.

I think the right answer is likely to be to put this parsing code, or at
least the translation code, into libxl anyhow.
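A rough sketch of the split-by-files shape (the names here are invented for
illustration; the real libxl keeps OS-specific code in libxl_linux.c and
friends rather than behind #ifdef in common code):

/* vscsi_osdep.h -- common declaration used by both variants */
int vscsi_translate_pdev(const char *pdev, unsigned int hctl[4]);

/* vscsi_linux.c would implement it by walking /sys/dev as sketched earlier;
 * vscsi_noop.c, compiled on everything else, simply refuses: */
#include <errno.h>

int vscsi_translate_pdev(const char *pdev, unsigned int hctl[4])
{
    (void)pdev;
    (void)hctl;
    return -ENOSYS;   /* only h:c:t:l pdev specs are usable here */
}

That keeps the platform choice in the build system rather than in the parser.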


> > > @@ -1242,6 +1360,56 @@ static void parse_config_data(const char *config_source,
> > >          }
> > >      }
> > >  
> > > +    if (!xlu_cfg_get_list(config, "vscsi", &vscsis, 0, 0)) {
> > > +        int cnt_vscsi_devs = 0;
> > > +        d_config->num_vscsis = 0;
> > > +        d_config->vscsis = NULL;
> > > +        while ((buf = xlu_cfg_get_listitem (vscsis, cnt_vscsi_devs)) != NULL) {
> > > +            libxl_vscsi_dev vscsi_dev = { };
> > > +            libxl_device_vscsi vscsi_host = { };
> > > +            libxl_device_vscsi *host;
> > > +            char *tmp_buf;
> > > +            int num_vscsis, host_found = 0;
> > > +
> > > +            /*
> > > +             * #1: parse the devspec and place it in temporary host+dev part
> > > +             * #2: find existing vscsi_host with number v_hst
> > > +             *     if found, append the vscsi_dev to this vscsi_host
> > > +             * #3: otherwise, create new vscsi_host and append vscsi_dev
> > > +             * Note: v_hst does not represent the index named "num_vscsis",
> > > +             *       it is a private index used just in the config file
> > 
> > It would all be a lot simpler if a libxl_device_vscsi was a single device
> > and all the worrying about which bus was which was in libxl.
> 
> As said above, domU.cfg code like this may serve a purpose. Devices are
> grouped per vhost (0: and 1: in this example):

That's fine, isn't it? You would create a list of 6 libxl_device_vscsi's,
and four of them would have host == 0 and two of them have host == 1.
libxl should be written to cope with that, by creating the host on
demand when the first device demands it, or whatever.

> ### vscsi = [
> ###     '/dev/sdm, 0:0:0:0',
> ###     '/dev/sdn, 0:0:0:1',
> ###     '/dev/sdo, 0:0:1:0',
> ###     '/dev/sdp, 0:2:0:1 ',
> ###     '/dev/sdq, 1:0:0:0, feature-host  ',
> ###     '/dev/sdr, 1:1:1:0, feature-host',
> ### ]
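As a concrete, entirely hypothetical illustration of the find-or-append step
from the quoted comment: applied to the six entries above it yields two host
records, v_hst 0 with four devices and v_hst 1 with two. Types and names are
invented for the sketch; the patch's real types come from the libxl IDL.

#include <stdlib.h>

struct vscsi_dev  { unsigned hst, chn, tgt, lun; };
struct vscsi_host {
    unsigned v_hst;
    struct vscsi_dev *devs;
    int num_devs;
};

/* Steps #2/#3 of the quoted comment: look up the host numbered v_hst,
 * creating it if this is the first device that names it.  Error handling
 * is omitted for brevity. */
static struct vscsi_host *find_or_create_host(struct vscsi_host **hosts,
                                              int *num_hosts, unsigned v_hst)
{
    int i;

    for (i = 0; i < *num_hosts; i++)
        if ((*hosts)[i].v_hst == v_hst)
            return &(*hosts)[i];

    *hosts = realloc(*hosts, (*num_hosts + 1) * sizeof(**hosts));
    (*hosts)[*num_hosts] = (struct vscsi_host){ .v_hst = v_hst };
    return &(*hosts)[(*num_hosts)++];
}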
> 
> 
> Thanks for the quick review!
> 
> Olaf



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

