[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] Re: [PATCH 6 of 6] xl: implement multifunction device PCI pass-through
On Wed, 2010-08-04 at 16:47 +0100, Stefano Stabellini wrote: > On Wed, 4 Aug 2010, Gianni Tedesco (3P) wrote: > > On Wed, 2010-08-04 at 15:52 +0100, Stefano Stabellini wrote: > > > On Mon, 2 Aug 2010, Gianni Tedesco (3P) wrote: > > > > tools/libxl/libxl.h | 1 + > > > > tools/libxl/libxl_pci.c | 118 > > > > ++++++++++++++++++++++++++++++++++++++++++++--- > > > > 2 files changed, 111 insertions(+), 8 deletions(-) > > > > > > > > > > > > # HG changeset patch > > > > # User Gianni Tedesco <gianni.tedesco@xxxxxxxxxx> > > > > # Date 1280760698 -3600 > > > > # Branch pci-patches-v3 > > > > # Node ID 6379343447e58cc7b83e27beba9c35e62c232d9d > > > > # Parent eb8ee481bc063b18b39feaa76d9b757331ed3a9f > > > > xl: implement multifunction device PCI pass-through > > > > > > > > Implement PCI pass-through for multi-function devices. The supported BDF > > > > notation is: BB:DD.* - therefore passing-through a subset of functions > > > > or > > > > remapping the function numbers is not supported. Largely because I > > > > don't see > > > > how that can ever be safe or sane (perhaps for SR-IOV cards?) > > > > > > > > Letting qemu automatically select the virtual slot is not supported > > > > since I > > > > don't believe that qemu-dm can actually handle that case. > > > > > > > > Signed-off-by: Gianni Tedesco <gianni.tedesco@xxxxxxxxxx> > > > > > > > > diff -r eb8ee481bc06 -r 6379343447e5 tools/libxl/libxl.h > > > > --- a/tools/libxl/libxl.h Mon Aug 02 15:47:38 2010 +0100 > > > > +++ b/tools/libxl/libxl.h Mon Aug 02 15:51:38 2010 +0100 > > > > @@ -308,6 +308,7 @@ typedef struct { > > > > unsigned int vdevfn; > > > > bool msitranslate; > > > > bool power_mgmt; > > > > + bool multifunction; > > > > } libxl_device_pci; > > > > > > > > enum { > > > > diff -r eb8ee481bc06 -r 6379343447e5 tools/libxl/libxl_pci.c > > > > --- a/tools/libxl/libxl_pci.c Mon Aug 02 15:47:38 2010 +0100 > > > > +++ b/tools/libxl/libxl_pci.c Mon Aug 02 15:51:38 2010 +0100 > > > > @@ -136,8 +136,13 @@ int libxl_device_pci_parse_bdf(libxl_ctx > > > > break; > > > > } > > > > *ptr = '\0'; > > > > - if ( hex_convert(tok, &func, 0x7) ) > > > > - goto parse_error; > > > > + if ( !strcmp(tok, "*") ) { > > > > + pcidev->multifunction = 1; > > > > + }else{ > > > > + if ( hex_convert(tok, &func, 0x7) ) > > > > + goto parse_error; > > > > + pcidev->multifunction = 0; > > > > + } > > > > tok = ptr + 1; > > > > } > > > > break; > > > > > > Couldn't you add the func_mask somewhere in libxl_device_pci, instead of > > > calling pci_multifunction_check multilple times? > > > Would be possible to make the normal passthrough case just a subset of > > > the general multifunction passthrough case to simplify the code? > > > > Not sure what you mean by this, the parsing parts don't (and oughtn't) > > frob about with sysfs nodes etc. The pci_multifunction_check function is > > called once per device add and once per device remove iff the parser had > > seen XX:YY.* notation indicating multifunction devices. > > > > I mean we could have a mask instead of an interger in libxl_device_pci > to specify which functions should be assigned and in the single function > case the mask will have only one bit set. > At this point single function passthrough become just a case of > multifunction passthrough. > I am not entirely sure that this approach would produce better code > though. I see what you mean. I could set it to #define PCI_FUNCTION_EVERYTHING ~0 at parse-time then in multifunction_check unset all the non-existent functions. > > > > > > > > @@ -531,6 +536,50 @@ int libxl_device_pci_list_assignable(lib > > > > return 0; > > > > } > > > > > > > > +static int pci_multifunction_check(libxl_ctx *ctx, libxl_device_pci > > > > *pcidev, unsigned int *func_mask) > > > > +{ > > > > > > a comment on top of this function to explain what it is actually checking > > > would be nice > > > > This function checks that all functions of a device are bound to pciback > > driver. It also initialises a bit-mask of which function numbers are > > present on that device. > > > > I shall add the comment in next rev. > > > > > > + struct dirent *de; > > > > + DIR *dir; > > > > + > > > > + *func_mask = 0; > > > > + > > > > + dir = opendir(SYSFS_PCI_DEV); > > > > + if ( NULL == dir ) { > > > > + XL_LOG_ERRNO(ctx, XL_LOG_ERROR, "Couldn't open %s", > > > > SYSFS_PCI_DEV); > > > > + return -1; > > > > + } > > > > + > > > > + while( (de = readdir(dir)) ) { > > > > + unsigned dom, bus, dev, func; > > > > + struct stat st; > > > > + char *path; > > > > + > > > > + if ( sscanf(de->d_name, PCI_BDF, &dom, &bus, &dev, &func) != 4 > > > > ) > > > > + continue; > > > > + if ( pcidev->domain != dom ) > > > > + continue; > > > > + if ( pcidev->bus != bus ) > > > > + continue; > > > > + if ( pcidev->dev != dev ) > > > > + continue; > > > > + > > > > + path = libxl_sprintf(ctx, "%s/" PCI_BDF, SYSFS_PCIBACK_DRIVER, > > > > dom, bus, dev, func); > > > > + if ( lstat(path, &st) ) { > > > > + if ( errno == ENOENT ) > > > > + XL_LOG(ctx, XL_LOG_ERROR, PCI_BDF " is not assigned to > > > > pciback driver", > > > > + dom, bus, dev, func); > > > > + else > > > > + XL_LOG_ERRNO(ctx, XL_LOG_ERROR, "Couldn't lstat %s", > > > > path); > > > > + closedir(dir); > > > > + return -1; > > > > + } > > > > + (*func_mask) |= (1 << func); > > > > + } > > > > + > > > > + closedir(dir); > > > > + return 0; > > > > +} > > > > + > > > > static int pci_ins_check(libxl_ctx *ctx, uint32_t domid, const char > > > > *state, void *priv) > > > > { > > > > char *orig_state = priv; > > > > @@ -679,10 +728,32 @@ int libxl_device_pci_add(libxl_ctx *ctx, > > > > return rc; > > > > } > > > > > > > > + if ( pcidev->multifunction ) { > > > > + unsigned int i, orig_vdev, func_mask, rc = 0; > > > > + orig_vdev = pcidev->vdevfn & ~7U; > > > > + > > > > + if ( pci_multifunction_check(ctx, pcidev, &func_mask) ) > > > > + return ERROR_FAIL; > > > > + if ( !(pcidev->vdevfn >> 3) ) { > > > > + XL_LOG(ctx, XL_LOG_ERROR, "Must specify a v-slot for > > > > multi-function devices"); > > > > + return ERROR_FAIL; > > > > + } > > > > + > > > > + for(i = 7; i < 8; --i) { > > > > > > shouldn't this be for(i = 7; i >= 0; --i)? > > > > That will generate a compiler warning since unsigned int's are always >= > > 0. The code relies on an underflow condition. However, I could change it > > to int if you think that makes it clearer. > > > > yeah please do that OK > > > > + if ( (1 << i) & func_mask ) { > > > > + pcidev->func = i; > > > > + pcidev->vdevfn = orig_vdev | i; > > > > + if ( do_pci_add(ctx, domid, pcidev) ) > > > > + rc = ERROR_FAIL; > > > > + } > > > > + } > > > > + return rc; > > > > + } > > > > + > > > > return do_pci_add(ctx, domid, pcidev); > > > > } > > > > > > > > -int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid, > > > > libxl_device_pci *pcidev) > > > > +static int do_pci_remove(libxl_ctx *ctx, uint32_t domid, > > > > libxl_device_pci *pcidev) > > > > { > > > > libxl_device_pci *assigned; > > > > char *path; > > > > @@ -711,10 +782,15 @@ int libxl_device_pci_remove(libxl_ctx *c > > > > libxl_xs_write(ctx, XBT_NULL, path, PCI_BDF, pcidev->domain, > > > > pcidev->bus, pcidev->dev, pcidev->func); > > > > path = libxl_sprintf(ctx, > > > > "/local/domain/0/device-model/%d/command", domid); > > > > - xs_write(ctx->xsh, XBT_NULL, path, "pci-rem", > > > > strlen("pci-rem")); > > > > - if (libxl_wait_for_device_model(ctx, domid, "pci-removed", > > > > NULL, NULL) < 0) { > > > > - XL_LOG(ctx, XL_LOG_ERROR, "Device Model didn't respond in > > > > time"); > > > > - return ERROR_FAIL; > > > > + > > > > + /* Remove all functions at once atomically by only signalling > > > > + * device-model for function 0 */ > > > > + if ( !pcidev->multifunction || pcidev->func == 0 ) { > > > > + xs_write(ctx->xsh, XBT_NULL, path, "pci-rem", > > > > strlen("pci-rem")); > > > > + if (libxl_wait_for_device_model(ctx, domid, "pci-removed", > > > > NULL, NULL) < 0) { > > > > + XL_LOG(ctx, XL_LOG_ERROR, "Device Model didn't respond > > > > in time"); > > > > + return ERROR_FAIL; > > > > + } > > > > } > > > > path = libxl_sprintf(ctx, > > > > "/local/domain/0/device-model/%d/state", domid); > > > > xs_write(ctx->xsh, XBT_NULL, path, state, strlen(state)); > > > > @@ -769,7 +845,10 @@ skip1: > > > > fclose(f); > > > > } > > > > out: > > > > - libxl_device_pci_reset(ctx, pcidev->domain, pcidev->bus, > > > > pcidev->dev, pcidev->func); > > > > + /* don't do multiple resets while some functions are still passed > > > > through */ > > > > + if ( !pcidev->multifunction || pcidev->func == 0 ) { > > > > + libxl_device_pci_reset(ctx, pcidev->domain, pcidev->bus, > > > > pcidev->dev, pcidev->func); > > > > + } > > > > > > > > if (!libxl_is_stubdom(ctx, domid, NULL)) { > > > > rc = xc_deassign_device(ctx->xch, domid, pcidev->value); > > > > @@ -786,6 +865,29 @@ out: > > > > return 0; > > > > } > > > > > > > > +int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid, > > > > libxl_device_pci *pcidev) > > > > +{ > > > > + if ( pcidev->multifunction ) { > > > > + unsigned int i, orig_vdev, func_mask, rc; > > > > + orig_vdev = pcidev->vdevfn & ~7U; > > > > + > > > > + if ( pci_multifunction_check(ctx, pcidev, &func_mask) ) > > > > + return ERROR_FAIL; > > > > + > > > > + for(i = 7; i < 8; --i) { > > > > > > same here > > > > > > > + if ( (1 << i) & func_mask ) { > > > > + pcidev->func = i; > > > > + pcidev->vdevfn = orig_vdev | i; > > > > + if ( do_pci_remove(ctx, domid, pcidev) ) > > > > + rc = ERROR_FAIL; > > > > + } > > > > + } > > > > + return rc; > > > > + } > > > > + > > > > + return do_pci_remove(ctx, domid, pcidev); > > > > +} > > > > + > > > > int libxl_device_pci_list_assigned(libxl_ctx *ctx, libxl_device_pci > > > > **list, uint32_t domid, int *num) > > > > { > > > > char *be_path, *num_devs, *xsdev, *xsvdevfn, *xsopts; > > > > > > > > > > > > > applied all the other patches in the series > > > > OK thanks. > > > > I will also need to handle cleanup in the error path of multifunction > > device addition as discussed. This can actually happen quite easily if > > function 0 alone is assigned to another domain because > > pci_multifunction_check() doesn't look in xenstore and also because of > > missing safety checks. Specifically, when adding a non-multifunction > > device we need a check to make sure that the physical device really has > > only one function. > > > > I am thinking that we probably ought to scrap the XX:YY.* notation and > > enforce that function is ALWAYS set to 0. Then when adding/removing the > > device we should check for additional functions and handle them > > transparently to the user. As I said in the commit message I am unaware > > of any valid uses for separating functions but I need to look at how > > SR-IOV works. > > > > > > With SR-IOV devices you can easily have valid BDFs like 0000:05:1c.2, so > enforcing that function is always 0 is not a good idea. I will have to look in to that and try come up something that is safe in all such cases. For now I will just re-work the patch along the lines you mentioned. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |