
[Xen-devel] Re: [DOM0 KERNELS] pciback: Fix SR-IOV VF passthrough



>>> Keir Fraser <keir.fraser@xxxxxxxxxxxxx> 26.02.10 18:25 >>>
>Vendor/device and BAR fields in a VF's host-level PCI config space are dummy
>and must always be virtualised/emulated. Implement this in pciback by always
>extracting the values installed in dom0 kernel's own PCI structures, rather
>than interrogating the underlying PCI config space directly.
>
>AFAIK, this patch should apply to any kernel that implements pciback: That
>includes pv_ops, SLES, and the XS/XCP kernels. It should be applied to all
>of them. It is already applied to linux-2.6.18-xen.hg as 998:693c40564c8d.
>
>Signed-off-by: Keir Fraser <keir.fraser@xxxxxxxxxx>

Some parts of this we had been given by Intel, but some were also
implemented differently there. I'm reproducing the patch below, and
I would appreciate clarification on the differences in the bar_read()/
bar_write()/rom_write() vs. read_dev_bar() modifications.

In any case I would think that the command_write() change would
be generally applicable.

Jan

Subject: guest SR-IOV support for PV guest

These changes allow a PV guest to use a Virtual Function (VF). The vendor
and device ID registers in a VF's config space read as 0xffff, which is
invalid and is ignored by the PCI device scan. The SR-IOV code fixes up the
values in 'struct pci_dev', and using those values presents the correct VID
and DID to the PV guest kernel.

The command register in the VF's config space is read-only 0, so the MMIO
enable bit has to be emulated (a VF only uses MMIO resources) for the PV
kernel to work properly.

--- head-2009-07-28.orig/drivers/xen/pciback/conf_space_header.c        2009-07-28 12:01:32.000000000 +0200
+++ head-2009-07-28/drivers/xen/pciback/conf_space_header.c     2009-07-29 11:03:07.000000000 +0200
@@ -18,6 +18,25 @@ struct pci_bar_info {
 #define is_enable_cmd(value) ((value)&(PCI_COMMAND_MEMORY|PCI_COMMAND_IO))
 #define is_master_cmd(value) ((value)&PCI_COMMAND_MASTER)
 
+static int command_read(struct pci_dev *dev, int offset, u16 *value, void *data)
+{
+       int i;
+       int ret;
+
+       ret = pciback_read_config_word(dev, offset, value, data);
+       if (!atomic_read(&dev->enable_cnt))
+               return ret;
+
+       for (i = 0; i < PCI_ROM_RESOURCE; i++) {
+               if (dev->resource[i].flags & IORESOURCE_IO)
+                       *value |= PCI_COMMAND_IO;
+               if (dev->resource[i].flags & IORESOURCE_MEM)
+                       *value |= PCI_COMMAND_MEMORY;
+       }
+
+       return ret;
+}
+
 static int command_write(struct pci_dev *dev, int offset, u16 value, void *data)
 {
        int err;
@@ -141,10 +160,26 @@ static inline void read_dev_bar(struct p
                                struct pci_bar_info *bar_info, int offset,
                                u32 len_mask)
 {
-       pci_read_config_dword(dev, offset, &bar_info->val);
-       pci_write_config_dword(dev, offset, len_mask);
-       pci_read_config_dword(dev, offset, &bar_info->len_val);
-       pci_write_config_dword(dev, offset, bar_info->val);
+       int     pos;
+       struct resource *res = dev->resource;
+
+       if (offset == PCI_ROM_ADDRESS || offset == PCI_ROM_ADDRESS1)
+               pos = PCI_ROM_RESOURCE;
+       else {
+               pos = (offset - PCI_BASE_ADDRESS_0) / 4;
+               if (pos && ((res[pos - 1].flags & (PCI_BASE_ADDRESS_SPACE |
+                               PCI_BASE_ADDRESS_MEM_TYPE_MASK)) ==
+                          (PCI_BASE_ADDRESS_SPACE_MEMORY |
+                               PCI_BASE_ADDRESS_MEM_TYPE_64))) {
+                       bar_info->val = res[pos - 1].start >> 32;
+                       bar_info->len_val = res[pos - 1].end >> 32;
+                       return;
+               }
+       }
+
+       bar_info->val = res[pos].start |
+                       (res[pos].flags & PCI_REGION_FLAG_MASK);
+       bar_info->len_val = res[pos].end - res[pos].start + 1;
 }
 
 static void *bar_init(struct pci_dev *dev, int offset)
@@ -185,6 +220,22 @@ static void bar_release(struct pci_dev *
        kfree(data);
 }
 
+static int pciback_read_vendor(struct pci_dev *dev, int offset,
+                              u16 *value, void *data)
+{
+       *value = dev->vendor;
+
+       return 0;
+}
+
+static int pciback_read_device(struct pci_dev *dev, int offset,
+                              u16 *value, void *data)
+{
+       *value = dev->device;
+
+       return 0;
+}
+
 static int interrupt_read(struct pci_dev *dev, int offset, u8 * value,
                          void *data)
 {
@@ -212,9 +263,19 @@ static int bist_write(struct pci_dev *de
 
 static const struct config_field header_common[] = {
        {
+        .offset    = PCI_VENDOR_ID,
+        .size      = 2,
+        .u.w.read  = pciback_read_vendor,
+       },
+       {
+        .offset    = PCI_DEVICE_ID,
+        .size      = 2,
+        .u.w.read  = pciback_read_device,
+       },
+       {
         .offset    = PCI_COMMAND,
         .size      = 2,
-        .u.w.read  = pciback_read_config_word,
+        .u.w.read  = command_read,
         .u.w.write = command_write,
        },
        {

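For anyone who wants to poke at the command register emulation outside the
kernel, here is a minimal user-space sketch of what command_read() in the
patch does. It is only an illustration: struct mock_dev and
emulated_command() are hypothetical stand-ins for struct pci_dev and the
pciback handler, and hw_value models whatever the VF's config space returns.
The constants carry the values found in the Linux headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the Linux headers (pci_regs.h, ioport.h). */
#define PCI_COMMAND_IO      0x1
#define PCI_COMMAND_MEMORY  0x2
#define IORESOURCE_IO       0x00000100UL
#define IORESOURCE_MEM      0x00000200UL
#define NUM_BARS            6   /* same value as PCI_ROM_RESOURCE */

/* Hypothetical stand-in for the struct pci_dev fields used by the patch. */
struct mock_dev {
    int enable_cnt;                     /* models atomic_read(&dev->enable_cnt) */
    unsigned long res_flags[NUM_BARS];  /* models dev->resource[i].flags */
};

/*
 * Start from the value read from hardware (0 in the enable bits for a VF)
 * and, once dom0 has enabled the device, report the decode bits that match
 * the resource types the device actually has.
 */
static uint16_t emulated_command(const struct mock_dev *dev, uint16_t hw_value)
{
    uint16_t value = hw_value;
    int i;

    assert(dev != NULL);

    if (!dev->enable_cnt)
        return value;   /* device not enabled: pass the hardware value through */

    for (i = 0; i < NUM_BARS; i++) {
        if (dev->res_flags[i] & IORESOURCE_IO)
            value |= PCI_COMMAND_IO;
        if (dev->res_flags[i] & IORESOURCE_MEM)
            value |= PCI_COMMAND_MEMORY;
    }

    return value;
}
```

With a single memory BAR and enable_cnt set, the sketch reports
PCI_COMMAND_MEMORY even though the modelled hardware value is 0; with
enable_cnt at 0 the hardware value is returned unchanged.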

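The read_dev_bar() hunk can be sketched the same way. The code below mirrors
the patch's mapping from a config space offset to a resource index and the
way val/len_val are filled from the resource rather than from hardware. The
types mock_res and mock_bar_info and the function emulate_bar() are
hypothetical names for this illustration only; the register offsets and
masks are the ones from pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and masks as defined in the Linux pci_regs.h / pci.h. */
#define PCI_BASE_ADDRESS_0              0x10
#define PCI_ROM_ADDRESS                 0x30
#define PCI_ROM_ADDRESS1                0x38
#define PCI_ROM_RESOURCE                6
#define PCI_REGION_FLAG_MASK            0x0fUL
#define PCI_BASE_ADDRESS_SPACE          0x01UL
#define PCI_BASE_ADDRESS_SPACE_MEMORY   0x00UL
#define PCI_BASE_ADDRESS_MEM_TYPE_MASK  0x06UL
#define PCI_BASE_ADDRESS_MEM_TYPE_64    0x04UL

/* Hypothetical stand-ins for struct resource and struct pci_bar_info. */
struct mock_res { uint64_t start, end; unsigned long flags; };
struct mock_bar_info { uint32_t val, len_val; };

/* True if the resource describes a 64-bit memory BAR. */
static int is_64bit_mem(const struct mock_res *r)
{
    return (r->flags & (PCI_BASE_ADDRESS_SPACE |
                        PCI_BASE_ADDRESS_MEM_TYPE_MASK)) ==
           (PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64);
}

/* res must have PCI_ROM_RESOURCE + 1 entries (six BARs plus the ROM). */
static void emulate_bar(const struct mock_res *res, int offset,
                        struct mock_bar_info *out)
{
    int pos;

    if (offset == PCI_ROM_ADDRESS || offset == PCI_ROM_ADDRESS1) {
        pos = PCI_ROM_RESOURCE;
    } else {
        pos = (offset - PCI_BASE_ADDRESS_0) / 4;
        assert(pos >= 0 && pos < PCI_ROM_RESOURCE);
        /* Upper half of a 64-bit BAR: take the high dwords of the
         * previous resource, as the patch does. */
        if (pos && is_64bit_mem(&res[pos - 1])) {
            out->val = (uint32_t)(res[pos - 1].start >> 32);
            out->len_val = (uint32_t)(res[pos - 1].end >> 32);
            return;
        }
    }

    /* Lower dword of the address plus the BAR type flags. */
    out->val = (uint32_t)res[pos].start |
               (uint32_t)(res[pos].flags & PCI_REGION_FLAG_MASK);
    /* Region size in bytes, i.e. end - start + 1. */
    out->len_val = (uint32_t)(res[pos].end - res[pos].start + 1);
}
```

Note that len_val here is the plain region size, as in the patch, and not the
mask a real device would return after an all-ones write to the BAR.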

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

