[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] dom0 linux: Reassign memory resources to device for pci passthrough.


  • To: "Yuji Shimada" <shimada-yxb@xxxxxxxxxxxxxxx>
  • From: "Neo Jia" <neojia@xxxxxxxxx>
  • Date: Wed, 8 Oct 2008 18:04:56 -0700
  • Cc: Keir Fraser <keir.fraser@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 08 Oct 2008 18:05:22 -0700
  • Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:content-transfer-encoding:content-disposition :references; b=Zuzi3sH22MwU3o5UJuFRdfTZOGHSQ9gZAHgBhOjTz3bnilBwfJzmwIW0h3rXw7HDy0 j7BvIaV9/45cDu2IBB+Fnw4+9cuDmz6Q5xvNTMD0lVbuS1o3bK1IEfcz7R0CffBEJcB6 Q+MNYZoIV7iApU3utywlOAaDh3IgcvFuNvkKs=
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

So, for a I/O device, if we don't see the error message you mentioned
before, it probably don't need this, right?

Thanks,
Neo

On Wed, Oct 8, 2008 at 4:43 PM, Yuji Shimada
<shimada-yxb@xxxxxxxxxxxxxxx> wrote:
> This patch adds the function that reassign page-aligned memory
> resources, to dom0 linux. The function is useful when we assign I/O
> device to HVM domain using pci passthrough.
>
>
> When we assign a device to HVM domain using pci passthrough,
> the device needs to be assigned page-aligned memory resources. If the
> memory resource is not page-aligned, following error occurs.
>
>    Error: pci: 0000:00:1d.7: non-page-aligned MMIO BAR found.
>
> On many system, BIOS assigns memory resources to the device and
> enables it. So my patch disables the device, and releases resources.
> Then it assigns page-aligned memory resource to the device.
>
> Note: we don't need to re-enable device, because guest software will
> enable it.
>
>
> To reassign resources, please add boot parameters of dom0 linux as follows.
>
>    reassign_resources reassigndev=00:1d.7,01:00.0
>
>        reassign_resources
>                        Enables reassigning resources.
>
>        reassigndev=    Specifies devices include I/O device and PCI-PCI
>                        bridge to reassign resources.  PCI-PCI bridge
>                        can be specified, if resource windows need to
>                        be expanded.
>
> You can easily improve the way of specifying device to reassign,
> changing the code of reassigndev.c.
>
>
> There is a similar function enabled by pci-mem-align boot
> parameter. Currently it is kept. But if many people agree with me, I'd
> like to remove it from dom0 linux, because there are two problems.
>
>    - pci-mem-align reassigns all devices' memory resources if they are
>      not page-aligned. This is not safe, because some devices are
>      used by firmware.
>    - pci-mem-align can't expand resource window of PCI-PCI bridge.
>
> Thanks,
> --
> Yuji Shimada
>
>
> Signed-off-by: Yuji Shimada <shimada-yxb@xxxxxxxxxxxxxxx>
>
> diff -r b54652ee29ef drivers/pci/Makefile
> --- a/drivers/pci/Makefile      Thu Oct 02 11:29:02 2008 +0100
> +++ b/drivers/pci/Makefile      Wed Oct 08 12:12:27 2008 +0900
> @@ -3,7 +3,8 @@
>  #
>
>  obj-y          += access.o bus.o probe.o remove.o pci.o quirks.o \
> -                       pci-driver.o search.o pci-sysfs.o rom.o setup-res.o
> +                       pci-driver.o search.o pci-sysfs.o rom.o setup-res.o \
> +                       reassigndev.o
>  obj-$(CONFIG_PROC_FS) += proc.o
>
>  # Build PCI Express stuff if needed
> diff -r b54652ee29ef drivers/pci/pci.h
> --- a/drivers/pci/pci.h Thu Oct 02 11:29:02 2008 +0100
> +++ b/drivers/pci/pci.h Wed Oct 08 12:12:27 2008 +0900
> @@ -99,3 +99,8 @@
>        return NULL;
>  }
>
> +#define ROUND_UP_TO_PAGESIZE(size) ((size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 
> 1))
> +
> +extern int reassign_resources;
> +extern int is_reassigndev(struct pci_dev *dev);
> +extern void pci_update_bridge(struct pci_dev *dev, int resno);
> diff -r b54652ee29ef drivers/pci/quirks.c
> --- a/drivers/pci/quirks.c      Thu Oct 02 11:29:02 2008 +0100
> +++ b/drivers/pci/quirks.c      Wed Oct 08 12:12:27 2008 +0900
> @@ -33,6 +33,19 @@
>  }
>  __setup("pci-mem-align", set_pci_mem_align);
>
> +
> +int reassign_resources = 0;
> +
> +static int __init set_reassign_resources(char *str)
> +{
> +       /* resources reassign on */
> +       reassign_resources = 1;
> +       printk(KERN_DEBUG "PCI: resource reassign ON.\n");
> +
> +       return 1;
> +}
> +__setup("reassign_resources", set_reassign_resources);
> +
>  /* This quirk function enables us to force all memory resources which are
>  * assigned to PCI devices, to be page-aligned.
>  */
> @@ -41,6 +54,42 @@
>        int i;
>        struct resource *r;
>        resource_size_t old_start;
> +
> +       if (reassign_resources) {
> +               if (dev->hdr_type == PCI_HEADER_TYPE_NORMAL &&
> +                   (dev->class >> 8) == PCI_CLASS_BRIDGE_HOST) {
> +                       /* PCI Host Bridge isn't a target device */
> +                       return;
> +               }
> +               if (is_reassigndev(dev)) {
> +                       printk(KERN_INFO
> +                               "PCI: Disable device and release resources"
> +                               " [%s].\n", pci_name(dev));
> +                       pci_disable_device(dev);
> +
> +                       for (i=0; i < PCI_NUM_RESOURCES; i++) {
> +                               r = &dev->resource[i];
> +                               if ((r == NULL) ||
> +                                  !(r->flags & IORESOURCE_MEM))
> +                                       continue;
> +
> +                               r->end = r->end - r->start;
> +                               r->start = 0;
> +
> +                               if (i < PCI_BRIDGE_RESOURCES) {
> +                                       pci_update_resource(dev, r, i);
> +                               } else if (i == 8 || i == 9) {
> +                                       /* need to update(clear) the 
> Base/Limit
> +                                        * register also, because PCI bridge 
> is
> +                                        * disabled and the resource is
> +                                        * released.
> +                                        */
> +                                       pci_update_bridge(dev, i);
> +                               }
> +                       }
> +               }
> +               return;
> +       }
>
>        if (!pci_mem_align)
>                return;
> diff -r b54652ee29ef drivers/pci/reassigndev.c
> --- /dev/null   Thu Jan 01 00:00:00 1970 +0000
> +++ b/drivers/pci/reassigndev.c Wed Oct 08 12:12:27 2008 +0900
> @@ -0,0 +1,81 @@
> +/*
> + * Copyright (c) 2008, NEC Corporation.
> + *      Taro Nichiden <t-nichiden at ab jp nec com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along 
> with
> + * this program; if not, write to the Free Software Foundation, Inc., 59 
> Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/pci.h>
> +#include <linux/string.h>
> +#include "pci.h"
> +
> +
> +#define        REASSIGNDEV_PARAM_MAX   (2048)
> +#define        TOKEN_MAX       (12)    /* "SSSS:BB:DD.F" length is 12 */
> +
> +static char param_reassigndev[REASSIGNDEV_PARAM_MAX] = {0};
> +
> +static int __init reassigndev_setup(char *str)
> +{
> +       strncpy(param_reassigndev, str, REASSIGNDEV_PARAM_MAX);
> +       param_reassigndev[REASSIGNDEV_PARAM_MAX - 1] = '\0';
> +       return 1;
> +}
> +__setup("reassigndev=", reassigndev_setup);
> +
> +int is_reassigndev(struct pci_dev *dev)
> +{
> +       char dev_str[TOKEN_MAX+1];
> +       int seg, bus, slot, func;
> +       int len;
> +       char *p, *next_str;
> +
> +       p = param_reassigndev;
> +       for (; p; p = next_str + 1) {
> +               next_str = strpbrk(p, ",");
> +               if (next_str) {
> +                       len = next_str - p;
> +               } else {
> +                       len = strlen(p);
> +               }
> +               if (len > 0 && len <= TOKEN_MAX) {
> +                       strncpy(dev_str, p, len);
> +                       *(dev_str + len) = '\0';
> +
> +                       if (sscanf(dev_str, "%x:%x:%x.%x",
> +                               &seg, &bus, &slot, &func) != 4) {
> +                               if (sscanf(dev_str, "%x:%x.%x",
> +                                       &bus, &slot, &func) == 3) {
> +                                       seg = 0;
> +                               } else {
> +                                       /* failed to scan strings */
> +                                       seg = -1;
> +                                       bus = -1;
> +                               }
> +                       }
> +                       if (seg == pci_domain_nr(dev->bus) &&
> +                           bus == dev->bus->number &&
> +                           slot == PCI_SLOT(dev->devfn) &&
> +                           func == PCI_FUNC(dev->devfn)) {
> +                               /* It's a target device */
> +                               return 1;
> +                       }
> +               }
> +               if (!next_str)
> +                       break;
> +       }
> +
> +       return 0;
> +}
> diff -r b54652ee29ef drivers/pci/setup-bus.c
> --- a/drivers/pci/setup-bus.c   Thu Oct 02 11:29:02 2008 +0100
> +++ b/drivers/pci/setup-bus.c   Wed Oct 08 12:12:27 2008 +0900
> @@ -26,6 +26,7 @@
>  #include <linux/cache.h>
>  #include <linux/slab.h>
>
> +#include "pci.h"
>
>  #define DEBUG_CONFIG 1
>  #if DEBUG_CONFIG
> @@ -344,7 +345,8 @@
>
>        list_for_each_entry(dev, &bus->devices, bus_list) {
>                int i;
> -
> +               int reassign = reassign_resources ? is_reassigndev(dev) : 0;
> +
>                for (i = 0; i < PCI_NUM_RESOURCES; i++) {
>                        struct resource *r = &dev->resource[i];
>                        unsigned long r_size;
> @@ -352,6 +354,11 @@
>                        if (r->parent || (r->flags & mask) != type)
>                                continue;
>                        r_size = r->end - r->start + 1;
> +
> +                       if (reassign) {
> +                               r_size = ROUND_UP_TO_PAGESIZE(r_size);
> +                       }
> +
>                        /* For bridges size != alignment */
>                        align = (i < PCI_BRIDGE_RESOURCES) ? r_size : r->start;
>                        order = __ffs(align) - 20;
> diff -r b54652ee29ef drivers/pci/setup-res.c
> --- a/drivers/pci/setup-res.c   Thu Oct 02 11:29:02 2008 +0100
> +++ b/drivers/pci/setup-res.c   Wed Oct 08 12:12:27 2008 +0900
> @@ -117,19 +117,96 @@
>  }
>  EXPORT_SYMBOL_GPL(pci_claim_resource);
>
> +void
> +pci_update_bridge(struct pci_dev *dev, int resno)
> +{
> +       struct resource *res = &dev->resource[resno];
> +       struct pci_bus_region region;
> +       u32 l, dw, base_up32, limit_up32;
> +
> +       if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE ||
> +           (dev->class >> 8) != PCI_CLASS_BRIDGE_PCI) {
> +               return;
> +       }
> +
> +       if (!res->flags)
> +               return;
> +
> +       switch (resno) {
> +       case 8 :        /* MMIO Base/Limit */
> +               pcibios_resource_to_bus(dev, &region, res);
> +               if (res->flags & IORESOURCE_MEM &&
> +                   !(res->flags & IORESOURCE_PREFETCH)) {
> +                       l = (region.start >> 16) & 0xfff0;
> +                       l |= region.end & 0xfff00000;
> +               } else {
> +                       l = 0x0000fff0;
> +               }
> +               pci_write_config_dword(dev, PCI_MEMORY_BASE, l);
> +
> +               break;
> +
> +       case 9 :        /* Prefetchable MMIO Base/Limit */
> +               /* Clear out the upper 32 bits of PREF limit.
> +                * If PCI_PREF_BASE_UPPER32 was non-zero, this temporarily
> +                * disables PREF range, which is ok.
> +                */
> +               pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, 0);
> +
> +               /* Get PREF 32/64 bits Addressing mode */
> +               pci_read_config_dword(dev, PCI_PREF_MEMORY_BASE, &dw);
> +
> +               pcibios_resource_to_bus(dev, &region, res);
> +               if (res->flags & IORESOURCE_MEM &&
> +                   res->flags & IORESOURCE_PREFETCH) {
> +                       l = (region.start >> 16) & 0xfff0;
> +                       l |= region.end & 0xfff00000;
> +
> +                       if (dw & PCI_PREF_RANGE_TYPE_64) {
> +                               base_up32 = (region.start >> 32) & 0xffffffff;
> +                               limit_up32 = (region.end >> 32) & 0xffffffff;
> +                       } else {
> +                               base_up32 = 0;
> +                               limit_up32 = 0;
> +                       }
> +               } else {
> +                       l = 0x0000fff0;
> +                       base_up32 = 0xffffffff;
> +                       limit_up32 = 0;
> +               }
> +               pci_write_config_dword(dev, PCI_PREF_MEMORY_BASE, l);
> +               /* Set up the upper 32 bits of PREF base/limit. */
> +               pci_write_config_dword(dev, PCI_PREF_BASE_UPPER32, base_up32);
> +               pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, 
> limit_up32);
> +               break;
> +       default :
> +               BUG();
> +               break;
> +       }
> +}
> +
>  int pci_assign_resource(struct pci_dev *dev, int resno)
>  {
>        struct pci_bus *bus = dev->bus;
>        struct resource *res = dev->resource + resno;
>        resource_size_t size, min, align;
>        int ret;
> +       int reassigndev = reassign_resources ? is_reassigndev(dev) : 0;
>
>        size = res->end - res->start + 1;
>        min = (res->flags & IORESOURCE_IO) ? PCIBIOS_MIN_IO : PCIBIOS_MIN_MEM;
>        /* The bridge resources are special, as their
>           size != alignment. Sizing routines return
>           required alignment in the "start" field. */
> -       align = (resno < PCI_BRIDGE_RESOURCES) ? size : res->start;
> +       if (resno < PCI_BRIDGE_RESOURCES) {
> +               align = size;
> +               if ((reassigndev) &&
> +                   (res->flags & IORESOURCE_MEM)) {
> +                       align = ROUND_UP_TO_PAGESIZE(align);
> +               }
> +       } else {
> +               align = res->start;
> +       }
>
>        /* First, try exact prefetching match.. */
>        ret = pci_bus_alloc_resource(bus, res, size, align, min,
> @@ -154,6 +231,9 @@
>                        resno, (unsigned long long)size,
>                        (unsigned long long)res->start, pci_name(dev));
>        } else if (resno < PCI_BRIDGE_RESOURCES) {
> +               printk(KERN_DEBUG "PCI: Assign resource(%d) on %s "
> +                       "%016llx - %016llx\n", resno, pci_name(dev),
> +                       (u64)res->start, (u64)res->end);
>                pci_update_resource(dev, res, resno);
>        }
>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>



-- 
I would remember that if researchers were not ambitious
probably today we haven't the technology we are using!

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.