
Re: [PATCH v9 12/16] vpci: add initial support for virtual PCI bus topology


  • To: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 21 Sep 2023 18:03:04 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Stewart Hildebrand <stewart.hildebrand@xxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Thu, 21 Sep 2023 16:03:50 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Aug 29, 2023 at 11:19:46PM +0000, Volodymyr Babchuk wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> 
> Assign SBDF to the PCI devices being passed through with bus 0.
> The resulting topology is where PCIe devices reside on the bus 0 of the
> root complex itself (embedded endpoints).
> This implementation is limited to 32 devices which are allowed on
> a single PCI bus.
> 
> Please note, that at the moment only function 0 of a multifunction
> device can be passed through.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> ---
> Since v9:
> - Lock in add_virtual_device() replaced with ASSERT (thanks, Stewart)
> Since v8:
> - Added write lock in add_virtual_device
> Since v6:
> - re-work wrt new locking scheme
> - OT: add ASSERT(pcidevs_write_locked()); to add_virtual_device()
> Since v5:
> - s/vpci_add_virtual_device/add_virtual_device and make it static
> - call add_virtual_device from vpci_assign_device and do not use
>   REGISTER_VPCI_INIT machinery
> - add pcidevs_locked ASSERT
> - use DECLARE_BITMAP for vpci_dev_assigned_map
> Since v4:
> - moved and re-worked guest sbdf initializers
> - s/set_bit/__set_bit
> - s/clear_bit/__clear_bit
> - minor comment fix s/Virtual/Guest/
> - added VPCI_MAX_VIRT_DEV constant (PCI_SLOT(~0) + 1) which will be used
>   later for counting the number of MMIO handlers required for a guest
>   (Julien)
> Since v3:
>  - make use of VPCI_INIT
>  - moved all new code to vpci.c which belongs to it
>  - changed open-coded 31 to PCI_SLOT(~0)
>  - added comments and code to reject multifunction devices with
>    functions other than 0
>  - updated comment about vpci_dev_next and made it unsigned int
>  - implement roll back in case of error while assigning/deassigning devices
>  - s/dom%pd/%pd
> Since v2:
>  - remove casts that are (a) malformed and (b) unnecessary
>  - add new line for better readability
>  - remove CONFIG_HAS_VPCI_GUEST_SUPPORT ifdef's as the relevant vPCI
>     functions are now completely gated with this config
>  - gate common code with CONFIG_HAS_VPCI_GUEST_SUPPORT
> New in v2
> ---
>  xen/drivers/vpci/vpci.c | 69 +++++++++++++++++++++++++++++++++++++++++
>  xen/include/xen/sched.h |  8 +++++
>  xen/include/xen/vpci.h  | 11 +++++++
>  3 files changed, 88 insertions(+)
> 
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 412685f41d..b284f95e05 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -36,6 +36,54 @@ extern vpci_register_init_t *const __start_vpci_array[];
>  extern vpci_register_init_t *const __end_vpci_array[];
>  #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
>  
> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> +static int add_virtual_device(struct pci_dev *pdev)
> +{
> +    struct domain *d = pdev->domain;
> +    pci_sbdf_t sbdf = { 0 };
> +    unsigned long new_dev_number;
> +
> +    if ( is_hardware_domain(d) )
> +        return 0;
> +
> +    ASSERT(pcidevs_locked() && rw_is_write_locked(&pdev->domain->pci_lock));


Do you need to check for the pcidevs lock here?  I would think
d->pci_lock would be enough to protect the virtual device allocation
bitmap.

> +
> +    /*
> +     * Each PCI bus supports 32 devices/slots at max or up to 256 when
> +     * there are multi-function ones which are not yet supported.
> +     */
> +    if ( pdev->info.is_extfn )

I think you are missing a !pdev->info.is_virtfn check, as is_extfn &&
is_virtfn means the PF is an extended function, not the VF we are
trying to pass through.
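
A minimal sketch of the condition suggested above, written as a standalone
helper. The struct and function names are hypothetical stand-ins for
illustration only; in the patch the check would operate directly on
pdev->info.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two relevant fields of pdev->info. */
struct pdev_info_sketch {
    bool is_extfn;   /* extended function flag */
    bool is_virtfn;  /* device is an SR-IOV virtual function */
};

/*
 * Return true when the device itself is an extended function and should be
 * rejected.  For a VF, is_extfn refers to the parent PF, so it is ignored.
 */
static bool reject_extended_function(const struct pdev_info_sketch *info)
{
    return info->is_extfn && !info->is_virtfn;
}
```

With this shape, a VF whose PF is an extended function is not rejected,
while a non-VF extended function still is.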

> +    {
> +        gdprintk(XENLOG_ERR, "%pp: only function 0 passthrough supported\n",
> +                 &pdev->sbdf);
> +        return -EOPNOTSUPP;
> +    }
> +    new_dev_number = find_first_zero_bit(d->vpci_dev_assigned_map,
> +                                         VPCI_MAX_VIRT_DEV);
> +    if ( new_dev_number >= VPCI_MAX_VIRT_DEV )

The > in >= is not required, == would do, as find_first_zero_bit()
will return VPCI_MAX_VIRT_DEV if the bitmap is all set.
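
To illustrate the return value convention, here is a simplified userspace
stand-in for find_first_zero_bit().  It is not the Xen implementation, and
the names are hypothetical; it only shows that a full bitmap yields exactly
the size passed in, never a larger value.

```c
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first clear bit among the low 'size' bits of the
 * bitmap, or 'size' itself when all of those bits are set.
 */
static unsigned int first_zero_bit_sketch(const unsigned long *map,
                                          unsigned int size)
{
    unsigned int i;

    for ( i = 0; i < size; i++ )
        if ( !(map[i / SKETCH_BITS_PER_LONG] &
               (1UL << (i % SKETCH_BITS_PER_LONG))) )
            return i;

    return size;
}
```

Under that convention the exhaustion check can compare for equality with
the bitmap size.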

> +    {
> +        write_unlock(&pdev->domain->pci_lock);
> +        return -ENOSPC;
> +    }
> +
> +    __set_bit(new_dev_number, &d->vpci_dev_assigned_map);
> +
> +    /*
> +     * Both segment and bus number are 0:
> +     *  - we emulate a single host bridge for the guest, e.g. segment 0
> +     *  - with bus 0 the virtual devices are seen as embedded
> +     *    endpoints behind the root complex
> +     *
> +     * TODO: add support for multi-function devices.
> +     */
> +    sbdf.devfn = PCI_DEVFN(new_dev_number, 0);
> +    pdev->vpci->guest_sbdf = sbdf;

You could avoid the local sbdf variable and just use PCI_SBDF(0, 0,
new_dev_number, 0);
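
A rough sketch of what the single-expression form computes, using the
conventional SBDF bit layout (segment in bits 31:16, bus in 15:8, device in
7:3, function in 2:0).  The SKETCH_* macros and the helper name are
stand-ins for illustration; the real PCI_SBDF() and pci_sbdf_t come from
Xen's headers and are not reproduced here.

```c
#include <stdint.h>

/* Stand-ins following the conventional PCI SBDF bit layout. */
#define SKETCH_DEVFN(dev, fn)  ((((dev) & 0x1f) << 3) | ((fn) & 0x07))
#define SKETCH_SBDF(seg, bus, dev, fn)              \
    ((((uint32_t)(seg) & 0xffff) << 16) |           \
     (((uint32_t)(bus) & 0xff) << 8) |              \
     (uint32_t)SKETCH_DEVFN(dev, fn))

/* Segment 0, bus 0, function 0: only the slot number varies. */
static uint32_t guest_sbdf_for_slot(unsigned int slot)
{
    return SKETCH_SBDF(0, 0, slot, 0);
}
```

Since segment, bus and function are all zero, the result is just the slot
number shifted into the device field, so no zero-initialised local is
needed.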

> +
> +    return 0;
> +}
> +
> +#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
> +
>  void vpci_deassign_device(struct pci_dev *pdev)
>  {
>      unsigned int i;
> @@ -46,6 +94,16 @@ void vpci_deassign_device(struct pci_dev *pdev)
>          return;
>  
>      spin_lock(&pdev->vpci->lock);
> +
> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> +    if ( pdev->vpci->guest_sbdf.sbdf != ~0 )
> +    {
> +        __clear_bit(pdev->vpci->guest_sbdf.dev,
> +                    &pdev->domain->vpci_dev_assigned_map);
> +        pdev->vpci->guest_sbdf.sbdf = ~0;
> +    }
> +#endif

There's no need to set sbdf = ~0 as vpci is just about to be freed.

> +
>      while ( !list_empty(&pdev->vpci->handlers) )
>      {
>          struct vpci_register *r = list_first_entry(&pdev->vpci->handlers,
> @@ -101,6 +159,13 @@ int vpci_assign_device(struct pci_dev *pdev)
>      INIT_LIST_HEAD(&pdev->vpci->handlers);
>      spin_lock_init(&pdev->vpci->lock);
>  
> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> +    pdev->vpci->guest_sbdf.sbdf = ~0;
> +    rc = add_virtual_device(pdev);
> +    if ( rc )
> +        goto out;
> +#endif
> +
>      for ( i = 0; i < NUM_VPCI_INIT; i++ )
>      {
>          rc = __start_vpci_array[i](pdev);
> @@ -108,11 +173,15 @@ int vpci_assign_device(struct pci_dev *pdev)
>              break;
>      }
>  
> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> + out:
> +#endif

That's ugly.  Can you use the __maybe_unused attribute on the label
instead?
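
A sketch of the label form, assuming a GCC-compatible compiler where
__maybe_unused expands to __attribute__((__unused__)).  The function and
its parameters are hypothetical stand-ins for vpci_assign_device(),
add_virtual_device() and the init loop.

```c
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

/*
 * guest_rc stands in for the add_virtual_device() result, init_rc for the
 * result of the init loop, and *cleaned_up records that the error path ran.
 */
static int assign_sketch(int guest_rc, int init_rc, int *cleaned_up)
{
    int rc;

#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
    rc = guest_rc;
    if ( rc )
        goto out;
#else
    (void)guest_rc;
#endif

    rc = init_rc;

    /* The attribute silences the unused-label warning when the goto is
     * compiled out, so the label needs no #ifdef of its own. */
 out: __maybe_unused;
    if ( rc )
        *cleaned_up = 1;

    return rc;
}
```

Only the goto stays inside the #ifdef; the label is compiled
unconditionally.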

>      if ( rc )
>          vpci_deassign_device(pdev);
>  
>      return rc;
>  }
> +

Spurious change.

Thanks, Roger.



 

