[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v9 06/16] vpci/header: implement guest BAR register handlers


  • To: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 20 Sep 2023 11:49:31 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=dDDowD9H8w+aIqHSiMv4jxBQmSjDbOAEQIF92F2dLNY=; b=itP+LejvDFxweDYy6FDUsl54CWRAFWmflvjUIHh+bRh3H52BfAGQJuc7kNDApsRC3B2C29+LypruhfjNy8tdBQPuh9dZ2b/GT4/Gn33H3DYvaPL2Hcuuv0+4pY5wHsT0PxnnK/ZiMyDt23vtms1d2V0/w9WskWgpr3Ey4EVCcDGFd5ksIj5RhEkGy5U5vJdj4eb1VSXV1CK+jplfKIsv/3RszT2DD9R7mEmdO08JX4hlgSz5dFIFxybJIDBbpkPdkyeXObwrnNzLQUqtXmJ4SbUK3kcXfGY8aJJg3a+hwuoYPML5ziLiQEXrkt+7VtHlu+0pxtvx0CkaO2m/t+bwvQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=oWF5ADe25Hmcm3zdcwmsPw14OXMV0075JBk7wcZMYWTXWtYVcqJQIhUNKwgzDTcYeRhMM/2GWoduq16tZkD0ANexmPTdPRGfOI1UNWFu3tMqdVHkU/bsfg/+74CvrLSAlZG0WvSu+kxT0bk7ee31rVpC8MPt4Ecz4sGOT6FDQ7eKw1bAt3smta1GED19r1DdBz96D6Qa0hWbyirTfRzgnZ5MLZ3dCV/SygCY0OJejetMJbz+Jd/bG712vi/GIzHIsMQjHIFwUTRYKUlpEmWN11rVPzGrgxEW5Z+PzMmc37ZTCO57fxSBpRMyhoOoAEhuHqEWkCVGj37IWXdzRkrzgA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Stewart Hildebrand <stewart.hildebrand@xxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • Delivery-date: Wed, 20 Sep 2023 09:49:59 +0000
  • Ironport-data: A9a23:Tu2pZ6tI6hewDKEwus2R4KFxg+fnVI1fMUV32f8akzHdYApBsoF/q tZmKT3UPf7YamLwfthzbIXgoEIFv8fTm9BrTAA9rH83FiIb+JbJXdiXEBz9bniYRiHhoOCLz O1FM4Wdc5pkJpP4jk3wWlQ0hSAkjclkfpKlVaicfHg3HFc4IMsYoUoLs/YjhYJ1isSODQqIu Nfjy+XSI1bg0DNvWo4uw/vrRChH4rKq4lv0gnRkPaoQ5A6EzyFMZH4iDfrZw0XQE9E88tGSH 44v/JnhlkvF8hEkDM+Sk7qTWiXmlZaLYGBiIlIPM0STqkAqSh4ai87XB9JFAatjsB2bnsgZ9 Tl4ncfYpTHFnEH7sL91vxFwS0mSNEDdkVPNCSDXXce7lyUqf5ZwqhnH4Y5f0YAwo45K7W9yG fMwdhARUw2It+ePxb+/RcNsvt4OIuvGBdZK0p1g5Wmx4fcOZ7nmGv+PyfoGmTA6i4ZJAOrUY NcfZXx3dhPcbhZTO1ARTpUjgOOvgXq5eDpdwL6XjfNvvy6Pk0osgf60b4K9lt+iHK25mm6Co W3L5SLhCwwyP92D0zuVtHmrg4cjmAuiAttCSObpqqACbFu7ynxIESc4d3SBucKCsWG4XvNSN nIYw397xUQ13AnxJjXnZDWGp3qDsg8ZSsBnOeQw4wGQyYLZ+w+cQGMDS1ZpeNEg8cM7WzEu/ luIhM/yQyxitqWPTnCQ/avSqim9UQAOMWIdbDUYCwsE59Xuqps6iB7nR9NvVqWyi7XdPjX9w CuDqiQksJwVgdQWzKWw/V3BgDWEq4DAS0g+4QC/Y46+xgZwZYrga4n271HetKxENNzAEQHHu 2UYkc+D6uxIFYuKiCGGXOQKGveu+uqBNzrfx1VoGvHN6giQxpJqRqgIiBkWGaujGp9slePBC KMLhT5s2Q==
  • Ironport-hdrordr: A9a23:0dXVVqpSgVAKEvz08jk60tYaV5tlLNV00zEX/kB9WHVpm5Oj+f xGzc516farslossSkb6K290KnpewK4yXcH2/hsAV7CZnithILMFutfBOTZskTd8kHFh41gPO JbAtJD4b7LfBVHZKTBkXGF+r8bqbHtncHJuQ6d9QYXcegAUdAF0+4NMHf8LqQAfnggOXNWLu v/2uN34x6bPVgHZMWyAXcIG8LZocfQqZ7gaRkaQzY69Qinl1qTmfHHOind+i1bfyJEwL8k/2 SAuRf+/L+fv/ayzQKZ/3PP7q5RhMDqxrJ4dYKxY4kuW3TRYzSTFcdcso65zXIISSaUmRMXee z30lcd1gJImjfsly+O0FzQMkLboUkTAjfZuCGlaD3Y0IDEbQN/MtFGg41BdBvf9g4PgPFQuZ g7hl6xht5vFhXHkz3659/UEzdQtmTxj0YDvIco/jpiua13Us4LkWXaxjIMLL4QWC3984wpC+ 9oEYXV4+tXa0qTazTDsnBo28HEZAV7Iv4oeDlxhiW56UkgoJlC9Tpv+OUP2nMbsJ4tQZhN4O rJdqxuibFVV8cTKaZwHv0IT8e7AnHEBUukChPYHX33UKUcf37doZ/+57s4oOmsZZwT1ZM33J DMSklRu2I+c1/nTceOwJpI+BbQR3jVZ0Wk9uhOo5xi/rHsTrviNiOODFgojsu7uv0aRtbWXv 6iUagmd8ML7VGebLqh8zeOKKW6c0NuIfH9kuxLK26zng==
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Aug 29, 2023 at 11:19:43PM +0000, Volodymyr Babchuk wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> 
> Add relevant vpci register handlers when assigning PCI device to a domain
> and remove those when de-assigning. This allows having different
> handlers for different domains, e.g. hwdom and other guests.
> 
> Emulate guest BAR register values: this allows creating a guest view
> of the registers and emulates size and properties probe as it is done
> during PCI device enumeration by the guest.
> 
> All empty, IO and ROM BARs for guests are emulated by returning 0 on
> reads and ignoring writes: this BARs are special with this respect as
> their lower bits have special meaning, so returning default ~0 on read
> may confuse guest OS.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> ---
> Since v9:
> - factored-out "fail" label introduction in init_bars()
> - replaced #ifdef CONFIG_X86 with IS_ENABLED()
> - do not pass bars[i] to empty_bar_read() handler
> - store guest's BAR address instead of guests BAR register view
> Since v6:
> - unify the writing of the PCI_COMMAND register on the
>   error path into a label
> - do not introduce bar_ignore_access helper and open code
> - s/guest_bar_ignore_read/empty_bar_read
> - update error message in guest_bar_write
> - only setup empty_bar_read for IO if !x86
> Since v5:
> - make sure that the guest set address has the same page offset
>   as the physical address on the host
> - remove guest_rom_{read|write} as those just implement the default
>   behaviour of the registers not being handled
> - adjusted comment for struct vpci.addr field
> - add guest handlers for BARs which are not handled and will otherwise
>   return ~0 on read and ignore writes. The BARs are special with this
>   respect as their lower bits have special meaning, so returning ~0
>   doesn't seem to be right
> Since v4:
> - updated commit message
> - s/guest_addr/guest_reg
> Since v3:
> - squashed two patches: dynamic add/remove handlers and guest BAR
>   handler implementation
> - fix guest BAR read of the high part of a 64bit BAR (Roger)
> - add error handling to vpci_assign_device
> - s/dom%pd/%pd
> - blank line before return
> Since v2:
> - remove unneeded ifdefs for CONFIG_HAS_VPCI_GUEST_SUPPORT as more code
>   has been eliminated from being built on x86
> Since v1:
>  - constify struct pci_dev where possible
>  - do not open code is_system_domain()
>  - simplify some code3. simplify
>  - use gdprintk + error code instead of gprintk
>  - gate vpci_bar_{add|remove}_handlers with CONFIG_HAS_VPCI_GUEST_SUPPORT,
>    so these do not get compiled for x86
>  - removed unneeded is_system_domain check
>  - re-work guest read/write to be much simpler and do more work on write
>    than read which is expected to be called more frequently
>  - removed one too obvious comment
> ---
>  xen/drivers/vpci/header.c | 131 +++++++++++++++++++++++++++++++++-----
>  xen/include/xen/vpci.h    |   3 +
>  2 files changed, 118 insertions(+), 16 deletions(-)
> 
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index e58bbdf68d..e96d7b2b37 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -477,6 +477,72 @@ static void cf_check bar_write(
>      pci_conf_write32(pdev->sbdf, reg, val);
>  }
>  
> +static void cf_check guest_bar_write(const struct pci_dev *pdev,
> +                                     unsigned int reg, uint32_t val, void 
> *data)
> +{
> +    struct vpci_bar *bar = data;
> +    bool hi = false;
> +    uint64_t guest_addr = bar->guest_addr;
> +
> +    if ( bar->type == VPCI_BAR_MEM64_HI )
> +    {
> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
> +        bar--;
> +        hi = true;
> +    }
> +    else
> +    {
> +        val &= PCI_BASE_ADDRESS_MEM_MASK;
> +    }
> +
> +    guest_addr &= ~(0xffffffffull << (hi ? 32 : 0));
> +    guest_addr |= (uint64_t)val << (hi ? 32 : 0);
> +
> +    guest_addr &= ~(bar->size - 1) | ~PCI_BASE_ADDRESS_MEM_MASK;

I don't think you need to mask out PCI_BASE_ADDRESS_MEM_MASK here, as
you already do it if bar->type != VPCI_BAR_MEM64_HI for val.

> +
> +    /*
> +     * Make sure that the guest set address has the same page offset
> +     * as the physical address on the host or otherwise things won't work as
> +     * expected.
> +     */
> +    if ( (guest_addr & (~PAGE_MASK)) != (bar->addr & ~PAGE_MASK) )

PAGE_OFFSET() would be easier to read.

> +    {
> +        gprintk(XENLOG_WARNING,
> +                "%pp: ignored BAR %zu write attempting to change page 
> offset\n",
> +                &pdev->sbdf, bar - pdev->vpci->header.bars + hi);
> +        return;
> +    }
> +
> +    bar->guest_addr = guest_addr;
> +}
> +
> +static uint32_t cf_check guest_bar_read(const struct pci_dev *pdev,
> +                                        unsigned int reg, void *data)
> +{
> +    const struct vpci_bar *bar = data;
> +    uint32_t reg_val;
> +
> +    if ( bar->type == VPCI_BAR_MEM64_HI )
> +    {
> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
> +        bar--;
> +        return bar->guest_addr >> 32;
> +    }
> +
> +    reg_val = bar->guest_addr;
> +    reg_val |= bar->type == VPCI_BAR_MEM32 ? PCI_BASE_ADDRESS_MEM_TYPE_32 :
> +                                             PCI_BASE_ADDRESS_MEM_TYPE_64;
> +    reg_val |= bar->prefetchable ? PCI_BASE_ADDRESS_MEM_PREFETCH : 0;
> +
> +    return reg_val;
> +}
> +
> +static uint32_t cf_check empty_bar_read(const struct pci_dev *pdev,
> +                                        unsigned int reg, void *data)
> +{
> +    return 0;
> +}

If we are going to gain a lot of helpers that return a fixed value it
might be worthwhile to introduce a helper that returns what gets
passed as 'data'.  Let's leave it as you propose for now.

> +
>  static void cf_check rom_write(
>      const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
>  {
> @@ -537,6 +603,7 @@ static int cf_check init_bars(struct pci_dev *pdev)
>      struct vpci_header *header = &pdev->vpci->header;
>      struct vpci_bar *bars = header->bars;
>      int rc;
> +    bool is_hwdom = is_hardware_domain(pdev->domain);
>  
>      ASSERT(rw_is_locked(&pdev->domain->pci_lock));
>  
> @@ -578,8 +645,10 @@ static int cf_check init_bars(struct pci_dev *pdev)
>          if ( i && bars[i - 1].type == VPCI_BAR_MEM64_LO )
>          {
>              bars[i].type = VPCI_BAR_MEM64_HI;
> -            rc = vpci_add_register(pdev->vpci, vpci_hw_read32, bar_write, 
> reg,
> -                                   4, &bars[i]);
> +            rc = vpci_add_register(pdev->vpci,
> +                                   is_hwdom ? vpci_hw_read32 : 
> guest_bar_read,
> +                                   is_hwdom ? bar_write : guest_bar_write,
> +                                   reg, 4, &bars[i]);
>              if ( rc )
>                  goto fail;
>              continue;
> @@ -589,6 +658,15 @@ static int cf_check init_bars(struct pci_dev *pdev)
>          if ( (val & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO )
>          {
>              bars[i].type = VPCI_BAR_IO;
> +
> +            if ( !IS_ENABLED(CONFIG_X86) && !is_hwdom )
> +            {
> +                rc = vpci_add_register(pdev->vpci, empty_bar_read, NULL,
> +                                       reg, 4, NULL);
> +                if ( rc )
> +                    goto fail;

For consistency you should also set bars[i].type = VPCI_BAR_EMPTY
here.

> +            }
> +
>              continue;
>          }
>          if ( (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
> @@ -605,6 +683,15 @@ static int cf_check init_bars(struct pci_dev *pdev)
>          if ( size == 0 )
>          {
>              bars[i].type = VPCI_BAR_EMPTY;
> +
> +            if ( !is_hwdom )
> +            {
> +                rc = vpci_add_register(pdev->vpci, empty_bar_read, NULL,
> +                                       reg, 4, NULL);
> +                if ( rc )
> +                    goto fail;
> +            }
> +
>              continue;
>          }
>  
> @@ -612,28 +699,40 @@ static int cf_check init_bars(struct pci_dev *pdev)
>          bars[i].size = size;
>          bars[i].prefetchable = val & PCI_BASE_ADDRESS_MEM_PREFETCH;
>  
> -        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, bar_write, reg, 4,
> -                               &bars[i]);
> +        rc = vpci_add_register(pdev->vpci,
> +                               is_hwdom ? vpci_hw_read32 : guest_bar_read,
> +                               is_hwdom ? bar_write : guest_bar_write,
> +                               reg, 4, &bars[i]);
>          if ( rc )
>              goto fail;
>      }
>  
> -    /* Check expansion ROM. */
> -    rc = pci_size_mem_bar(pdev->sbdf, rom_reg, &addr, &size, PCI_BAR_ROM);
> -    if ( rc > 0 && size )
> +    /* TODO: Check expansion ROM, we do not handle ROM for guests for now. */
> +    if ( is_hwdom )
>      {
> -        struct vpci_bar *rom = &header->bars[num_bars];
> +        rc = pci_size_mem_bar(pdev->sbdf, rom_reg, &addr, &size, 
> PCI_BAR_ROM);
> +        if ( rc > 0 && size )
> +        {
> +            struct vpci_bar *rom = &header->bars[num_bars];
>  
> -        rom->type = VPCI_BAR_ROM;
> -        rom->size = size;
> -        rom->addr = addr;
> -        header->rom_enabled = pci_conf_read32(pdev->sbdf, rom_reg) &
> -                              PCI_ROM_ADDRESS_ENABLE;
> +            rom->type = VPCI_BAR_ROM;
> +            rom->size = size;
> +            rom->addr = addr;
> +            header->rom_enabled = pci_conf_read32(pdev->sbdf, rom_reg) &
> +                                  PCI_ROM_ADDRESS_ENABLE;
>  
> -        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rom_write, 
> rom_reg,
> -                               4, rom);
> +            rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rom_write,
> +                                   rom_reg, 4, rom);
> +            if ( rc )
> +                rom->type = VPCI_BAR_EMPTY;
> +        }
> +    }
> +    else
> +    {
> +        rc = vpci_add_register(pdev->vpci, empty_bar_read, NULL,
> +                               rom_reg, 4, NULL);

You should set the BAR type to VPCI_BAR_EMPTY here.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.