[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 06/13] vpci/header: implement guest BAR register handlers


  • To: Oleksandr Andrushchenko <andr2000@xxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 8 Feb 2022 10:25:50 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=none; dmarc=none; dkim=none; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=I1uydbG08brsvWdTFBdPcdk9rw5bGcVGZh9iSwCT3n8=; b=LxhUBEMEBxivQHRqJXvaiYrmCNTQMX0g6cJGolxu/QqVr2/RXsqm6ve1WCelZwufm5WUYQdib/nypHD+KvCXji/48AOpHxhpRqr129kv65qNhGhHJxEurFDOwRFYws/CjmlOw1igNYUwaUwS32yZaInyYY9oSVrsOVpcWotBW3UHTU5thMmptX/82uplhlm/3p8uyUet+thoZlNUTtZnJ7TPGdGubvz4VVs+eMILvPLjSIC6VxIHMt33o/t73iHQRtokb1OiMz8l1Ga7qzID35hNiNn4VbKMHvWX1vTGFiCwZa6VDMv2U3U47YSIINzC95NwBjNDgBcwbD3lKjHtZQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=VtwGoMKlLchC9swiGRV48AfCvSbSKvxFrW7kxONUYeMzKdyRoYoJgdsJVMCOYhM7tDkKOk65GA8DXyBTzrTLGKW75dHU0kdBaAD4j0rXP3HkdM7lWGLZAXMIHx/NMivjHiF7EScXVkBNlbRX+sx5DStdDdgDHQx+aXB6a4KOWg+0TlAmIS4RdEwsVzLlH8UTXto58KfiLDbFSjkKLv8Z8nCAkTT4pu67Pd1aVEnhT42CyjV7L70WGkbbIxM2+KflsuTPuSM1jPkpXJPcctXGsv3KB9x9mcmv1WCOgMDSyp3ArvV4yUHq4JmcfQz2Gkyh0HZ7kCVanQ7L8h1/zwZJAw==
  • Authentication-results: esa2.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>, <julien@xxxxxxx>, <sstabellini@xxxxxxxxxx>, <oleksandr_tyshchenko@xxxxxxxx>, <volodymyr_babchuk@xxxxxxxx>, <artem_mygaiev@xxxxxxxx>, <jbeulich@xxxxxxxx>, <andrew.cooper3@xxxxxxxxxx>, <george.dunlap@xxxxxxxxxx>, <paul@xxxxxxx>, <bertrand.marquis@xxxxxxx>, <rahul.singh@xxxxxxx>, Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
  • Delivery-date: Tue, 08 Feb 2022 09:26:10 +0000
  • Ironport-data: A9a23:5RQB460y969YI9WTOvbD5e93kn2cJEfYwER7XKvMYLTBsI5bpzEEz mtLCj/XOfncNjShfdAkPoSyp0MPvJTSz9A1SwNkpC1hF35El5HIVI+TRqvS04J+DSFhoGZPt Zh2hgzodZhsJpPkS5PE3oHJ9RGQ74nRLlbHILOCanAZqTNMEn9700o5w75h2+aEvPDia++zk YKqyyHgEAfNNw5cagr4PIra9XuDFNyr0N8plgRWicJj5TcypFFMZH4rHomjLmOQf2VhNrXSq 9Avbl2O1jixEx8FUrtJm1tgG6EAaua60QOm0hK6V0U+6/TrS+NbPqsTbZIhhUlrZzqhoo53l e1PtpqMWV1zEfPhltoRQxdjDHQrVUFG0OevzXmXtMWSywvNcmf2wuUoB0YzVWEa0r8pWycUr 6VecW1TKEDY7w616OvTpu1EnMMsIdOtJIoCknph0SvYHbAtRpWrr6Diu4QHhWls3pwm8fD2W NsiWzY1QBP6chxma10cAaAVs+avvyyqG9FfgA3M/vdmi4TJ9yRp2aXpGMrYfJqNX8o9tkSSq 3/C/m/5KgoHL9HZwj2AmlqlnPPCmBT+SY0bFbCm3vNyiVjVzWsWYDURUVa4uvC/hlSJUtRTM VEP+iEuoK4x82SmVtD4GRa/pRaspBccRt4WCOw85wGlw7DRpQ2eAwAsRDNbdMYvssNwQDUwz 0KIhPvgHzkpu7qQIVqf67OVoDWaKSUTa2gYakcsXQYDptXuvow3phbOVcp4Vr64iMXvHjP9y CzMqzIx750ai8IRjf3jpXjIhjutot7CSQtdzhvQWmWp/wZofrmvboaj6UXYxftYJYPfRV6E1 FADn8Wd9+kIAYu6iD2WQO4NEbeq4N6IKDTZx1VoGvEJ7C+x8nSueYRR5jBWJ0pzNMsAPzjzb yfuVRh5vcEJeiHwNOkuPtz3W59CIbXc+crNa67SM+JMYZ5NZFXd039Da3Ccz27Wjx15+U0gA quzfcGpBHccLK1oyjuqWusQuYMWKjACKXD7Hs6ikUn+uVaKTDvMEOpebgPSBgwsxP7c+G3oH 8Bj29xmIvm1eMn3eWHp/IEaNjjmxlBrVMmt+6S7mgNuSzeK+V3N6deMm9vNmKQ/xsy5c9skG VnnCydlJKLX3yGvFOlzQikLhEnTdZh+t2knGicnIEyl3XMuCa72svtDLspuJOd2rrU4pRKRc xXiU5/Rasmjtxydo2hNBXUDhNAKmOuXafKmYHP+PWlXk29ITA3V4N70FjYDBwFVZhdbQfAW+ uX6viuCGMJrb107UK7+NaL+p3vs7CN1sL8jACPgfIIJEG2yq9cCFsAEpqJuSy36AU6YnWXyO sf/KUpwmNQhVKdvqIaQ3v3f/97yewa8d2IDd1TmAX+NHXCy1kKowJNaUfbOejbYVWju/76la /kTxPb5WMDrVn4T22alO7o0n684+fX1oLpWklZtEHnRNgz5AbJ8OHiWm8JIs/QVlLNevAK3X GOJ+8VbZurVaJ+0TgZJKVp3dPmH2NEVhiLWsaY/LnLl6XIl57GAS0hTYUWB0XQPMLtvPYo56 u49o8pKuRengx8nP4/e3CBZ/miBNFIaVKAjus1ICYPnkFNzmFpDfYbdGmn955TWM4dANUwjI zm1gqvehusDmhqeIiRrTXWUhLhTn5UDvhxO3WQuHVXRl4qXnOIz0T1Q7S8zElZfwCJY3r8hI WNsLUB0e/mDpm86mMhZUmmwMAhdHxnFqFfpwl4EmWCFHUmlUmvBcD80NeqXpR1L9mtden5Q/ a2CyXajWjHvJZmj0iw3UE9jivriUd0uqVGSxJH5R5yIT8sgfD7ooq6yfm5Z+RLoDPQ4iFDDu eQ3rv17brf2NHJIrqA2Y2VAOW/8lPxQyLR+fMxc
  • Ironport-hdrordr: A9a23:ueDOzKOF7aEm2MBcTsWjsMiBIKoaSvp037BN7TEXdfU1SL39qy nKpp8mPHDP5Ar5NEtOpTniAsm9qBHnm6KdiLN5Vd3OYOCMggqVBbAnwYz+wyDxXw3Sn9QtsJ uIqpIOa+EY22IK7/rH3A==
  • Ironport-sdr: duqTkEAWY6JC39m+fQ2zp6vkPdZA/0V78He+7wYhiHchIX5vbuh7ptFjeq/d+hx15D+opqqphI 2Fa5J/H4wwpywvUDrMV0q+7WhZgEb6A9CeAeNqKACbZsDKvJ2oD5lgzsMEKU4QC4GfXGzoQZ93 lCm/B4ILgHGoOsLlobVEMySUaByBilCOsn0+UmtywLo1I+TvrlcFu6WKECMsK5bejThQ3ccGdk Qgzp2LZNsfQb9iDFhmgWSEEZf9NvQ00QjbvdP0SBNCGW97/vE3opDIL/pLKqNvZC/esP5r3yLr f4oM2y1LR62V3A0qmOrBAYPD
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Feb 04, 2022 at 08:34:52AM +0200, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> 
> Add relevant vpci register handlers when assigning PCI device to a domain
> and remove those when de-assigning. This allows having different
> handlers for different domains, e.g. hwdom and other guests.
> 
> Emulate guest BAR register values: this allows creating a guest view
> of the registers and emulates size and properties probe as it is done
> during PCI device enumeration by the guest.
> 
> All empty, IO and ROM BARs for guests are emulated by returning 0 on
> reads and ignoring writes: this BARs are special with this respect as
> their lower bits have special meaning, so returning default ~0 on read
> may confuse guest OS.
> 
> Memory decoding is initially disabled when used by guests in order to
> prevent the BAR being placed on top of a RAM region.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> ---
> Since v5:
> - make sure that the guest set address has the same page offset
>   as the physical address on the host
> - remove guest_rom_{read|write} as those just implement the default
>   behaviour of the registers not being handled
> - adjusted comment for struct vpci.addr field
> - add guest handlers for BARs which are not handled and will otherwise
>   return ~0 on read and ignore writes. The BARs are special with this
>   respect as their lower bits have special meaning, so returning ~0
>   doesn't seem to be right
> Since v4:
> - updated commit message
> - s/guest_addr/guest_reg
> Since v3:
> - squashed two patches: dynamic add/remove handlers and guest BAR
>   handler implementation
> - fix guest BAR read of the high part of a 64bit BAR (Roger)
> - add error handling to vpci_assign_device
> - s/dom%pd/%pd
> - blank line before return
> Since v2:
> - remove unneeded ifdefs for CONFIG_HAS_VPCI_GUEST_SUPPORT as more code
>   has been eliminated from being built on x86
> Since v1:
>  - constify struct pci_dev where possible
>  - do not open code is_system_domain()
>  - simplify some code3. simplify
>  - use gdprintk + error code instead of gprintk
>  - gate vpci_bar_{add|remove}_handlers with CONFIG_HAS_VPCI_GUEST_SUPPORT,
>    so these do not get compiled for x86
>  - removed unneeded is_system_domain check
>  - re-work guest read/write to be much simpler and do more work on write
>    than read which is expected to be called more frequently
>  - removed one too obvious comment
> ---
>  xen/drivers/vpci/header.c | 131 +++++++++++++++++++++++++++++++++-----
>  xen/include/xen/vpci.h    |   3 +
>  2 files changed, 118 insertions(+), 16 deletions(-)
> 
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index bd23c0274d48..2620a95ff35b 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -406,6 +406,81 @@ static void bar_write(const struct pci_dev *pdev, 
> unsigned int reg,
>      pci_conf_write32(pdev->sbdf, reg, val);
>  }
>  
> +static void guest_bar_write(const struct pci_dev *pdev, unsigned int reg,
> +                            uint32_t val, void *data)
> +{
> +    struct vpci_bar *bar = data;
> +    bool hi = false;
> +    uint64_t guest_reg = bar->guest_reg;
> +
> +    if ( bar->type == VPCI_BAR_MEM64_HI )
> +    {
> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
> +        bar--;
> +        hi = true;
> +    }
> +    else
> +    {
> +        val &= PCI_BASE_ADDRESS_MEM_MASK;
> +        val |= bar->type == VPCI_BAR_MEM32 ? PCI_BASE_ADDRESS_MEM_TYPE_32
> +                                           : PCI_BASE_ADDRESS_MEM_TYPE_64;
> +        val |= bar->prefetchable ? PCI_BASE_ADDRESS_MEM_PREFETCH : 0;
> +    }
> +
> +    guest_reg &= ~(0xffffffffull << (hi ? 32 : 0));
> +    guest_reg |= (uint64_t)val << (hi ? 32 : 0);
> +
> +    guest_reg &= ~(bar->size - 1) | ~PCI_BASE_ADDRESS_MEM_MASK;
> +
> +    /*
> +     * Make sure that the guest set address has the same page offset
> +     * as the physical address on the host or otherwise things won't work as
> +     * expected.
> +     */
> +    if ( (guest_reg & (~PAGE_MASK & PCI_BASE_ADDRESS_MEM_MASK)) !=
> +         (bar->addr & ~PAGE_MASK) )

This is only required when !hi, but I'm fine with doing it
unconditionally as it's clearer.

> +    {
> +        gprintk(XENLOG_WARNING,
> +                "%pp: ignored BAR %zu write with wrong page offset\n",

"%pp: ignored BAR %zu write attempting to change page offset\n"

> +                &pdev->sbdf, bar - pdev->vpci->header.bars + hi);
> +        return;
> +    }
> +
> +    bar->guest_reg = guest_reg;
> +}
> +
> +static uint32_t guest_bar_read(const struct pci_dev *pdev, unsigned int reg,
> +                               void *data)
> +{
> +    const struct vpci_bar *bar = data;
> +    bool hi = false;
> +
> +    if ( bar->type == VPCI_BAR_MEM64_HI )
> +    {
> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
> +        bar--;
> +        hi = true;
> +    }
> +
> +    return bar->guest_reg >> (hi ? 32 : 0);
> +}
> +
> +static uint32_t guest_bar_ignore_read(const struct pci_dev *pdev,
> +                                      unsigned int reg, void *data)
> +{
> +    return 0;
> +}
> +
> +static int bar_ignore_access(const struct pci_dev *pdev, unsigned int reg,
> +                             struct vpci_bar *bar)
> +{
> +    if ( is_hardware_domain(pdev->domain) )
> +        return 0;
> +
> +    return vpci_add_register(pdev->vpci, guest_bar_ignore_read, NULL,
> +                             reg, 4, bar);
> +}
> +
>  static void rom_write(const struct pci_dev *pdev, unsigned int reg,
>                        uint32_t val, void *data)
>  {
> @@ -462,6 +537,7 @@ static int init_bars(struct pci_dev *pdev)
>      struct vpci_header *header = &pdev->vpci->header;
>      struct vpci_bar *bars = header->bars;
>      int rc;
> +    bool is_hwdom = is_hardware_domain(pdev->domain);
>  
>      switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f )
>      {
> @@ -501,8 +577,10 @@ static int init_bars(struct pci_dev *pdev)
>          if ( i && bars[i - 1].type == VPCI_BAR_MEM64_LO )
>          {
>              bars[i].type = VPCI_BAR_MEM64_HI;
> -            rc = vpci_add_register(pdev->vpci, vpci_hw_read32, bar_write, 
> reg,
> -                                   4, &bars[i]);
> +            rc = vpci_add_register(pdev->vpci,
> +                                   is_hwdom ? vpci_hw_read32 : 
> guest_bar_read,
> +                                   is_hwdom ? bar_write : guest_bar_write,
> +                                   reg, 4, &bars[i]);
>              if ( rc )
>              {
>                  pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
> @@ -516,6 +594,11 @@ static int init_bars(struct pci_dev *pdev)
>          if ( (val & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO )
>          {
>              bars[i].type = VPCI_BAR_IO;
> +
> +            rc = bar_ignore_access(pdev, reg, &bars[i]);

This is wrong: you only want to ignore access to IO BARs for Arm, for
x86 we should keep the previous behavior. Even more if you go with
Jan's suggestions to make bar_ignore_access also applicable to dom0.

> +            if ( rc )
> +                return rc;
> +
>              continue;
>          }
>          if ( (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
> @@ -535,6 +618,11 @@ static int init_bars(struct pci_dev *pdev)
>          if ( size == 0 )
>          {
>              bars[i].type = VPCI_BAR_EMPTY;
> +
> +            rc = bar_ignore_access(pdev, reg, &bars[i]);
> +            if ( rc )
> +                return rc;

I would be fine to just call vpci_add_register here, ie;

if ( !is_hwdom )
{
    rc = vpci_add_register(pdev->vpci, guest_bar_ignore_read, NULL,
                           reg, 4, &bars[i]);
     if ( rc )
     {
         ...
     }
}

Feel free to unify the writing of the PCI_COMMAND register on the
error path into a label, as then the error case would simply be a
`goto error;`

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.