[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v9 10/16] vpci/header: emulate PCI_COMMAND register for guests


  • To: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 21 Sep 2023 15:18:15 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=iTvr8ivJCkVHFmvpdahbV+Z3n6GpQtE+mw6m1OxmvXU=; b=KtyfDSCarsqimhhdbMG6x3bLOZgKPF+3kT+b19hkRIRG5gmY5R5pPqv+CUCJY9UfHYhKgiY7OzTPeVPnIs96EtO/ORorW4o721obzzK5KwFZFa2Xf1Uz0MeH3CRszhYhrlQT5jZR1LT9613oK1H4qvDGI8YEM7sGtgQ2apmu5UEOcUpLBvMwBFWC4EC91EVq8G0oWi0QPeHSPYg7iwD9ivEWtGujIgcOKbD5nyaTBhy3JYmnoZwwfnfWVenTRBEle6Jii+p5ps+QZkd3ZlGfJPPu8HEbmljoGAxoA0rBzYD8H6nhBiHVVItac9YC77Ut5joNvlsLeykxLeB8qc9Y5Q==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=WbFh4PCcDwxkOKOxKGL9HRMnkqOLWYG0CoiHk5lhWU1xEaSJkhZq5o8cT/D5sHp+wdYYs3KozfUak0chSKpiU7qbxdgqr7LcoySSpQWCDtprPuR/WPdhlSWe9QnT+m737vWLx0baawSG+iK+jMzmSCjgs9u8XIPGedSZW/mgbNwvrUyXuecxkBQ/47NlM9ITJedgD8QJtb1A8FmHiTlNnKPnn1GEnalcsM2AnWZ6iNGNItF1RPf+cO7BcW+gigcDdu8iMKkTQynMTqh3UhcvebznA32wAtoVKEiN9pArTmjKynSOYTR8gbnwHZIZ95S3qL5za3ZrHjrJBUriedwbnw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Stewart Hildebrand <stewart.hildebrand@xxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • Delivery-date: Thu, 21 Sep 2023 13:19:05 +0000
  • Ironport-data: A9a23:0QNR8KDnUNAKcRVW//Liw5YqxClBgxIJ4kV8jS/XYbTApDIq12QPy TQbCmrQaP6PY2LzKt8ka42+oBhSsJWEzoAyQQY4rX1jcSlH+JHPbTi7wuUcHAvJd5GeExg3h yk6QoOdRCzhZiaE/n9BCpC48D8kk/nOH+KgYAL9EngZbRd+Tys8gg5Ulec8g4p56fC0GArIs t7pyyHlEAbNNwVcbCRMsMpvlDs15K6p4GJC5wRkDRx2lAS2e0c9Xcp3yZ6ZdxMUcqEMdsamS uDKyq2O/2+x13/B3fv8z94X2mVTKlLjFVDmZkh+AsBOsTAbzsAG6Y4pNeJ0VKtio27hc+ada jl6ncfYpQ8BZsUgkQmGOvVSO3kW0aZuoNcrLZUj2CA6IoKvn3bEmp1T4E8K0YIwx9R2OFBQx aAhMzEGRSyNwPOr3b+ic7w57igjBJGD0II3nFhFlWucJ9B/BJfJTuPN+MNS2yo2ioZWB/HCa sEFaD1pKhPdfxlIPVRRA5U79AuqriCnL3sE9xTI/OxrswA/zyQouFTpGMDSddGQA91cg26Tp 37c/nS/CRYfXDCa4WPfqCr21reewEsXXqpNTbm8ztExhmG6/Tw1UgMnf2Dkj6aQ3xvWt9V3b hZ8FjAVhZY18EunX9zsRSqSqXSPvgMfc9dIGuh84waIooLE7gDcCmUaQzppbN09qNRwVTEsz kWOnd7iGXpoqrL9YWmB+6idty/0OSkRIWIIfyYCQSMM5tWlq4Y25jroSttgC6ezgsfCMDf82 S2RrCM+irMQiuYGz6y+u1vAhlqRSoPhSwc04kDdWDuj5wYhPoq9PdT0uR7c8OpKK5ufQh+Zp n8YlsOC7ecIS5aQiCiKR+ZLF7asjxqYDADhbZdUN8FJ31yQF7SLJtk4DO1WTKuxDvs5RA==
  • Ironport-hdrordr: A9a23:naSgn6FA1rM7+vQLpLqFgpLXdLJyesId70hD6qkvc3Fom52j/f xGws5x6faVslkssb8b6LW90Y27MAvhHP9OkPAs1NKZMDUO11HJEGgP1/qA/9SkIVyEygc/79 YdT0EdMqyWMbESt6+TjmiF+pQbsb+6GciT9JrjJhxWPGVXgs9bnmVE4lHxKDwNeOAKP+tPKL Osou584xawc3Ueacq2QlEDQuj4vtXO0L72fBIcABYjyQ+WyRel8qTzHRS01goXF2on+8ZozU H11yjCoomzufCyzRHRk0fV8pRtgdPkjvdTGcCWjcARCzP0ziKlfp5oVbGutC085Muv9FEput /RpApIBbU611rhOkWO5Tf90Qjp1zgjr1fk1F+jmHPm5ej0XigzBcZtjZ9QNkKx0TtogPhMlI Zwm06JvZteCh3N2Az7+tjzThlv0m65u2Arn+I/h2FWFaEedLhSh4oC+149KuZ3IAvKrKQcVM V+BsDV4/hbNXuccnDip2FqhOehW3widy32MHQqi4iw6Xx7jXp5x0wXyIg0hXEb7q8wTJFC+q DtLrlovKsmdL5YUYtNQMM6BeenAG3ERhzBdEiIJ078Ka0BM3XR77bq/bQO4v2wcpBg9up/pH 34aiIYiYcOQTOvNSXXt6c7sSwlAV/NEAgF8/suqaSQ4dbHNfjW2S7qciFcryLvmYRbPiThYY fDBHtnOY6dEYLQI/c24+TfYegmFZBMarxghv8LH3Szn+nsFqrG8sTmTde7HsucLd9jYBK0Pk c+
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Aug 29, 2023 at 11:19:44PM +0000, Volodymyr Babchuk wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> 
> Xen and/or Dom0 may have put values in PCI_COMMAND which they expect
> to remain unaltered. PCI_COMMAND_SERR bit is a good example: while the
> guest's view of this will want to be zero initially, the host having set
> it to 1 may not easily be overwritten with 0, or else we'd effectively
> imply giving the guest control of the bit. Thus, PCI_COMMAND register needs
> proper emulation in order to honor host's settings.
> 
> According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
> Device Control" the reset state of the command register is typically 0,
> so when assigning a PCI device use 0 as the initial state for the guest's view
> of the command register.
> 
> Here is the full list of command register bits with notes about
> emulation:
> 
> PCI_COMMAND_IO - Allow guest to control it.
> PCI_COMMAND_MEMORY - Already handled.
> PCI_COMMAND_MASTER - Allow guest to control it.
> PCI_COMMAND_SPECIAL - Guest can generate special cycles only if it has
> access to host bridge that supports software generation of special
> cycles. In our case guest has no access to host bridges at all. Value
> after reset is 0.
> PCI_COMMAND_INVALIDATE - Allows "Memory Write and Invalidate" commands
> to be generated. It requires additional configuration via Cacheline
> Size register. We are not emulating this register right now and we
> can't expect guest to properly configure it.
> PCI_COMMAND_VGA_PALETTE - Enable VGA palette snooping. This bit is set
> by firmware and we want to leave it as is.
> PCI_COMMAND_PARITY - Controls how device response to parity
> errors. We want this bit to be set by a hardware domain.
> PCI_COMMAND_WAIT - Reserved. Should be 0.
> PCI_COMMAND_SERR - Controls if device can assert SERR.
> The same as for COMMAND_PARITY.
> PCI_COMMAND_FAST_BACK - Optional bit that allows fast back-to-back
> transactions. It is configured by firmware, so we don't want guest to
> control it.
> PCI_COMMAND_INTX_DISABLE - Disables INTx signals. If MSI(X) is
> enabled, device is prohibited from asserting INTx. Value after reset
> is 0. Guest can control it freely.

I'm kind of confused by the text above, does "Guest can control it
freely" imply that the guest is able to modify the bit in the device
command register vs the emulated view that we provide?  If so
INTX_DISABLE should not be allowed direct guest modification.

I'm thinking that we might want to allow guest access to the first 3
bits only, while the rest of the values won't be propagated to
hardware, iow: guest will get a fake view of them.

Have you checked with QEMU how are those bits handled?  That's our
current passthrough reference implementation, and we should aim to
handle those similarly in vPCI.

> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx>
> ---
> Since v9:
> - Reworked guest_cmd_read
> - Added handling for more bits
> Since v6:
> - fold guest's logic into cmd_write
> - implement cmd_read, so we can report emulated INTx state to guests
> - introduce header->guest_cmd to hold the emulated state of the
>   PCI_COMMAND register for guests
> Since v5:
> - add additional check for MSI-X enabled while altering INTX bit
> - make sure INTx disabled while guests enable MSI/MSI-X
> Since v3:
> - gate more code on CONFIG_HAS_MSI
> - removed logic for the case when MSI/MSI-X not enabled
> ---
>  xen/drivers/vpci/header.c | 54 ++++++++++++++++++++++++++++++++++++---
>  xen/drivers/vpci/msi.c    | 10 ++++++++
>  xen/drivers/vpci/msix.c   |  4 +++
>  xen/include/xen/vpci.h    |  3 +++
>  4 files changed, 67 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 1e82217200..e351db4620 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -502,14 +502,37 @@ static int modify_bars(const struct pci_dev *pdev, 
> uint16_t cmd, bool rom_only)
>      return 0;
>  }
>  
> +/* TODO: Add proper emulation for all bits of the command register. */
>  static void cf_check cmd_write(
>      const struct pci_dev *pdev, unsigned int reg, uint32_t cmd, void *data)
>  {
>      struct vpci_header *header = data;
>  
> +    if ( !is_hardware_domain(pdev->domain) )
> +    {
> +        if ( IS_ENABLED(CONFIG_HAS_PCI_MSI) )
> +        {
> +            /* Tell guest that device does not support this */
> +            cmd &= ~PCI_COMMAND_FAST_BACK;
> +        }
> +
> +        header->guest_cmd = cmd;
> +
> +        if ( IS_ENABLED(CONFIG_HAS_PCI_MSI) )
> +        {
> +            /* Do not touch INVALIDATE, PARITY and SERR */
> +            const uint16_t excluded = PCI_COMMAND_INVALIDATE |
> +                PCI_COMMAND_PARITY | PCI_COMMAND_SERR;
> +
> +            cmd &= ~excluded;
> +            cmd |= pci_conf_read16(pdev->sbdf, reg) & excluded;
> +        }

I'm not following why allowing guest setting of those bits is
conditional on HAS_PCI_MSI being build time enabled.

Isn't it equally good or bad to let the guest play with certain bits
regardless of Xen build time configuration?

As said above, I would look at QEMU and see how bits are handled
there.

> +    }
> +
>      /*
> -     * Let Dom0 play with all the bits directly except for the memory
> -     * decoding one.
> +     * Let guest play with all the bits directly except for the memory
> +     * decoding one. Bits that are not allowed for DomU are already
> +     * handled above.
>       */
>      if ( header->bars_mapped != !!(cmd & PCI_COMMAND_MEMORY) )
>          /*
> @@ -523,6 +546,14 @@ static void cf_check cmd_write(
>          pci_conf_write16(pdev->sbdf, reg, cmd);
>  }
>  
> +static uint32_t guest_cmd_read(const struct pci_dev *pdev, unsigned int reg,
> +                               void *data)
> +{
> +    const struct vpci_header *header = data;
> +
> +    return header->guest_cmd;
> +}
> +
>  static void cf_check bar_write(
>      const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
>  {
> @@ -732,8 +763,12 @@ static int cf_check init_bars(struct pci_dev *pdev)
>      }
>  
>      /* Setup a handler for the command register. */
> -    rc = vpci_add_register(pdev->vpci, vpci_hw_read16, cmd_write, 
> PCI_COMMAND,
> -                           2, header);
> +    if ( is_hwdom )
> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read16, cmd_write, 
> PCI_COMMAND,
> +                               2, header);
> +    else
> +        rc = vpci_add_register(pdev->vpci, guest_cmd_read, cmd_write, 
> PCI_COMMAND,
> +                               2, header);

You have used the ternary operator in other places, I would recommend
to also do it here to avoid line duplication.

>      if ( rc )
>          return rc;
>  
> @@ -745,6 +780,17 @@ static int cf_check init_bars(struct pci_dev *pdev)
>      if ( cmd & PCI_COMMAND_MEMORY )
>          pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd & ~PCI_COMMAND_MEMORY);
>  
> +    header->guest_cmd = cmd & ~PCI_COMMAND_MEMORY;

Memory Enable is cleared from the guest view, yet at the end of
init_bars() mappings will be established if the bit is enabled in cmd.
Won't this create a mismatch between the guest view and the contents
of the physmap?

> +
> +    /*
> +     * According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
> +     * Device Control" the reset state of the command register is
> +     * typically all 0's, so this is used as initial value for the guests.
> +     */
> +    if ( header->guest_cmd != 0 )
> +        gprintk(XENLOG_WARNING, "%pp: CMD is not zero: %x", &pdev->sbdf,
> +                header->guest_cmd);

I think it's unlikely for the command register to be zeroed out, I
haven't looked, but I would assume that after a device reset by
pciback it won't be unlikely for some initial state to be set.

> +
>      for ( i = 0; i < num_bars; i++ )
>      {
>          uint8_t reg = PCI_BASE_ADDRESS_0 + i * 4;
> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> index a0733bb2cb..df0f0199b8 100644
> --- a/xen/drivers/vpci/msi.c
> +++ b/xen/drivers/vpci/msi.c
> @@ -70,6 +70,16 @@ static void cf_check control_write(
>  
>          if ( vpci_msi_arch_enable(msi, pdev, vectors) )
>              return;
> +
> +        /*
> +         * Make sure guest doesn't enable INTx while enabling MSI.
> +         * Opposite action (enabling INTx) will be performed in
> +         * vpci_msi_arch_disable call path.

I'm not seeing such code in vpci_msi_arch_disable().  However the
updating of the INTX field should be done after MSI(X) has been
disabled, and hence can only be done after the pci_conf_write16() in
control_write().

I would be fine if you want to leave forcing the setting of INTX to
enabled once MSI has been disabled, any sane guest will do that
itself, but the comment needs updating.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.