[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] IOMMU: make DMA containment of quarantined devices optional



On Wed, Dec 11, 2019 at 01:53:00PM +0100, Jan Beulich wrote:
> Containing still in flight DMA was introduced to work around certain
> devices / systems hanging hard upon hitting an IOMMU fault. Passing
> through (such) devices (on such systems) is inherently insecure (as
> guests could easily arrange for IOMMU faults to occur). Defaulting to
> a mode where admins may not even become aware of issues with devices can
> be considered undesirable. Therefore convert this mode of operation to
> an optional one, not one enabled by default.
> 
> This involves resurrecting code commit ea38867831da ("x86 / iommu: set
> up a scratch page in the quarantine domain") did remove, in a slightly
> extended fashion. Here, instead of reintroducing a pretty pointless use
> of "goto" in domain_context_unmap(), and instead of making the function
> (at least temporarily) inconsistent, take the opportunity and replace
> the other similarly pointless "goto" as well.
> 
> In order to key the re-instated bypasses off of there (not) being a root
> page table this further requires moving the allocate_domain_resources()
> invocation from reassign_device() to amd_iommu_setup_domain_device() (or
> else reassign_device() would allocate a root page table anyway); this is
> benign to the second caller of the latter function.
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> I'm happy to take better suggestions to replace "full".
> 
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1232,7 +1232,7 @@ detection of systems known to misbehave
>  > Default: `new` unless directed-EOI is supported
>  
>  ### iommu
> -    = List of [ <bool>, verbose, debug, force, required, quarantine,
> +    = List of [ <bool>, verbose, debug, force, required, quarantine[=full],
>                  sharept, intremap, intpost, crash-disable,
>                  snoop, qinval, igfx, amd-iommu-perdev-intremap,
>                  dom0-{passthrough,strict} ]
> @@ -1270,11 +1270,13 @@ boolean (e.g. `iommu=no`) can override t
>      will prevent Xen from booting if IOMMUs aren't discovered and enabled
>      successfully.
>  
> -*   The `quarantine` boolean can be used to control Xen's behavior when
> -    de-assigning devices from guests.  If enabled (the default), Xen always
> -    quarantines such devices; they must be explicitly assigned back to Dom0
> -    before they can be used there again.  If disabled, Xen will only
> -    quarantine devices the toolstack hass arranged for getting quarantined.
> +*   The `quarantine` option can be used to control Xen's behavior when
> +    de-assigning devices from guests.  If set to true (the default), Xen
> +    always quarantines such devices; they must be explicitly assigned back
> +    to Dom0 before they can be used there again.  If set to "full", still
> +    active DMA will additionally be directed to a "sink" page.  If set to
> +    false, Xen will only quarantine devices the toolstack has arranged for
> +    getting quarantined.
>  
>  *   The `sharept` boolean controls whether the IOMMU pagetables are shared
>      with the CPU-side HAP pagetables, or allocated separately.  Sharing
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -85,18 +85,36 @@ int get_dma_requestor_id(uint16_t seg, u
>      return req_id;
>  }
>  
> -static void amd_iommu_setup_domain_device(
> +static int __must_check allocate_domain_resources(struct domain_iommu *hd)
> +{
> +    int rc;
> +
> +    spin_lock(&hd->arch.mapping_lock);
> +    rc = amd_iommu_alloc_root(hd);
> +    spin_unlock(&hd->arch.mapping_lock);
> +
> +    return rc;
> +}
> +
> +static int __must_check amd_iommu_setup_domain_device(
>      struct domain *domain, struct amd_iommu *iommu,
>      uint8_t devfn, struct pci_dev *pdev)
>  {
>      struct amd_iommu_dte *table, *dte;
>      unsigned long flags;
> -    int req_id, valid = 1;
> +    int req_id, valid = 1, rc;
>      u8 bus = pdev->bus;
> -    const struct domain_iommu *hd = dom_iommu(domain);
> +    struct domain_iommu *hd = dom_iommu(domain);
> +
> +    /* dom_io is used as a sentinel for quarantined devices */
> +    if ( domain == dom_io && !hd->arch.root_table )

This condition (and it's Intel counterpart) would be better in a macro
IMO, so that vendor code regardless of the implementation can use the
same macro (and to avoid having to add the same comment in all
instances), ie: IS_DEVICE_QUARANTINED or some such would be fine IMO.

> +        return 0;
> +
> +    BUG_ON(!hd->arch.paging_mode || !iommu->dev_table.buffer);
>  
> -    BUG_ON( !hd->arch.root_table || !hd->arch.paging_mode ||
> -            !iommu->dev_table.buffer );
> +    rc = allocate_domain_resources(hd);
> +    if ( rc )
> +        return rc;
>  
>      if ( iommu_hwdom_passthrough && is_hardware_domain(domain) )
>          valid = 0;
> @@ -151,6 +169,8 @@ static void amd_iommu_setup_domain_devic
>  
>          amd_iommu_flush_iotlb(devfn, pdev, INV_IOMMU_ALL_PAGES_ADDRESS, 0);
>      }
> +
> +    return 0;
>  }
>  
>  int __init acpi_ivrs_init(void)
> @@ -220,17 +240,6 @@ int amd_iommu_alloc_root(struct domain_i
>      return 0;
>  }
>  
> -static int __must_check allocate_domain_resources(struct domain_iommu *hd)
> -{
> -    int rc;
> -
> -    spin_lock(&hd->arch.mapping_lock);
> -    rc = amd_iommu_alloc_root(hd);
> -    spin_unlock(&hd->arch.mapping_lock);
> -
> -    return rc;
> -}
> -
>  int amd_iommu_get_paging_mode(unsigned long entries)
>  {
>      int level = 1;
> @@ -287,6 +296,10 @@ static void amd_iommu_disable_domain_dev
>      int req_id;
>      u8 bus = pdev->bus;
>  
> +    /* dom_io is used as a sentinel for quarantined devices */
> +    if ( domain == dom_io && !dom_iommu(domain)->arch.root_table )
> +        return;
> +
>      BUG_ON ( iommu->dev_table.buffer == NULL );
>      req_id = get_dma_requestor_id(iommu->seg, PCI_BDF2(bus, devfn));
>      table = iommu->dev_table.buffer;
> @@ -333,7 +346,6 @@ static int reassign_device(struct domain
>  {
>      struct amd_iommu *iommu;
>      int bdf, rc;
> -    struct domain_iommu *t = dom_iommu(target);
>  
>      bdf = PCI_BDF2(pdev->bus, pdev->devfn);
>      iommu = find_iommu_for_device(pdev->seg, bdf);
> @@ -354,11 +366,10 @@ static int reassign_device(struct domain
>          pdev->domain = target;
>      }
>  
> -    rc = allocate_domain_resources(t);
> +    rc = amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
>      if ( rc )
>          return rc;
>  
> -    amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
>      AMD_IOMMU_DEBUG("Re-assign %04x:%02x:%02x.%u from dom%d to dom%d\n",
>                      pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
>                      source->domain_id, target->domain_id);
> @@ -515,8 +526,7 @@ static int amd_iommu_add_device(u8 devfn
>          spin_unlock_irqrestore(&iommu->lock, flags);
>      }
>  
> -    amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
> -    return 0;
> +    return amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
>  }
>  
>  static int amd_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -30,13 +30,17 @@ bool_t __initdata iommu_enable = 1;
>  bool_t __read_mostly iommu_enabled;
>  bool_t __read_mostly force_iommu;
>  bool_t __read_mostly iommu_verbose;
> -bool __read_mostly iommu_quarantine = true;
>  bool_t __read_mostly iommu_igfx = 1;
>  bool_t __read_mostly iommu_snoop = 1;
>  bool_t __read_mostly iommu_qinval = 1;
>  bool_t __read_mostly iommu_intremap = 1;
>  bool_t __read_mostly iommu_crash_disable;
>  
> +#define IOMMU_quarantine_none  false
> +#define IOMMU_quarantine_basic true
> +#define IOMMU_quarantine_full  2
> +uint8_t __read_mostly iommu_quarantine = IOMMU_quarantine_basic;

I don't really like to use booleans with non-boolean variables.
Wouldn't it be better to just use plain numbers, or even better an
enum?

> +
>  static bool __hwdom_initdata iommu_hwdom_none;
>  bool __hwdom_initdata iommu_hwdom_strict;
>  bool __read_mostly iommu_hwdom_passthrough;
> @@ -81,6 +85,8 @@ static int __init parse_iommu_param(cons
>              force_iommu = val;
>          else if ( (val = parse_boolean("quarantine", s, ss)) >= 0 )
>              iommu_quarantine = val;
> +        else if ( ss == s + 15 && !strncmp(s, "quarantine=full", 15) )
> +            iommu_quarantine = IOMMU_quarantine_full;
>          else if ( (val = parse_boolean("igfx", s, ss)) >= 0 )
>              iommu_igfx = val;
>          else if ( (val = parse_boolean("verbose", s, ss)) >= 0 )
> @@ -451,7 +457,7 @@ static int __init iommu_quarantine_init(
>      dom_io->options |= XEN_DOMCTL_CDF_iommu;
>  
>      rc = iommu_domain_init(dom_io, 0);
> -    if ( rc )
> +    if ( rc || iommu_quarantine < IOMMU_quarantine_full )
>          return rc;
>  
>      if ( !hd->platform_ops->quarantine_init )
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1291,6 +1291,10 @@ int domain_context_mapping_one(
>      int agaw, rc, ret;
>      bool_t flush_dev_iotlb;
>  
> +    /* dom_io is used as a sentinel for quarantined devices */
> +    if ( domain == dom_io && !hd->arch.pgd_maddr )
> +        return 0;
> +
>      ASSERT(pcidevs_locked());
>      spin_lock(&iommu->lock);
>      maddr = bus_to_context_maddr(iommu, bus);
> @@ -1537,6 +1541,10 @@ int domain_context_unmap_one(
>      int iommu_domid, rc, ret;
>      bool_t flush_dev_iotlb;
>  
> +    /* dom_io is used as a sentinel for quarantined devices */
> +    if ( domain == dom_io && !dom_iommu(domain)->arch.pgd_maddr )
> +        return 0;
> +
>      ASSERT(pcidevs_locked());
>      spin_lock(&iommu->lock);
>  
> @@ -1598,7 +1606,7 @@ static int domain_context_unmap(struct d
>  {
>      struct acpi_drhd_unit *drhd;
>      struct vtd_iommu *iommu;
> -    int ret = 0;
> +    int ret;
>      u8 seg = pdev->seg, bus = pdev->bus, tmp_bus, tmp_devfn, secbus;
>      int found = 0;
>  
> @@ -1614,14 +1622,12 @@ static int domain_context_unmap(struct d
>              printk(VTDPREFIX "d%d:Hostbridge: skip %04x:%02x:%02x.%u 
> unmap\n",
>                     domain->domain_id, seg, bus,
>                     PCI_SLOT(devfn), PCI_FUNC(devfn));
> -        if ( !is_hardware_domain(domain) )
> -            return -EPERM;
> -        goto out;
> +        return is_hardware_domain(domain) ? 0 : -EPERM;
>  
>      case DEV_TYPE_PCIe_BRIDGE:
>      case DEV_TYPE_PCIe2PCI_BRIDGE:
>      case DEV_TYPE_LEGACY_PCI_BRIDGE:
> -        goto out;
> +        return 0;
>  
>      case DEV_TYPE_PCIe_ENDPOINT:
>          if ( iommu_debug )
> @@ -1665,10 +1671,13 @@ static int domain_context_unmap(struct d
>          dprintk(XENLOG_ERR VTDPREFIX, "d%d:unknown(%u): %04x:%02x:%02x.%u\n",
>                  domain->domain_id, pdev->type,
>                  seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
> -        ret = -EINVAL;
> -        goto out;
> +        return -EINVAL;
>      }
>  
> +    /* dom_io is used as a sentinel for quarantined devices */
> +    if ( domain == dom_io && !dom_iommu(domain)->arch.pgd_maddr )
> +        return ret;
> +
>      /*
>       * if no other devices under the same iommu owned by this domain,
>       * clear iommu in iommu_bitmap and clear domain_id in domid_bitmp
> @@ -1694,16 +1703,12 @@ static int domain_context_unmap(struct d
>  
>          iommu_domid = domain_iommu_domid(domain, iommu);
>          if ( iommu_domid == -1 )
> -        {
> -            ret = -EINVAL;
> -            goto out;
> -        }
> +            return -EINVAL;
>  
>          clear_bit(iommu_domid, iommu->domid_bitmap);
>          iommu->domid_map[iommu_domid] = 0;
>      }
>  
> -out:
>      return ret;
>  }
>  
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -53,8 +53,9 @@ static inline bool_t dfn_eq(dfn_t x, dfn
>  }
>  
>  extern bool_t iommu_enable, iommu_enabled;
> -extern bool force_iommu, iommu_quarantine, iommu_verbose, iommu_igfx;
> +extern bool force_iommu, iommu_verbose, iommu_igfx;
>  extern bool_t iommu_snoop, iommu_qinval, iommu_intremap, iommu_intpost;
> +extern uint8_t iommu_quarantine;

Exporting this variable without the paired defines seems pointless,
how are external callers supposed to figure out the quarantine mode
without the IOMMU_quarantine_* defines?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.