Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Monday, March 9, 2020 7:09 PM
>
> Containing still in flight DMA was introduced to work around certain
> devices / systems hanging hard upon hitting a "not-present" IOMMU fault.
> Passing through (such) devices (on such systems) is inherently insecure
> (as guests could easily arrange for IOMMU faults of any kind to occur).
> Defaulting to a mode where admins may not even become aware of issues
> with devices can be considered undesirable. Therefore convert this mode
> of operation to an optional one, not one enabled by default.
>
> This involves resurrecting code commit ea38867831da ("x86 / iommu: set
> up a scratch page in the quarantine domain") did remove, in a slightly
> extended and abstracted fashion. Here, instead of reintroducing a pretty
> pointless use of "goto" in domain_context_unmap(), and instead of making
> the function (at least temporarily) inconsistent, take the opportunity
> and replace the other similarly pointless "goto" as well.
>
> In order to key the re-instated bypasses off of there (not) being a root
> page table this further requires moving the allocate_domain_resources()
> invocation from reassign_device() to amd_iommu_setup_domain_device()
> (or else reassign_device() would allocate a root page table anyway);
> this is benign to the second caller of the latter function.
>
> Take the opportunity and also limit the control to builds supporting
> PCI.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> I'm happy to take better suggestions to replace the "full" command line
> option and Kconfig prompt tokens. I don't think though that "fault" and
> "write-fault" are really suitable there.

I think we may just allow both r/w access to the scratch page for such a
bogus device, which may make 'full' more reasonable since we then fully
contain in-flight DMAs.
I'm not sure about the value of keeping write-fault alone for such
devices (just because someone observed that his specific device only has
a problem with read faults). Alternatively, I also thought about whether
whitelisting the problematic devices through another option (e.g.
nofault=b:d:f) could provide more value. In concept, any IOMMU page
table (dom0, dom_io or domU) for such a bogus device should not include
invalid entries, even when quarantine is not specified. However, I'm not
sure whether it's worth going so far...

> ---
> This patch contextually depends on "[PATCH v2 0/5] IOMMU: restrict
> visibility/scope of certain variables".
> ---
> v3: IOMMU_quarantine_basic -> IOMMU_quarantine_fault,
>     IOMMU_quarantine_full -> IOMMU_quarantine_write_fault. Kconfig
>     option (choice) to select default. Limit to HAS_PCI.
> v2: Don't use true/false. Introduce QUARANTINE_SKIP() (albeit I'm not
>     really convinced this is an improvement). Add comment.
>
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1238,7 +1238,7 @@ detection of systems known to misbehave
>
>  Default: `new` unless directed-EOI is supported
>
>  ### iommu
> -    = List of [ <bool>, verbose, debug, force, required, quarantine,
> +    = List of [ <bool>, verbose, debug, force, required, quarantine[=full],
>                  sharept, intremap, intpost, crash-disable,
>                  snoop, qinval, igfx, amd-iommu-perdev-intremap,
>                  dom0-{passthrough,strict} ]
> @@ -1276,11 +1276,15 @@ boolean (e.g. `iommu=no`) can override t
>    will prevent Xen from booting if IOMMUs aren't discovered and enabled
>    successfully.
>
> -* The `quarantine` boolean can be used to control Xen's behavior when
> -  de-assigning devices from guests. If enabled (the default), Xen always
> -  quarantines such devices; they must be explicitly assigned back to Dom0
> -  before they can be used there again. If disabled, Xen will only
> -  quarantine devices the toolstack hass arranged for getting quarantined.
> +* The `quarantine` option can be used to control Xen's behavior when
> +  de-assigning devices from guests. If set to true (the default), Xen
> +  always quarantines such devices; they must be explicitly assigned back
> +  to Dom0 before they can be used there again. If set to "full", still
> +  active DMA will additionally be directed to a "sink" page. If set to
> +  false, Xen will only quarantine devices the toolstack has arranged for
> +  getting quarantined.
> +
> +  This option is only valid on builds supporting PCI.
>
>  * The `sharept` boolean controls whether the IOMMU pagetables are shared
>    with the CPU-side HAP pagetables, or allocated separately. Sharing
> --- a/xen/drivers/passthrough/Kconfig
> +++ b/xen/drivers/passthrough/Kconfig
> @@ -28,3 +28,31 @@ endif
>
>  config IOMMU_FORCE_PT_SHARE
>  	bool
> +
> +choice
> +	prompt "IOMMU device quarantining default behavior"
> +	depends on HAS_PCI
> +	default IOMMU_QUARANTINE_BASIC
> +	---help---
> +	  When a PCI device is assigned to an untrusted domain, it is possible
> +	  for that domain to program the device to DMA to an arbitrary address.
> +	  The IOMMU is used to protect the host from malicious DMA by making
> +	  sure that the device addresses can only target memory assigned to the
> +	  guest. However, when the guest domain is torn down, assigning the
> +	  device back to the hardware domain would allow any in-flight DMA to
> +	  potentially target critical host data. To avoid this, quarantining
> +	  should be enabled. Quarantining can be done in two ways: In its basic
> +	  form, all in-flight DMA will simply be forced to encounter IOMMU
> +	  faults. Since there are systems where doing so can cause host
> +	  lockup, an alternative form is available where writes to memory will
> +	  be made fault, but reads will be directed to a dummy page. The
> +	  implication here is that such reads will go unnoticed, i.e. an admin
> +	  may not become aware of the underlying problem.
> +
> +config IOMMU_QUARANTINE_NONE
> +	bool "none"
> +config IOMMU_QUARANTINE_BASIC
> +	bool "basic"
> +config IOMMU_QUARANTINE_FULL
> +	bool "full"
> +endchoice
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -25,6 +25,9 @@
>  #include "iommu.h"
>  #include "../ats.h"
>
> +/* dom_io is used as a sentinel for quarantined devices */
> +#define QUARANTINE_SKIP(d) ((d) == dom_io && !dom_iommu(d)->arch.root_table)
> +
>  static bool_t __read_mostly init_done;
>
>  static const struct iommu_init_ops _iommu_init_ops;
> @@ -82,18 +85,35 @@ int get_dma_requestor_id(uint16_t seg, u
>      return req_id;
>  }
>
> -static void amd_iommu_setup_domain_device(
> +static int __must_check allocate_domain_resources(struct domain_iommu *hd)
> +{
> +    int rc;
> +
> +    spin_lock(&hd->arch.mapping_lock);
> +    rc = amd_iommu_alloc_root(hd);
> +    spin_unlock(&hd->arch.mapping_lock);
> +
> +    return rc;
> +}
> +
> +static int __must_check amd_iommu_setup_domain_device(
>      struct domain *domain, struct amd_iommu *iommu,
>      uint8_t devfn, struct pci_dev *pdev)
>  {
>      struct amd_iommu_dte *table, *dte;
>      unsigned long flags;
> -    int req_id, valid = 1;
> +    int req_id, valid = 1, rc;
>      u8 bus = pdev->bus;
> -    const struct domain_iommu *hd = dom_iommu(domain);
> +    struct domain_iommu *hd = dom_iommu(domain);
> +
> +    if ( QUARANTINE_SKIP(domain) )
> +        return 0;
> +
> +    BUG_ON(!hd->arch.paging_mode || !iommu->dev_table.buffer);
>
> -    BUG_ON( !hd->arch.root_table || !hd->arch.paging_mode ||
> -            !iommu->dev_table.buffer );
> +    rc = allocate_domain_resources(hd);
> +    if ( rc )
> +        return rc;
>
>      if ( iommu_hwdom_passthrough && is_hardware_domain(domain) )
>          valid = 0;
> @@ -148,6 +168,8 @@ static void amd_iommu_setup_domain_devic
>
>          amd_iommu_flush_iotlb(devfn, pdev, INV_IOMMU_ALL_PAGES_ADDRESS, 0);
>      }
> +
> +    return 0;
>  }
>
>  int __init acpi_ivrs_init(void)
> @@ -217,17 +239,6 @@ int amd_iommu_alloc_root(struct domain_i
>      return 0;
>  }
>
> -static int __must_check allocate_domain_resources(struct domain_iommu *hd)
> -{
> -    int rc;
> -
> -    spin_lock(&hd->arch.mapping_lock);
> -    rc = amd_iommu_alloc_root(hd);
> -    spin_unlock(&hd->arch.mapping_lock);
> -
> -    return rc;
> -}
> -
>  int amd_iommu_get_paging_mode(unsigned long entries)
>  {
>      int level = 1;
> @@ -291,6 +302,9 @@ static void amd_iommu_disable_domain_dev
>      int req_id;
>      u8 bus = pdev->bus;
>
> +    if ( QUARANTINE_SKIP(domain) )
> +        return;
> +
>      BUG_ON ( iommu->dev_table.buffer == NULL );
>      req_id = get_dma_requestor_id(iommu->seg, PCI_BDF2(bus, devfn));
>      table = iommu->dev_table.buffer;
> @@ -337,7 +351,6 @@ static int reassign_device(struct domain
>  {
>      struct amd_iommu *iommu;
>      int bdf, rc;
> -    struct domain_iommu *t = dom_iommu(target);
>
>      bdf = PCI_BDF2(pdev->bus, pdev->devfn);
>      iommu = find_iommu_for_device(pdev->seg, bdf);
> @@ -358,11 +371,10 @@ static int reassign_device(struct domain
>          pdev->domain = target;
>      }
>
> -    rc = allocate_domain_resources(t);
> +    rc = amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
>      if ( rc )
>          return rc;
>
> -    amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
>      AMD_IOMMU_DEBUG("Re-assign %04x:%02x:%02x.%u from dom%d to dom%d\n",
>                      pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
>                      source->domain_id, target->domain_id);
> @@ -519,8 +531,7 @@ static int amd_iommu_add_device(u8 devfn
>          spin_unlock_irqrestore(&iommu->lock, flags);
>      }
>
> -    amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
> -    return 0;
> +    return amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
>  }
>
>  static int amd_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -31,9 +31,24 @@ bool_t __initdata iommu_enable = 1;
>  bool_t __read_mostly iommu_enabled;
>  bool_t __read_mostly force_iommu;
>  bool_t __read_mostly iommu_verbose;
> -bool __read_mostly iommu_quarantine = true;
>  bool_t __read_mostly iommu_crash_disable;
>
> +#define IOMMU_quarantine_none        0 /* aka false */
> +#define IOMMU_quarantine_fault       1 /* aka true */
> +#define IOMMU_quarantine_write_fault 2
> +#ifdef CONFIG_HAS_PCI
> +uint8_t __read_mostly iommu_quarantine =
> +# if defined(CONFIG_IOMMU_QUARANTINE_NONE)
> +    IOMMU_quarantine_none;
> +# elif defined(CONFIG_IOMMU_QUARANTINE_BASIC)
> +    IOMMU_quarantine_fault;
> +# elif defined(CONFIG_IOMMU_QUARANTINE_FULL)
> +    IOMMU_quarantine_write_fault;
> +# endif
> +#else
> +# define iommu_quarantine IOMMU_quarantine_none
> +#endif /* CONFIG_HAS_PCI */
> +
>  static bool __hwdom_initdata iommu_hwdom_none;
>  bool __hwdom_initdata iommu_hwdom_strict;
>  bool __read_mostly iommu_hwdom_passthrough;
> @@ -68,8 +83,12 @@ static int __init parse_iommu_param(cons
>      else if ( (val = parse_boolean("force", s, ss)) >= 0 ||
>                (val = parse_boolean("required", s, ss)) >= 0 )
>          force_iommu = val;
> +#ifdef CONFIG_HAS_PCI
>      else if ( (val = parse_boolean("quarantine", s, ss)) >= 0 )
>          iommu_quarantine = val;
> +    else if ( ss == s + 15 && !strncmp(s, "quarantine=full", 15) )
> +        iommu_quarantine = IOMMU_quarantine_write_fault;
> +#endif
>  #ifdef CONFIG_X86
>      else if ( (val = parse_boolean("igfx", s, ss)) >= 0 )
>          iommu_igfx = val;
> @@ -448,7 +467,7 @@ static int __init iommu_quarantine_init(
>      dom_io->options |= XEN_DOMCTL_CDF_iommu;
>
>      rc = iommu_domain_init(dom_io, 0);
> -    if ( rc )
> +    if ( rc || iommu_quarantine < IOMMU_quarantine_write_fault )
>          return rc;
>
>      if ( !hd->platform_ops->quarantine_init )
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -41,6 +41,9 @@
>  #include "vtd.h"
>  #include "../ats.h"
>
> +/* dom_io is used as a sentinel for quarantined devices */
> +#define QUARANTINE_SKIP(d) ((d) == dom_io && !dom_iommu(d)->arch.pgd_maddr)
> +
>  struct mapped_rmrr {
>      struct list_head list;
>      u64 base, end;
> @@ -1294,6 +1297,9 @@ int domain_context_mapping_one(
>      int agaw, rc, ret;
>      bool_t flush_dev_iotlb;
>
> +    if ( QUARANTINE_SKIP(domain) )
> +        return 0;
> +
>      ASSERT(pcidevs_locked());
>      spin_lock(&iommu->lock);
>      maddr = bus_to_context_maddr(iommu, bus);
> @@ -1548,6 +1554,9 @@ int domain_context_unmap_one(
>      int iommu_domid, rc, ret;
>      bool_t flush_dev_iotlb;
>
> +    if ( QUARANTINE_SKIP(domain) )
> +        return 0;
> +
>      ASSERT(pcidevs_locked());
>      spin_lock(&iommu->lock);
>
> @@ -1609,7 +1618,7 @@ static int domain_context_unmap(struct d
>  {
>      struct acpi_drhd_unit *drhd;
>      struct vtd_iommu *iommu;
> -    int ret = 0;
> +    int ret;
>      u8 seg = pdev->seg, bus = pdev->bus, tmp_bus, tmp_devfn, secbus;
>      int found = 0;
>
> @@ -1625,14 +1634,12 @@ static int domain_context_unmap(struct d
>              printk(VTDPREFIX "d%d:Hostbridge: skip %04x:%02x:%02x.%u unmap\n",
>                     domain->domain_id, seg, bus,
>                     PCI_SLOT(devfn), PCI_FUNC(devfn));
> -        if ( !is_hardware_domain(domain) )
> -            return -EPERM;
> -        goto out;
> +        return is_hardware_domain(domain) ? 0 : -EPERM;
>
>      case DEV_TYPE_PCIe_BRIDGE:
>      case DEV_TYPE_PCIe2PCI_BRIDGE:
>      case DEV_TYPE_LEGACY_PCI_BRIDGE:
> -        goto out;
> +        return 0;
>
>      case DEV_TYPE_PCIe_ENDPOINT:
>          if ( iommu_debug )
> @@ -1676,10 +1683,12 @@ static int domain_context_unmap(struct d
>          dprintk(XENLOG_ERR VTDPREFIX,
>                  "d%d:unknown(%u): %04x:%02x:%02x.%u\n",
>                  domain->domain_id, pdev->type,
>                  seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
> -        ret = -EINVAL;
> -        goto out;
> +        return -EINVAL;
>      }
>
> +    if ( QUARANTINE_SKIP(domain) )
> +        return ret;
> +
>      /*
>       * if no other devices under the same iommu owned by this domain,
>       * clear iommu in iommu_bitmap and clear domain_id in domid_bitmp
> @@ -1705,16 +1714,12 @@ static int domain_context_unmap(struct d
>
>          iommu_domid = domain_iommu_domid(domain, iommu);
>          if ( iommu_domid == -1 )
> -        {
> -            ret = -EINVAL;
> -            goto out;
> -        }
> +            return -EINVAL;
>
>          clear_bit(iommu_domid, iommu->domid_bitmap);
>          iommu->domid_map[iommu_domid] = 0;
>      }
>
> -out:
>      return ret;
>  }
>
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -53,7 +53,9 @@ static inline bool_t dfn_eq(dfn_t x, dfn
>  }
>
>  extern bool_t iommu_enable, iommu_enabled;
> -extern bool force_iommu, iommu_quarantine, iommu_verbose;
> +extern bool force_iommu, iommu_verbose;
> +/* Boolean except for the specific purposes of drivers/passthrough/iommu.c. */
> +extern uint8_t iommu_quarantine;
>
>  #ifdef CONFIG_X86
>  extern enum __packed iommu_intremap {

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel