[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2] IOMMU: make DMA containment of quarantined devices optional



On Fri, Dec 13, 2019 at 01:53:29PM +0100, Jan Beulich wrote:
> Containing still in flight DMA was introduced to work around certain
> devices / systems hanging hard upon hitting an IOMMU fault. Passing
> through (such) devices (on such systems) is inherently insecure (as
> guests could easily arrange for IOMMU faults to occur). Defaulting to
> a mode where admins may not even become aware of issues with devices can
> be considered undesirable. Therefore convert this mode of operation to
> an optional one, not one enabled by default.
> 
> This involves resurrecting code commit ea38867831da ("x86 / iommu: set
> up a scratch page in the quarantine domain") did remove, in a slightly
> extended and abstracted fashion. Here, instead of reintroducing a pretty
> pointless use of "goto" in domain_context_unmap(), and instead of making
> the function (at least temporarily) inconsistent, take the opportunity
> and replace the other similarly pointless "goto" as well.
> 
> In order to key the re-instated bypasses off of there (not) being a root
> page table this further requires moving the allocate_domain_resources()
> invocation from reassign_device() to amd_iommu_setup_domain_device() (or
> else reassign_device() would allocate a root page table anyway); this is
> benign to the second caller of the latter function.
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

I'm however not sure if the default quarantine mode should be the
basic or the full one.

At the end of day the full one does prevent hard hangs on specific
systems, but a guest with a device behind such bogus IOMMU can
trivially trigger those anyway.

> ---
> As far as 4.13 is concerned, I guess if we can't come to an agreement
> here, the only other option is to revert ea38867831da from the branch,
> for having been committed prematurely (I'm not so much worried about the
> master branch, where we have ample time until 4.14). What I surely want
> to see us avoid is a back and forth in behavior of released versions.
> (Note that 4.12.2 is similarly blocked on a decision either way here.)
> 
> I'm happy to take better suggestions to replace "full".

I was going to comment on v1, but I really have no better alternative.

> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -30,13 +30,17 @@ bool_t __initdata iommu_enable = 1;
>  bool_t __read_mostly iommu_enabled;
>  bool_t __read_mostly force_iommu;
>  bool_t __read_mostly iommu_verbose;
> -bool __read_mostly iommu_quarantine = true;
>  bool_t __read_mostly iommu_igfx = 1;
>  bool_t __read_mostly iommu_snoop = 1;
>  bool_t __read_mostly iommu_qinval = 1;
>  bool_t __read_mostly iommu_intremap = 1;
>  bool_t __read_mostly iommu_crash_disable;
>  
> +#define IOMMU_quarantine_none  0
> +#define IOMMU_quarantine_basic 1
> +#define IOMMU_quarantine_full  2
> +uint8_t __read_mostly iommu_quarantine = IOMMU_quarantine_basic;

I wonder whether the default should be to use the sink page. Not using
it can lead to a hard hang on certain hardware according to the
description. OTOH if such devices are actually passed through, the
guest itself can trigger such page faults and hence freeze the system.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.