
Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional



> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Tuesday, March 10, 2020 6:27 PM
> 
> On 10.03.2020 04:43, Tian, Kevin wrote:
> >> From: Jan Beulich <jbeulich@xxxxxxxx>
> >> Sent: Monday, March 9, 2020 7:09 PM
> >>
> >> I'm happy to take better suggestions to replace the "full" command line
> >> option and Kconfig prompt tokens. I don't think though that "fault" and
> >> "write-fault" are really suitable there.
> >
> > I think we may just allow both r/w access to the scratch page for such
> > bogus devices, which may make 'full' more reasonable since we would then
> > fully contain in-flight DMAs. I'm not sure about the value of keeping
> > write-fault alone for such devices (just because one person observed that
> > his specific device only has a problem with read faults).
> 
> Well, a fundamental problem I have here is that I still don't know
> the _exact_ conditions for the observed hangs. I consider it unlikely
> for IOMMU read faults to cause hangs, but for write faults to be
> "fine". It would seem more likely to me that e.g. a non-present
> context entry might cause issues. If that was the case, we wouldn't
> need to handle reads and writes differently; we could instead install
> an all zero top level page table. And we'd still get all faults that
> are supposed to surface. But perhaps Paul did try this back then, and
> it turned out to not be an option.
> 
> The choice of letting writes continue to fault was based on (a) this
> having been tested to work on the affected system(s) and (b) also
> letting writes go to a scratch page requiring a per-device scratch
> page (and associated page tables) rather than a system-wide one, as
> devices coming from different domains would otherwise be able to
> observe data written to memory by respectively "foreign" devices
> (and hence domains).

OK, that is a valid point.
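
Point (b) can be illustrated with a toy model. This is a minimal sketch in plain Python, not Xen code: `ScratchPage`, `QuarantinedDevice` and `leaks` are hypothetical names, and the model assumes that every DMA from a quarantined device lands on a single scratch page.

```python
PAGE_SIZE = 4096


class ScratchPage:
    """Hypothetical page of memory that absorbs stray DMA."""

    def __init__(self):
        self.data = bytearray(PAGE_SIZE)


class QuarantinedDevice:
    """Hypothetical device whose whole DMA address space maps to one scratch page."""

    def __init__(self, name, scratch, writable):
        self.name = name
        self.scratch = scratch
        self.writable = writable  # False models "writes keep faulting"

    def dma_write(self, offset, payload):
        if not self.writable:
            raise PermissionError(f"{self.name}: IOMMU write fault")
        if offset < 0 or offset + len(payload) > PAGE_SIZE:
            raise ValueError("access outside the scratch page")
        self.scratch.data[offset:offset + len(payload)] = payload

    def dma_read(self, offset, length):
        if offset < 0 or offset + length > PAGE_SIZE:
            raise ValueError("access outside the scratch page")
        return bytes(self.scratch.data[offset:offset + length])


def leaks(writer, reader, payload=b"domA-data"):
    """Return True if `reader` can observe what `writer` wrote via DMA."""
    writer.dma_write(0, payload)
    return reader.dma_read(0, len(payload)) == payload


if __name__ == "__main__":
    shared = ScratchPage()
    # One system-wide writable scratch page: data crosses between devices.
    print(leaks(QuarantinedDevice("dev-a", shared, True),
                QuarantinedDevice("dev-b", shared, True)))
    # One scratch page per device: nothing crosses.
    print(leaks(QuarantinedDevice("dev-a", ScratchPage(), True),
                QuarantinedDevice("dev-b", ScratchPage(), True)))
```

In this model a system-wide page is only safe while it is read-only, which matches the reasoning above: as soon as writes are allowed to land, each device needs its own page (and page tables).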

> 
> But this is all guesswork without the firmware writers of affected
> systems giving us at least some hints.
> 
> > Alternatively, I also thought about whether whitelisting the problematic
> > devices through another option (e.g. nofault=b:d:f) could provide more
> > value. In concept, any IOMMU page table (dom0, dom_io or domU)
> > for such a bogus device should not include invalid entries, even when
> > quarantine is not specified. However, I'm not sure whether it's worth
> > going that far...
> 
> Indeed. Question though is whether this bad behavior is device specific
> (rather than e.g. system dependent). Plus - as per above - question
> also is whether it's really leaf (or intermediate) page table entry
> presence which actually matters here. If it was, I agree we shouldn't
> have any non-present entries anywhere in the page table trees.
> 
> Jan
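
The open question above is at which stage of the translation a missing entry matters. A toy two-level walk makes the three cases explicit. This is a sketch under stated assumptions: the dictionaries, the `walk` function and the `(bus, dev, fn)` tuple are hypothetical and do not reflect real VT-d or AMD-Vi structures.

```python
PAGE_SHIFT = 12   # 4 KiB pages
LEVEL_BITS = 9    # 512 entries per table
LEVELS = 2        # toy depth; real hardware uses more levels


def walk(context_table, bdf, addr):
    """Toy translation: return ("ok", frame) or ("fault", stage)."""
    table = context_table.get(bdf)
    if table is None:
        # No context entry for this device at all.
        return ("fault", "context-entry")
    for level in range(LEVELS, 0, -1):
        shift = PAGE_SHIFT + LEVEL_BITS * (level - 1)
        index = (addr >> shift) & ((1 << LEVEL_BITS) - 1)
        entry = table.get(index)
        if entry is None:
            # Non-present entry in the table at this level.
            return ("fault", f"level-{level}")
        table = entry
    return ("ok", table)


if __name__ == "__main__":
    bdf = (0, 0x1f, 3)  # hypothetical bus, device, function
    print(walk({}, bdf, 0))                      # no context entry
    print(walk({bdf: {}}, bdf, 0))               # all-zero top-level table
    print(walk({bdf: {0: {}}}, bdf, 0))          # non-present leaf entry
    print(walk({bdf: {0: {0: 0x1234}}}, bdf, 0)) # fully populated path
```

All three incomplete configurations fault in this model, but at different stages. If only the missing context entry upsets the affected systems, an all-zero top-level table would avoid it while still reporting every fault; if leaf or intermediate presence is what matters, only a fully populated tree would help.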

 

