[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional
> -----Original Message----- > From: Jan Beulich <jbeulich@xxxxxxxx> > Sent: 10 March 2020 15:44 > To: paul@xxxxxxx > Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; 'Tian, Kevin' <kevin.tian@xxxxxxxxx>; > 'Andrew Cooper' > <andrew.cooper3@xxxxxxxxxx> > Subject: Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of > quarantined devices optional > > On 10.03.2020 16:13, Paul Durrant wrote: > >> -----Original Message----- > >> From: Jan Beulich <jbeulich@xxxxxxxx> > >> Sent: 10 March 2020 15:05 > >> To: paul@xxxxxxx > >> Cc: 'Tian, Kevin' <kevin.tian@xxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx; > >> 'Andrew Cooper' > >> <andrew.cooper3@xxxxxxxxxx> > >> Subject: Re: [PATCH v3] IOMMU: make DMA containment of quarantined devices > >> optional > >> > >> On 10.03.2020 13:30, Paul Durrant wrote: > >>>> -----Original Message----- > >>>> From: Jan Beulich <jbeulich@xxxxxxxx> > >>>> Sent: 10 March 2020 10:27 > >>>> To: Tian, Kevin <kevin.tian@xxxxxxxxx>; Paul Durrant <paul@xxxxxxx> > >>>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Andrew Cooper > >>>> <andrew.cooper3@xxxxxxxxxx> > >>>> Subject: Re: [PATCH v3] IOMMU: make DMA containment of quarantined > >>>> devices optional > >>>> > >>>> On 10.03.2020 04:43, Tian, Kevin wrote: > >>>>>> From: Jan Beulich <jbeulich@xxxxxxxx> > >>>>>> Sent: Monday, March 9, 2020 7:09 PM > >>>>>> > >>>>>> I'm happy to take better suggestions to replace the "full" command line > >>>>>> option and Kconfig prompt tokens. I don't think though that "fault" and > >>>>>> "write-fault" are really suitable there. > >>>>> > >>>>> I think we may just allow both r/w access to scratch page for such bogus > >>>>> device, which may make 'full' more reasonable since we now fully > >>>>> contain in-fly DMAs. I'm not sure about the value of keeping write-fault > >>>>> alone for such devices (just because one observed his specific device > >>>>> only > >>>>> has problem with read-fault). > >>>> > >>>> Well, a fundamental problem I have here is that I still don't know > >>>> the _exact_ conditions for the observed hangs. I consider it unlikely > >>>> for IOMMU read faults to cause hangs, but for write faults to be > >>>> "fine". > >>> > >>> AFAIK it's because the writes are posted and so any faults are just > >>> ignored, whereas a read fault > >> being synchronous causes the device's state machine to lock up. It really > >> is observed behaviour. > >>> > >>>> It would seem more likely to me that e.g. a non-present > >>>> context entry might cause issues. If that was the case, we wouldn't > >>>> need to handle reads and writes differently; we could instead install > >>>> an all zero top level page table. And we'd still get all faults that > >>>> are supposed to surface. But perhaps Paul did try this back then, and > >>>> it turned out to not be an option. > >>>> > >>> > >>> The only info I had was that faults on DMA reads had to avoided > >>> completely. I did not have access to the h/w in question at the > >>> time. I may be able to get it now. > >> > >> I see. The implication then is, as Kevin said, that we mustn't run > >> guests with _any_ IOMMU PTEs having their "read" bits clear. > >> Anything that's "not present" now would need directing to a scratch > >> page. I then further wonder what effect reads to addresses beyond > >> AGAW would have. It may be impossible to arrange for sufficiently > >> secure pass-through with such a device, at which point - again as > >> said by Kevin - there may be little point in the scratch page > >> based quarantining. > >> > > > > Well, I can't say there's little point in it as it does fix a host lock-up. > > > > You say "as Kevin said, that we mustn't run guests with _any_ IOMMU > > PTEs having their "read" bits clear"... I can't find that. I did > > find where he said "In concept any IOMMU page table (dom0, dom_io > > or domU) for such bogus device should not include invalid entry", > > but that's a different thing. > > In which way? In that the PTE would still be a valid entry? It would have read perm clear, yes, but that doesn't make the PTE invalid. > > > However, is a really saying that things will break if any of the > > PTEs has their present bit clear? > > Well, you said that read faults are fatal (to the host). Reads will, > for any address with an unpopulated PTE, result in a fault and hence > by implication be fatal. Oh I see. I thought there was an implication that the IOMMU could not cope with non-present PTEs in some way. Agreed that, when the device is assigned to the guest, then it can arrange (via ballooning) for a non-present entry to be hit by a read transaction, resulting in a lock-up. But dealing with a malicious guest was not the issue at hand... dealing with a buggy device that still tried to DMA after reset and whilst in quarantine was the problem. > > (As an aside, other than in x86's CPU page tables, IOMMU page tables > in both their AMD and Intel incarnations don't have "present" bits - > they have "read" and "write" ones only.) > Oh, ok. I must have misunderstood the purpose of the 'pr' bit in the AMD IOMMU PTE then. Agreed that VT-d does not seem to have an explicit present bit (now that I look at the implementation of dma_pte_present()). Paul > Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |