[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2] IOMMU: make DMA containment of quarantined devices optional



> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of
> Jürgen Groß
> Sent: 13 December 2019 13:47
> To: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Kevin Tian <kevin.tian@xxxxxxxxx>; Stefano Stabellini
> <sstabellini@xxxxxxxxxx>; Julien Grall <julien@xxxxxxx>; Wei Liu
> <wl@xxxxxxx>; Konrad Wilk <konrad.wilk@xxxxxxxxxx>; George Dunlap
> <George.Dunlap@xxxxxxxxxxxxx>; Andrew Cooper <andrew.cooper3@xxxxxxxxxx>;
> Paul Durrant <paul@xxxxxxx>; Ian Jackson <ian.jackson@xxxxxxxxxx>; xen-
> devel@xxxxxxxxxxxxxxxxxxxx; Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Subject: Re: [Xen-devel] [PATCH v2] IOMMU: make DMA containment of
> quarantined devices optional
> 
> On 13.12.19 14:38, Jan Beulich wrote:
> > On 13.12.2019 14:31, Jürgen Groß wrote:
> >> On 13.12.19 14:21, Jan Beulich wrote:
> >>> On 13.12.2019 14:11, Jürgen Groß wrote:
> >>>> On 13.12.19 13:53, Jan Beulich wrote:
> >>>>> Containing still in flight DMA was introduced to work around certain
> >>>>> devices / systems hanging hard upon hitting an IOMMU fault. Passing
> >>>>> through (such) devices (on such systems) is inherently insecure (as
> >>>>> guests could easily arrange for IOMMU faults to occur). Defaulting
> to
> >>>>> a mode where admins may not even become aware of issues with devices
> can
> >>>>> be considered undesirable. Therefore convert this mode of operation
> to
> >>>>> an optional one, not one enabled by default.
> >>>>>
> >>>>> This involves resurrecting code commit ea38867831da ("x86 / iommu:
> set
> >>>>> up a scratch page in the quarantine domain") did remove, in a
> slightly
> >>>>> extended and abstracted fashion. Here, instead of reintroducing a
> pretty
> >>>>> pointless use of "goto" in domain_context_unmap(), and instead of
> making
> >>>>> the function (at least temporarily) inconsistent, take the
> opportunity
> >>>>> and replace the other similarly pointless "goto" as well.
> >>>>>
> >>>>> In order to key the re-instated bypasses off of there (not) being a
> root
> >>>>> page table this further requires moving the
> allocate_domain_resources()
> >>>>> invocation from reassign_device() to amd_iommu_setup_domain_device()
> (or
> >>>>> else reassign_device() would allocate a root page table anyway);
> this is
> >>>>> benign to the second caller of the latter function.
> >>>>>
> >>>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> >>>>> ---
> >>>>> As far as 4.13 is concerned, I guess if we can't come to an
> agreement
> >>>>> here, the only other option is to revert ea38867831da from the
> branch,
> >>>>> for having been committed prematurely (I'm not so much worried about
> the
> >>>>> master branch, where we have ample time until 4.14). What I surely
> want
> >>>>> to see us avoid is a back and forth in behavior of released
> versions.
> >>>>> (Note that 4.12.2 is similarly blocked on a decision either way
> here.)
> >>>>
> >>>> I'm not really sure we really need to revert ea38867831da before the
> >>>> 4.13 release. It might not be optimal, but I'm quite sure the number
> of
> >>>> cases where this could be an issue is rather small already, and I
> tend
> >>>> to agree with Paul that admins who really care will more likely want
> to
> >>>> select the option where the system will "just work". IMO the
> "noticeable
> >>>> failure" is something which will be selected mostly by developers.
> But
> >>>> I'm not an expert in that area, so I don't want to influence the
> >>>> decision regarding the to be selected default too much.
> >>>
> >>> An admin not wanting to know is, to me, the same as them not wanting
> >>> to know about security issues, and hence not subscribing to our
> >>> announcements lists. I can accept this being a reasonable thing to
> >>> do when it is an _informed_ decision. But with the current
> >>> arrangements there's no way whatsoever for an admin to know.
> >>
> >> Maybe I have misunderstood the current state, but I thought that it
> >> would just silently hide quirky devices without imposing a security
> >> risk. We would not learn which devices are quirky, but OTOH I doubt
> >> we'd get many reports about those in case your patch goes in.
> >
> > We don't want or need such reports, that's not the point. The
> > security risk comes from the quirkiness of the devices - admins
> > may wrongly think all is well and expose quirky devices to not
> > sufficiently trusted guests. (I say this fully realizing that
> > exposing devices to untrusted guests is almost always a certain
> > level of risk.)
> 
> Do we _know_ those devices are problematic from security standpoint?
> Normally the IOMMU should do the isolation just fine. If it doesn't
> then its not the quirky device which is problematic, but the IOMMU.
> 
> I thought the problem was that the quirky devices would not stop all
> (read) DMA even when being unassigned from the guest resulting in
> fatal IOMMU faults. The dummy page should stop those faults to happen
> resulting in a more stable system.

That's right.

> 
> So what are the security problems which are added by this behavior?
> 

Since *not* having the 'sink' page allows a guest pull off a host DoS in the 
presence of such h/w, security is surely increased by having it?

  Paul

> 
> Juergen
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxxx
> https://lists.xenproject.org/mailman/listinfo/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.