
Re: Thunderbolt (and other PCI hotplug) isolation



On Tue, Jan 16, 2024 at 10:58:05AM +0100, Jan Beulich wrote:
> On 16.01.2024 03:20, Marek Marczykowski-Górecki wrote:
> > Hi,
> > 
> > A little background:
> > In Qubes OS we try to isolate external (especially hot pluggable)
> > devices as much as possible. For PCI devices, we do PCI passthrough to
> > dedicated domains (sys-net, sys-usb - mostly the latter). The goal is to
> > prevent an unauthorized device from compromising the whole system, especially
> > using DMA (either initiated by a malicious device itself, or by a
> > compromised driver). For the discussion here, let's ignore what happens
> > before Xen starts.
> > 
> > The matter becomes much more complicated for hot plugged devices. I did
> > some tests recently, enabled PCI hotplug in the dom0 kernel (we have it
> > disabled by default), and this is what I got:
> > 1. Hot plugged devices were properly detected, and dom0 told Xen about
> > them. In my case, it was two PCI bridges and an NVMe disk.
> > 2. New devices were assigned to dom0 automatically.
> > 3. The new leaf device (the disk) can be assigned to an HVM domU and seems to
> > work.
> > 4. The bridges cannot be assigned to a domU.
> > 
> > Now, there are (at least) two problems with the above:
> > i) The second point above: new devices automatically gain the ability to
> > DMA (at least) into dom0 memory. I guess this should be fairly easy to
> > solve for leaf devices by assigning them to a quarantine domain by default.
> > There is the question of how to decide which devices to handle this way
> > (for example, what about external devices already present during Xen/dom0
> > startup), but it feels like a problem solvable with some configuration.
> > And of course dom0 will need to be adjusted to not talk to such devices
> > automatically (via driver blacklisting or a similar approach). But for
> > the bridge devices it's more complicated; that is basically the second
> > point below.
> > 
> > ii) The fourth point above: an external PCI device remains in dom0
> > (including being able to DMA into dom0's memory) just because it happens
> > to have some specific bits in its config space set. When considering a
> > malicious device, it doesn't even need to function as a bridge - it's
> > enough to present itself as a bridge, wait for dom0's thunderbolt
> > driver to authorize the device so it gets assigned dom0's IOMMU context,
> > and boom. On the other hand, a bridge has a privileged function by
> > design; for example, IIUC it takes part in discovering devices behind it
> > (which then need to be properly registered in Xen, assigned an IOMMU
> > context, etc.).
> 
> I may not be following the underlying concept here: If you consider a
> device potentially malicious, why would you even connect it to your
> system? 

The thing is I don't know whether I can trust the device or not. It may
be a device I got from somebody, but also it may be a benign but buggy
device that later got its firmware compromised. Take the example of an
external NVMe disk - I have some data on it to process, and the data
processing itself can easily be isolated in a separate domU (so, I don't
need to assume too much trust in that data), but also I'd like to not
assume trust in the disk itself.

Besides that, there is also an attack where somebody plugs in a device
without my knowledge (when working in an open space, at a conference, etc.).
I'd like to avoid a situation where the screen locker can be very quickly
bypassed with a "screen unlock via DMA" thunderbolt device.

> And if you mean Dom0 to not drive devices, why would you even
> build the respective drivers for such a Dom0 kernel?

I could exclude a lot of drivers indeed, but there are practical issues
with that: for example I do want to use the _internal_ NVMe disk in
dom0.

> > iii) Untested, but it feels like there is a lot of room for various race
> > conditions in the hot plug handling. For example, a device must be
> > allowed to do any DMA only after its IOMMU context is properly configured.
> 
> Isn't that the case already? Any attempt to DMA without respective
> device / context table entry (AMD / Intel terminology) ought to result
> in IOMMU faults.

Ok, so in theory it should be a matter of "just" assigning the device to
the appropriate domain directly, instead of assigning it to dom0 first and
only later re-assigning it.
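
To make the "assign directly" idea a bit more concrete, here is a rough
sketch of the initial-owner decision. It is not existing Xen or Linux code:
`initial_owner`, `is_bridge`, the `external` flag and the "quarantine"
label are names made up for illustration, and how `external` gets decided
is exactly the configuration question from "i". The only hard fact used is
that the Header Type register sits at offset 0x0E of PCI config space, with
bit 7 as the multi-function flag and the low bits giving the layout
(0 = normal device, 1 = PCI-to-PCI bridge, 2 = CardBus bridge).

```python
# Illustrative sketch only; not Xen or Linux code. All names are hypothetical.

HEADER_TYPE_OFFSET = 0x0E      # PCI config space: Header Type register
LAYOUT_MASK = 0x7F             # bit 7 is the multi-function flag, mask it off
BRIDGE_LAYOUTS = {0x01, 0x02}  # PCI-to-PCI bridge, CardBus bridge

DOM0 = "dom0"
QUARANTINE = "quarantine"


def is_bridge(config_space: bytes) -> bool:
    """Return True if the device *claims* to be a bridge in its config space."""
    if len(config_space) <= HEADER_TYPE_OFFSET:
        raise ValueError("config space too short to contain Header Type")
    layout = config_space[HEADER_TYPE_OFFSET] & LAYOUT_MASK
    return layout in BRIDGE_LAYOUTS


def initial_owner(config_space: bytes, external: bool) -> str:
    """Pick the domain a newly discovered device is assigned to first."""
    if not external:
        # Internal devices (e.g. the internal NVMe disk) stay with dom0.
        return DOM0
    if is_bridge(config_space):
        # Open problem "ii": a few self-declared bits are enough to end up
        # in dom0's IOMMU context. This branch is the part without an answer.
        return DOM0
    # External leaf devices never get dom0's IOMMU context, not even briefly.
    return QUARANTINE
```

The point of writing it down is mostly the middle branch: the decision
depends on bits the device itself controls.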

> > I
> > believe thunderbolt technically allows that (plain PCIe hotplug most
> > likely not), but my guess is it's not the case currently.
> > 
> > My question is mostly: what can be done about the "ii" problem above?
> > 
> 

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
