Re: [Stratos-dev] Xen Rust VirtIO demos work breakdown for Project Stratos
On Sat, 2 Oct 2021, Oleksandr Tyshchenko wrote:
> On Sat, Oct 2, 2021 at 2:58 AM Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
>
> Hi Stefano, all
>
> [Sorry for the possible format issues]
> [I have CCed Julien]
>
>
> On Tue, 28 Sep 2021, Oleksandr Tyshchenko wrote:
> > On Tue, Sep 28, 2021 at 9:26 AM Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
> >
> > Hi Stefano, all
> >
> > [Sorry for the possible format issues]
> >
> >
> > On Mon, 27 Sep 2021, Christopher Clark wrote:
> > > On Mon, Sep 27, 2021 at 3:06 AM Alex Bennée via Stratos-dev <stratos-dev@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx> writes:
> > >
> > > > [[PGP Signed Part:Undecided]]
> > > > On Fri, Sep 24, 2021 at 05:02:46PM +0100, Alex Bennée wrote:
> > > >> Hi,
> > > >
> > > > Hi,
> > > >
> > > >> 2.1 Stable ABI for foreignmemory mapping to non-dom0 ([STR-57])
> > > >> ───────────────────────────────────────────────────────────────
> > > >>
> > > >> Currently the foreign memory mapping support only works for dom0 due
> > > >> to reference counting issues. If we are to support backends running in
> > > >> their own domains this will need to get fixed.
> > > >>
> > > >> Estimate: 8w
> > > >>
> > > >>
> > > >> [STR-57] <https://linaro.atlassian.net/browse/STR-57>
> > > >
> > > > I'm pretty sure it was discussed before, but I can't find relevant
> > > > (part of) thread right now: does your model assumes the backend (running
> > > > outside of dom0) will gain ability to map (or access in other way)
> > > > _arbitrary_ memory page of a frontend domain? Or worse: any domain?
> > >
> > > The aim is for some DomU's to host backends for other DomU's instead of
> > > all backends being in Dom0. Those backend DomU's would have to be
> > > considered trusted because as you say the default memory model of VirtIO
> > > is to have full access to the frontend domains memory map.
> > >
> > >
> > > I share Marek's concern. I believe that there are Xen-based systems
> > > that will want to run guests using VirtIO devices without extending
> > > this level of trust to the backend domains.
> >
> > From a safety perspective, it would be challenging to deploy a system
> > with privileged backends. From a safety perspective, it would be a lot
> > easier if the backend were unprivileged.
> >
> > This is one of those times where safety and security requirements are
> > actually aligned.
> >
> >
> > Well, the foreign memory mapping has one advantage in the context of
> > Virtio use-case which is that Virtio infrastructure in Guest doesn't
> > require any modifications to run on top Xen.
> > The only issue with foreign memory here is that Guest memory actually
> > mapped without its agreement which doesn't perfectly fit into the
> > security model. (although there is one more issue with XSA-300,
> > but I think it will go away sooner or later, at least there are some
> > attempts to eliminate it).
> > While the ability to map any part of Guest memory is not an issue for
> > the backend running in Dom0 (which we usually trust), this will
> > certainly violate Xen security model if we want to run it in other
> > domain, so I completely agree with the existing concern.
>
> Yep, that's what I was referring to.
>
>
> > It was discussed before [1], but I couldn't find any decisions
> > regarding that. As I understand, the one of the possible ideas is to
> > have some entity in Xen (PV IOMMU/virtio-iommu/whatever) that works in
> > protection mode, so it denies all foreign mapping requests from the
> > backend running in DomU by default and only allows requests with
> > mapping which were *implicitly* granted by the Guest before.
> > For example, Xen could be informed which MMIOs hold the queue PFN and
> > notify registers (as it traps the accesses to these registers anyway)
> > and could theoretically parse the frontend request and retrieve
> > descriptors to make a decision which GFNs are actually *allowed*.
> >
> > I can't say for sure (sorry not familiar enough with the topic), but
> > implementing the virtio-iommu device in Xen we could probably avoid
> > Guest modifications at all. Of course, for this to work the Virtio
> > infrastructure in Guest should use DMA API as mentioned in [1].
> >
> > Would the “restricted foreign mapping” solution retain the Xen
> > security model and be accepted by the Xen community? I wonder, has
> > someone already looked in this direction, are there any pitfalls here
> > or is this even feasible?
> >
> > [1] https://lore.kernel.org/xen-devel/464e91ec-2b53-2338-43c7-a018087fc7f6@xxxxxxx/
>
> The discussion that went further is actually one based on the idea that
> there is a pre-shared memory area and the frontend always passes
> addresses from it. For ease of implementation, the pre-shared area is
> the virtqueue itself so this approach has been called "fat virtqueue".
> But it requires guest modifications and it probably results in
> additional memory copies.
>
>
> I got it. Although we would need to map that pre-shared area anyway (I
> presume it could be done at once during initialization), I think it
> much better than map arbitrary pages at runtime.

Yeah that's the idea

> If there is a way for Xen to know the pre-shared area location in
> advance it will be able to allow mapping this region only and deny
> other attempts.

No, but there are patches (not yet upstream) to introduce a way to
pre-share memory regions between VMs using xl:
https://github.com/Xilinx/xen/commits/xilinx/release-2021.1?after=4bd2da58b5b008f77429007a307b658db9c0f636+104&branch=xilinx%2Frelease-2021.1

So I think it would probably be the other way around: xen/libxl
advertises on device tree (or ACPI) the presence of the pre-shared
regions to both domains. Then frontend and backend would start using it.

> I am not sure if the approach you mentioned could be implemented
> completely without frontend changes. It looks like Xen would have to
> learn how to inspect virtqueues in order to verify implicit grants
> without frontend changes.
>
>
> I looked through the virtio-iommu specification and corresponding Linux
> driver but I am sure I don't see all the challenges and pitfalls.
> Having a limited knowledge of IOMMU infrastructure in Linux, below is
> just my guess, which might be wrong.
>
> 1. I think, if we want to avoid frontend changes the backend in Xen
> would need to fully conform to the specification, I am afraid that
> besides just inspecting virtqueues, the backend needs to properly and
> completely emulate the virtio device, handle shadow page tables, etc.
> Otherwise we might break the guest. I expect a huge amount of work to
> implement this properly.

Yeah, I think we would want to stay away from shadow pagetables unless
we are really forced to go there.

> 2. Also, if I got the things correctly, it looks like when enabling
> virtio-iommu, all addresses passed in requests to the virtio devices
> behind the virtio-iommu will be in guest virtual address space (IOVA).
> So we would need to find a way for userspace (if the backend is
> IOREQ server) to translate them to guest physical addresses (IPA) via
> these shadow page tables in the backend in front of mapping them via
> foreign memory map calls.
> So I expect Xen, toolstack and Linux privcmd driver changes and
> additional complexity taking into account how the data structures
> could be accessed (data structures being continuously in IOVA, could
> be discontinuous in IPA, indirect table descriptors, etc).
> I am wondering, would it be possible to have identity IOMMU mapping
> (IOVA == GPA) at the guest side but without bypassing an IOMMU, as we
> need the virtio-iommu frontend to send map/unmap requests, can we
> control this behaviour somehow?
> I think this would simplify things.

None of the above looks easy. I think you are right that we would need
IOVA == GPA to make the implementation feasible and with decent
performance.

But if we need a spec change, then I think Juergen's proposal of
introducing a new transport that uses grant table references instead of
GPAs is worth considering.

> 3. Also, we would probably want to have a single virtio-iommu device
> instance per guest, so all virtio devices which belong to this guest
> will share the IOMMU mapping for the optimization purposes. For this
> to work all virtio devices inside a guest should be attached to the
> same IOMMU domain. Probably, we could control that, but I am not 100%
> sure.
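
To make the IOVA versus GPA point from item 2 a bit more concrete, here is a minimal Rust sketch of what a backend would have to do before it could issue foreign memory map calls. Everything in it is an assumption made for illustration: the `IovaMap` type, its `map_page` and `translate` methods, and the 4 KiB page size are hypothetical and do not come from Xen, the virtio-iommu driver, or any existing crate. It only shows why a buffer that is contiguous in IOVA space may have to be split into several guest-physical segments, and why an identity mapping avoids that split.

```rust
use std::collections::BTreeMap;

/// Assumed page size for this sketch.
const PAGE_SIZE: u64 = 4096;

/// One guest-physically contiguous piece of a buffer: (gpa, length in bytes).
type Segment = (u64, u64);

/// Hypothetical per-guest IOVA -> GPA table, as a backend might rebuild it
/// from the map/unmap requests sent by the virtio-iommu frontend.
/// Keys and values are page frame numbers, not byte addresses.
struct IovaMap {
    pages: BTreeMap<u64, u64>,
}

impl IovaMap {
    fn new() -> Self {
        IovaMap { pages: BTreeMap::new() }
    }

    /// Record that IOVA page `iova_pfn` is backed by guest page `gpa_pfn`.
    fn map_page(&mut self, iova_pfn: u64, gpa_pfn: u64) {
        self.pages.insert(iova_pfn, gpa_pfn);
    }

    /// Translate the byte range [iova, iova + len) into GPA segments.
    /// Pages that are adjacent in both address spaces are merged.
    /// Returns None if any page in the range is unmapped, which is the
    /// case where a restricted backend would have to refuse the request.
    fn translate(&self, iova: u64, len: u64) -> Option<Vec<Segment>> {
        let end = iova.checked_add(len)?;
        let mut segments: Vec<Segment> = Vec::new();
        let mut cur = iova;
        while cur < end {
            let offset = cur % PAGE_SIZE;
            let gpa_pfn = *self.pages.get(&(cur / PAGE_SIZE))?;
            let gpa = gpa_pfn * PAGE_SIZE + offset;
            // Bytes left in this page, capped by what remains of the buffer.
            let chunk = (PAGE_SIZE - offset).min(end - cur);
            let merged = match segments.last_mut() {
                Some(last) if last.0 + last.1 == gpa => {
                    last.1 += chunk;
                    true
                }
                _ => false,
            };
            if !merged {
                segments.push((gpa, chunk));
            }
            cur += chunk;
        }
        Some(segments)
    }
}
```

With a non-identity table, an 8-byte buffer that straddles a page boundary comes back as two segments, each of which would need its own foreign mapping. With an identity table (IOVA == GPA) the same request collapses into a single segment, which is the simplification discussed above.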