Xen project Mailing List

Re: [Xen-devel] [RFC 0/4] TEE mediator framework + OP-TEE mediator

To: Julien Grall <julien.grall@xxxxxxxxxx>, jens.wiklander@xxxxxxxxxx

From: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx>

Date: Tue, 24 Oct 2017 22:02:28 +0300

Cc: Julien Grall <julien.grall@xxxxxxx>, nd@xxxxxxx, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxx

Delivery-date: Tue, 24 Oct 2017 19:02:49 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

Hi Julien, Jens,

I'm looping in Jens Wiklander. He is one of the OP-TEE maintainers. He also
maintains the TEE subsystem in the Linux kernel. I CC'ed him on patch 4/4
because it was the only one that concerned OP-TEE. But it looks like the
discussion in this thread revolves primarily around OP-TEE, so I'm adding
him here.

Jens, if you want to catch up, you can find the whole thread at [1].

On Tue, Oct 24, 2017 at 06:33:20PM +0100, Julien Grall wrote:
> Hi,
>
> On 23/10/17 21:11, Volodymyr Babchuk wrote:
> >On Mon, Oct 23, 2017 at 05:59:44PM +0100, Julien Grall wrote:
> >
> >>Hi Volodymyr,
> >Hi Julien,
> >
> >>Let me begin the e-mail with I am not totally averse to putting the TEE
> >>mediator in Xen. At the moment, I am trying to understand the whole picture.
> >Thanks for clarification. This is really reassuring :)
> >In my turn, I'm not totally against TEE mediators in stubdoms. I'm only
> >concerned about required efforts.
> >
> >>On 20/10/17 18:37, Volodymyr Babchuk wrote:
> >>>On Fri, Oct 20, 2017 at 02:11:14PM +0100, Julien Grall wrote:
> >>>>On 17/10/17 16:59, Volodymyr Babchuk wrote:
> >>>>>On Mon, Oct 16, 2017 at 01:00:21PM +0100, Julien Grall wrote:
> >>>>>>On 11/10/17 20:01, Volodymyr Babchuk wrote:
> >>>>>>>I want to present TEE mediator, that was discussed earlier ([1]).
> >>>>>>>
> >>>>>>>I selected design with built-in mediators. This is easiest way,
> >>>>>>>it removes many questions, it is easy to implement and maintain
> >>>>>>>(at least I hope so).
> >>>>>>
> >>>>>>Well, it may close the technical questions but still leave the security
> >>>>>>impact unanswered. I would have appreciated a summary of each approach
> >>>>>>and explain the pros/cons.
> >>>>>This is the most secure way also. In terms of trust between guests and
> >>>>>Xen at least. I'm worked with OP-TEE guys mostly, so when I hear about
> >>>>>"security", my first thoughts are "Can TEE OS trust to XEN as a
> >>>>>mediator? Can TEE client trust to XEN as a mediator?".
> >>>>>And with current approach answer is "yes, they can, especially if XEN
> >>>>>is a part of a chain of trust".
> >>>>>
> >>>>>But you probably wanted to ask "Can guest compromise whole system by
> >>>>>using TEE mediator or TEE OS?". This is an interesting question.
> >>>>>First let's discuss requirements for a TEE mediator. So, mediator
> >>>>>should be able to:
> >>>>>
> >>>>> * Receive request to handle trapped SMC. This request should include
> >>>>>   user registers + some information about guest (at least domain id).
> >>>>> * Pin/unpin domain memory pages.
> >>>>> * Map domain memory pages into own address space with RW access.
> >>>>> * Issue real SMC to a TEE.
> >>>>> * Receive information about guest creation and destruction.
> >>>>> * (Probably) inject IRQs into a domain (this can be not a requester
> >>>>>   domain, but some other domain, that also called to TEE).
> >>>>>
> >>>>>This is a minimal list of requirements. I think, this should be enough to
> >>>>>implement mediator for OP-TEE. But I can't say for sure for other TEEs.
> >>>>>
> >>>>>Let's consider possible approaches:
> >>>>>
> >>>>>1. Mediator right in XEN, works at EL2.
> >>>>>   Pros:
> >>>>>    * Mediator can use all XEN APIs
> >>>>>    * As mediator resides in XEN, it can be checked together with XEN
> >>>>>      for a validity (trusted boot).
> >>>>>    * Mediator is initialized before Dom0. Dom0 can work with a TEE.
> >>>>>    * No extra context switches, no special ABI between XEN and
> >>>>>      mediator.
> >>>>>
> >>>>>   Cons:
> >>>>>    * Because it lives in EL2, it can compromise whole hypervisor,
> >>>>>      if there is a security bug in mediator code.
> >>>>>    * No support for closed source TEEs.
> >>>>
> >>>>Another cons is you assume TEE API is fully stable and will not change.
> >>>>Imagine a new function is added, or a vendor decided to hence with a new
> >>>>set of API. How will you know Xen is safe to use it?
> >>>With whitelisting, as you correctly suggested below.
> >>>XEN will process only known requests. Anything that looks unfamiliar
> >>>should be rejected.
> >>
> >>Let's imagine the guest is running on a platform with a newer version of
> >>TEE. This guest will probe the version of OP-TEE and knows the new function
> >>is present.
> >This request will be handled by mediator. At this moment, OP-TEE client does
> >not use versions. Instead it uses capability flags. So, mediator should
> >filter all unknown caps. This will force guest to use only supported
> >subset of features.
>
> One more question. Does it mean new functions will never be added in current
> capabilities?

AFAIK, no. That would break backward compatibility.

> >If, in the future, client will rely on versions (i.e. due to dramatic
> >protocol change), mediator can either downgrade version or refuse to work
> >at all.
>
> Makes sense.
>
> >
> >>If as you said Xen is using a whitelist, this means the hypervisor will
> >>return unimplemented.
> >>How do you expect the guest to behave in that case?
> >As I said above, guest should downgrade to supported features subset.
> >
> >>Note that I think a whitelist is a good idea, but I think we need to think a
> >>bit more about the implication.
> >At least now OP-TEE is designed in a such way, that it is compatible in both
> >ways. I'm sure that future OP-TEE development will be done with
> >virtualization support in mind, so it will not break existing setups.
>
> It would be good to have the two communities talking together. So we can
> make sure the virtualization support is not going in the wrong direction.
>
> Similarly, it would be nice that someone from the OP-TEE maintainers give
> feedback on the approach suggested in Xen.

Yep. I added Jens, as I said above.

> >
> >>>
> >>>>If it is not safe, this means you have a whitelist solution and therefore
> >>>>tie Xen to a specific OP-TEE version.
> >>>>So if you need to use a new function
> >>>>you would need to upgrade Xen making the cost of using new version
> >>>>potentially high.
> >>>Yes, any ABI change between OP-TEE and its clients will require mediator
> >>>upgrade. Luckily, OP-TEE maintains ABI backward-compatible, so if you'll
> >>>install old XEN and new OP-TEE, OP-TEE will use only that subset of ABI,
> >>>which is known to XEN.
> >>>
> >>>>Also, correct me if I am wrong, OP-TEE is a BSD 2-Clause. This means you
> >>>>impose anyone wanted to modify OP-TEE for their own purpose can make a
> >>>>closed version of the TEE. But if you need to introspect/whitelist call,
> >>>>you impose the vendor to expose their API.
> >>>Basically yes. Is this bad? OP-TEE driver in Linux is licensed under GPL
> >>>v2.
> >>>If vendor modifies interface between OP-TEE and Linux, they anyways
> >>>obliged to expose API.
> >>
> >>Pardon me for potential stupid questions, my knowledge of OP-TEE is limited.
> >>
> >>My understanding is the OP-TEE will provide a generic way to access
> >>different Trusted Application. While OP-TEE API may be generic, the TA API
> >>is custom. AFAICT the latter is not part of Linux driver.
> >Yes, you are perfectly right there.
> >
> >>So here my questions:
> >> 1) Are you planning allow all the guests to access every Trusted
> >>Applications?
> >This is a good question. There are two types of TAs supported in
> >OP-TEE: real TAs (as they are described in GlobalPlatform specs) and
> >PseudoTAs. The latter ones are statically linked right into OP-TEE
> >kernel and execute at S-EL1 level.
> >Real TAs are provided by client. That means that NW userspace
> >supplicant loads TA into OP-TEE. OP-TEE checks signature for the TA
> >and then runs it in S-EL0.
> >So, I'm planning to allow client to work with any real TA. I can't see
> >real problem there.
>
> Are the real TAs going to be shared between guests? Or will each guest have
> their own one?

No, we don't plan this.
At least at this moment. Every guest will have its own instance of a TA.

> Will you allow every guests loading real TAs?

Yes, if a guest has access to the TEE, it can load a TA. Otherwise there is
no sense in using the TEE. OP-TEE core itself does not provide useful
services to clients.

Just to be sure: a client can't execute just any code as a TA. A TA must be
signed by the product vendor. In the real world you can't take a TA from
product A and run it on product B, even if they are built on the same SoC.
Every product vendor should generate its own set of keys and install them in
OP-TEE.

> >PseudoTAs can be used to access some platform-specific features, and thus
> >it can be quite dangerous to allow anyone call them.
> >But, generic OP-TEE includes only test and benchmark PseudoTAs, that
> >should be disabled on production builds. So, I don't see why generic
> >mediator should distinguish them. I think, XSM can be employed later
> >to control which guest can access which PseudoTA. But this is not
> >target for first version.
>
> I guess the first version will forbid access to PseudoTA from all the guests
> but Dom0?

Actually no. All TAs are identified by UUIDs. The mediator can distinguish a
TA from a PseudoTA. I have another argument in mind: vanilla OP-TEE does not
offer any production PseudoTAs. If a platform/product vendor extends OP-TEE
with some PTAs, then they should add the corresponding filtering
functionality to the mediator, because only the vendor knows what a given
PTA does and how it can affect system security.

>
> >> 2) Will you ever need to introspect those messages?
> >No, I don't.
>
> I guess that's because all the TAs should followed the specified message
> protocol?

Something like that. GlobalPlatform defines an API for TAs and clients.
Basically this API has 4 primary commands: OpenSession, CloseSession,
InvokeCommand, RequestCancellation. A command is identified by an integer
number. This command number has meaning only for a certain TA.
So, for example, command number 8 to TA A can mean "generate random number
for me", while command number 8 to TA B can mean "make payment".

Every TEE implements the protocol for these calls as it wishes. This means
that GlobalPlatform does not define how the OP-TEE client should send these
requests to OP-TEE OS. The mediator will see that the client calls the
InvokeCommand function, but it has no idea what this command will do. And it
doesn't have to. At this level of the protocol OP-TEE itself ensures
integrity.

> >
> >>>>>
> >>>>>2. Mediator in a stubdomain. Works at EL1.
> >>>>>   Pros:
> >>>>>    * Mediator is isolated from hypervisor (but it still can do
> >>>>>      potentially dangerous things like mapping domain memory or
> >>>>>      pinning pages).
> >>>>>    * One can legally create and use mediator for a closed-source TEE.
> >>>>
> >>>>    * Easier to upgrade to a new version of OP-TEE.
> >>>Yes, this is true. But what about interface between XEN and mediator?
> >>>This is a new entity that should be maintained. Will I be able to use
> >>>new XEN with old mediator? Or new mediator with old XEN?
> >>
> >>Why would you need to specific interface for the mediator? (see more below)
> >At least following features in XEN control (I hope this is right term) API
> >are missing right now:
> > - domain creation/destruction hooks
> > - ability to intercept only certain SMCs
> > - way to inject IRQs to other guests
> >
> >Also, see more below
> >>>
> >>>>>   Cons:
> >>>>>    * Overhead in XEN<->Mediator communication.
> >>>>>    * XEN needs to be modified to boot mediator domain before Dom0.
> >>>>
> >>>>Is it a really cons? In the past, we had discussion to allow Xen creating
> >>>>multiple domain, avoiding the overhead of Dom0. This could also benefits
> >>>>here.
> >>>As I understand, this is a significant change in XEN. What are the chances,
> >>>that community will accept this change? As I can see, immediate benefit
> >>>of this is only TEE mediator support.
> >>>Looks like no one except us
> >>>interested in this topic.
> >>
> >>The GSOC project was not added because of TEE mediator. We had companies
> >>showing interest to start multiple domains at the same time. This would
> >>significantly shrink down the boot time of the whole platform.
> >Yes. Actually, we also interested in a faster boot. But my point was
> >that what we need for mediator is not the same that is described in
> >GSOC project. Functionality described at GSOC page has multiple uses.
> >But for mediator we need something more intricate: as I said below,
> >ability to delay boot of hwdom (and other domains).
>
> Not really, you could the domain could block when issuing an SMC until the
> mediator is up and running.

Do you mean that if a domain tries to execute an SMC and the mediator is not
ready, then the hypervisor should pause all of the domain's vCPUs? That can
be destructive for the hw domain.

> >
> >>>
> >>>BTW, I checked "Xen on ARM: create multiple guests from device
> >>>tree" at [1]. This is close, to what we need, but not exactly. You see,
> >>>TEE mediator should be created *before* Dom0. So actually TEE mediator
> >>>will receive domid 0. I suspect that this only change will break
> >>>many things.
> >>
> >>Can you please give example?
> >I'm sure that I seen checks for domid == 0 before, but now I can't find any.
> >Probably, that was closed-source backends. So, sorry for false accusation :)
> >
> >>Technically none of the hypervisor, Linux and the toolstack should rely on
> >>dom0 to be domid 0.
> >>
> >>AFAIK, the hypervisor and Linux are free of them. It might be possible to
> >>have few hardcoded in the toolstack, but they should really disappear.
> >Totally agree there.
> >
> >>However, I can't see why you require the mediator to use domid 0. You could
> >>for example keep the hardware domain paused until the mediator has started.
> >So this will like: construct dom0, construct and run mediator domain,
> >run dom0 by signal from DomMediator?
> >Probably this will work.
> >
> >>>
> >>>>>
> >>>>>And yes, it seems obvious, but I want to say this explicitly: generic
> >>>>>TEE mediator framework should and will use XSM to control which domain
> >>>>>can work with TEE. So, if you don't trust your guest - don't let it
> >>>>>to call TEE at all.
> >>>>
> >>>>Correct me if I am wrong. TEE could be used by Android guest which likely
> >>>>run the user apps... right? So are you saying you fully trust that guest
> >>>>and obviously the user installing rogue app?
> >>>I don't think that app downloaded from Play Market can access OP-TEE
> >>>directly.
> >>>OP-TEE can be used by Android itself as a key storage or to access to a SE,
> >>>for example. But 3rd app that issues TEE calls... I don't think so.
> >>
> >>You didn't get my point here. That rogue app may be able to break into
> >>kernel via an exploit or have enough privilege to break the guest. Who knows
> >>what it will be able to do after...
> >Only what hypervisor and TEE will allow it to do. Look, OP-TEE was not
> >designed to rule the machine. There is ARM TF for that :) OP-TEE's task is
> >to provide some safer environment for sensitive data and code. This
> >environment has well-defined interfaces and is designed to be as safe as
> >possible.
> >
> >If rogue app breaks into kernel, then it can issue any SMC which it wants.
> >But OP-TEE does not trust to NW. Hypervisor does not trust to guests.
> >Mediator should be written in the same way.
> >
> >So, what can do rogue kernel? As I know - it can cause DoS in OP-TEE. This is
> >known issue. If there is a security bug in OP-TEE, it probably can overcome
> >whole system. But this is true for any system running OP-TEE.
>
> I agree that if you take over OP-TEE, you will take over any system. This is
> not specific to hypervisor.

Yes. But it just occurred to me that mediator+OP-TEE *can* be more secure
than just OP-TEE. You see, the mediator should perform its own security
checks before forwarding a call to OP-TEE.
So if OP-TEE misses something, the mediator can back it up. I wouldn't rely
on this. It is just an interesting thought :-)

> Baremetal OS taking down the platform will only harm itself. A guest OS
> could harm the whole platform.

Can't argue with that. I think that this feature (shared TEE) is not
suitable for, say, VPSes. But it can work just fine on smartphones or on
other embedded devices, where the vendor defines the whole system.

> What I am not sure yet, maybe because of my lack of knowledge around OP-TEE,
> who is going to protect a TA to access all the NS memory?

TAs run in S-EL0. They can't control the MMU. Before every TA invocation,
OP-TEE sets up the MMU in such a way that the TA sees only the shared memory
arguments passed by the client for this particular invocation.

> >
> >If there is a security flaw in mediator - it can compromise either
> >hypervisor, or DomMediator and all TEE-capable guests. Yes, this is a risk.
> >
> >>The whole point of using an hypervisor is to isolate guest from each other.
> >>So what is the isolation model with OP-TEE and the mediator?
> >OP-TEE is written to isolate TAs, resources and clients from each other.
> >Currently there are no plans for interaction between TAs from different VMs,
> >no resource sharing, nothing like this.
> >What do you mean under "isolation model"? Can you give some example?
>
> By that I meant, who is going to prevent guest A to access guest B data. I
> think you partly answered to my question by the "OP-TEE is written to
> isolate TAs". The access to NS memory question above will fill the rest I
> think.

Yes. Every TA runs in its own context, and there is no trust even between
TAs.

> >
> >>>
> >>>>>This feature is not implemented in this RFC only because
> >>>>>currently only Dom0 calls are supported.
> >>>>>
> >>>>>>This would help to understand that maybe it is an easy way but also
> >>>>>>still secure...
> >>>>>In previous discussion we considered only two variants: in XEN or outside
> >>>>>XEN.
> >>>>>Stubdomain approach looks more secure, but I'm not sure that it is
> >>>>>true.
> >>>>>Such stubdomain will need access to all guests memory. If you managed to
> >>>>>gain control on mediator stubdomain, you can do anything you want with
> >>>>>all guests.
> >>>>
> >>>>That's slightly untrue. The stubdomain will only be able to mess with
> >>>>domains using TEE.
> >>>Yes, this is more strict. Then either you are not allowing your privileged
> >>>domain to use TEE, or your system may be compromised anyways.
> >>
> >>Can you give an example of privilege domain for you? Do you consider Android
> >>a privilege domain?
> >In this case I used term "privileged domain" in XEN meaning: is_privileged
> >== 1.
> >Android is not privileged domain, by all means.
> >I wanted to say that if you allow Dom0 to access TEE, then hacked
> >DomMediator can compromise Dom0 and the hypervisor.
>
> And I never disagreed in that. This is the non-controversial part :).
>
> >
> >>>>>
> >>>>>>To be clear, this series don't look controversial at least for OP-TEE.
> >>>>>>What I am more concerned is about DomU supports.
> >>>>>Your concern is that rogue DomU can compromise whole system, right?
> >>>>
> >>>>Yes. You seem to assume that DomU using TEE will always be trusted, I
> >>>>think this is the wrong approach if the user is able to interact directly
> >>>>with those guests. See above.
> >>>No, I am not assuming that DomU that calls TEE should be trusted. Why do
> >>>you think so? It should be able to use TEE services, but this does not
> >>>mean that XEN should trust it.
> >>
> >>In a previous answer you said: "So, if you don't trust your guest - don't
> >>let it". For me, this clearly means you consider that DomU using TEE are
> >>trusted.
> >>
> >>So can you clarify by what you mean by trust then?
> >Well... In real world "trust" isn't binary option. You don't want to
> >allow all domains to access TEE.
> >Breached TEE user domain doesn't
> >automatically mean that your whole system is compromised. But this
> >certainly increases attack surface. So it is safer to give TEE access
> >only to those domains, which really require it. You can call them
> >slightly more trusted than others.
>
> Do you have an example of guest you would slightly trust more?

I have an example of a guest I would trust less: if I'm running a server and
selling virtual machines on that server, I don't want them to access the
TEE. I will trust my own guest slightly more.

> >
> >>>Even now, XEN processes requests from DomUs without
> >>>trusting them. Why do you think, that TEE mediator usage will differ?
> >>
> >>I guess you are comparing with vGIC and PL011? IMHO, the main difference is
> >>Xen is taking care alone of the isolation between guest. Here in the TEE
> >>case, you rely on a combination of both TEE and Xen to do the isolation.
> >Yes. This is will be less secure, than TEE-only or hypervisor-only system.
>
> Can you expand here?

If the TEE has one security flaw and the hypervisor has one security flaw,
then you have two security flaws in your system. And either of them can
compromise the whole system.

> >
> >>>
> >>>Look, I generally not against idea of TEE mediator in stubdoms. But this
> >>>approach require many changes in existing XEN code:
> >>>
> >>>1. Load domains before Dom0.
> >>>
> >>>2. Add special API for mediator. Or alter existing ones. You can't use
> >>>   existing APIs as it, because you need to enforce stricter XSM rules
> >>>   on them.
> >>
> >>Mind giving more explanation....? Xen has a default policy for XSM and
> >>indeed may not fit your use case. But you can write your own policy and load
> >>it.
> >Yes. You need policy "allow this stubdom to map memory only from TEE-enabled
> >guests". AFAIK, this is not possible right now. But I can be wrong, I'm
> >not very familiar with XSM.
>
> I believe XSM could do that.
> IIRC, you can "label" your domain and use that
> to say "stubdom is allowed to access memory with domain using the given
> label".

Aha. This is good news. Thanks. Looks like I need to dig deeper into XSM...

> >
> >>>
> >>>3. Changes in scheduling to allow TEE mediator use credits/slices of
> >>>   calling guest.
> >>>
> >>>4. Support boilerplate code in stubdom. You know, you can't simply
> >>>   write mediator in stubdom. You need a kernel. You need to
> >>>   maintain it.
> >>
> >>Well, in a way or another someone will have to maintain the mediator... The
> >>kernel does not need to be specific to TEE, it could be a unikernel.
> >Right. But for me XEN looks better maintained "kernel" :)
> >IMHO, XEN is mature, there are less bugs (especially security ones)
> >than in any other kernel.
> >
> >>And before you say again no-one in the community seem to be interested. I
> >>should remind you that Arm is working on it (see development update).
> >You are talking about that "unicore" project by NEC guys? Sorry,
> >can't find mentioned development update. Looks like search on markmail
> >is down (or I'm doing something terribly wrong).
>
> Sorry, I meant Mini-OS. I don't know any work on "unicore" for Arm64 for
> now.

Ah, good to hear. So there will be an active maintainer for ARM64 Mini-OS?
Sorry, I still can't find that "development update".

> >
> >>>
> >>>This is a lot of a work. It requires changes in generic parts of XEN.
> >>>I fear it will be very hard to upstream such changes, because no one
> >>>sees an immediate value in them. How do you think, what are my chances
> >>>to upstream this?
> >>
> >>It is fairly annoying to see you justifying back most of this thread with
> >>"no one sees an immediate value in them".
> >>
> >>I am not the only maintainers in Xen, so effectively can't promise whether
> >>it is going to be upstreamed. But I believe the community has been very
> >>supportive so far, a lot of discussions happened (see [2]) because of the
> >>OP-TEE support.
> >>So what more do you expect from us?
> >I'm sorry, I didn't mean to offend you or someone else. You, guys, can
> >be harsh sometimes, but I really appreciate help provided by the
> >community. And I, certainly, don't ask you about any guarantees or
> >something of that sort.
> >
> >I'm just bothered by amount of required work and by upstreaming
> >process. But this is not a strong argument against mediators in
> >stubdoms, I think :)
> >
> >Currently I'm developing virtualization support in OP-TEE, so in
> >meantime we'll have much time to discuss mediators and stubdomain
> >approach (if you have time). To test this feature in OP-TEE I'm
> >extending this RFC, making optee.c to look like full-scale mediator.
> >I need to do this anyways, to test OP-TEE. When I'll finish, I can
> >show you how mediator can look like. Maybe this will persuade you to
> >one or another approach.
>
> I think this would be useful. Can you also keep both Stefano (I assume he
> wants too) and I in the loop for the OP-TEE virtualization side?

Okay. I'm planning to produce the first RFC for the OP-TEE folks in a few
days. I'll subscribe you. In the meantime you can check out [2].

[1] http://markmail.org/message/tdbg5mgxjvsoj2ph
[2] https://github.com/OP-TEE/optee_os/issues/1890

--
WBR, Volodymyr Babchuk

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
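
The two filtering ideas discussed in this thread (the mediator masks out
capability flags it does not know, and forwards only whitelisted requests)
could look roughly like the C sketch below. This is only an illustration
under assumptions: every name and numeric value here (the EX_ prefix, the
capability bits, the function IDs) is hypothetical and is not taken from
the OP-TEE ABI or from the RFC patches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits; real values would come from the TEE's ABI. */
#define EX_CAP_RESERVED_SHM  (UINT32_C(1) << 0)
#define EX_CAP_DYNAMIC_SHM   (UINT32_C(1) << 1)

/* Capabilities this (hypothetical) mediator knows how to handle. */
#define EX_KNOWN_CAPS (EX_CAP_RESERVED_SHM | EX_CAP_DYNAMIC_SHM)

/* Hypothetical function IDs the mediator is prepared to forward. */
static const uint32_t ex_allowed_funcs[] = { 0x0001, 0x0002, 0x0004 };

/*
 * Drop every capability bit the mediator does not recognise, so the
 * guest only ever sees the subset of features the mediator supports.
 */
uint32_t ex_filter_caps(uint32_t tee_caps)
{
    return tee_caps & EX_KNOWN_CAPS;
}

/* Return true only for whitelisted function IDs; all others are rejected. */
bool ex_func_allowed(uint32_t func_id)
{
    size_t n = sizeof(ex_allowed_funcs) / sizeof(ex_allowed_funcs[0]);

    for (size_t i = 0; i < n; i++) {
        if (ex_allowed_funcs[i] == func_id)
            return true;
    }
    return false;
}
```

With this shape, a newer TEE that advertises an extra capability bit simply
has that bit cleared before the guest sees it, and a request with an unknown
function ID is refused instead of being forwarded.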
