Re: [Xen-devel] [RFC 0/4] TEE mediator framework + OP-TEE mediator

To: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx>
From: Julien Grall <julien.grall@xxxxxxxxxx>
Date: Thu, 2 Nov 2017 17:49:12 +0000
Cc: Julien Grall <julien.grall@xxxxxxx>, nd@xxxxxxx, Stefano Stabellini <sstabellini@xxxxxxxxxx>, jens.wiklander@xxxxxxxxxx, xen-devel@xxxxxxxxxxxxx
Delivery-date: Thu, 02 Nov 2017 17:49:31 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

Hi Volodymyr,

On 02/11/17 16:53, Volodymyr Babchuk wrote:

On Thu, Nov 02, 2017 at 01:17:26PM +0000, Julien Grall wrote:

On 24/10/17 20:02, Volodymyr Babchuk wrote:

If it is not safe, this means you have a whitelist solution and therefore
tie Xen to a specific OP-TEE version. So if you need to use a new function,
you would need to upgrade Xen, making the cost of using a new version
potentially high.

Yes, any ABI change between OP-TEE and its clients will require a mediator
upgrade. Luckily, OP-TEE keeps its ABI backward-compatible, so if you
install an old Xen and a new OP-TEE, OP-TEE will use only the subset of
the ABI that is known to Xen.
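
As a rough illustration of the whitelist behaviour discussed above (this is
not code from the series), a mediator could forward only the calls it knows
how to check and reject everything else. The function IDs, the NOT_SUPPORTED
value and the mediate() helper are all made up for this sketch:

```python
# Hypothetical function IDs. Real OP-TEE SMC function IDs are not
# reproduced here.
KNOWN_FUNCS = {
    0x01: "open_session",
    0x02: "invoke_command",
    0x03: "close_session",
}

NOT_SUPPORTED = -1  # hypothetical error value returned to the guest


def mediate(func_id, forward):
    """Forward a call to the TEE only if the mediator knows the function."""
    if func_id not in KNOWN_FUNCS:
        # Unknown to this mediator version: never reaches the TEE.
        return NOT_SUPPORTED
    return forward(func_id)
```

A function added in a newer TEE release would be rejected until the
mediator learns about it, which is the upgrade cost mentioned above.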

Also, correct me if I am wrong, OP-TEE is BSD 2-Clause licensed. This means
anyone who wants to modify OP-TEE for their own purposes can make a
closed version of the TEE. But if you need to introspect/whitelist calls, you
force the vendor to expose their API.

Basically yes. Is this bad? The OP-TEE driver in Linux is licensed under GPL v2.
If a vendor modifies the interface between OP-TEE and Linux, they are obliged
to expose the API anyway.


Pardon me for potential stupid questions, my knowledge of OP-TEE is limited.

My understanding is that OP-TEE will provide a generic way to access
different Trusted Applications. While the OP-TEE API may be generic, the TA API
is custom. AFAICT the latter is not part of the Linux driver.

Yes, you are perfectly right there.

So here my questions:
        1) Are you planning to allow all the guests to access every Trusted
Application?

This is a good question. There are two types of TAs supported in
OP-TEE: real TAs (as they are described in GlobalPlatform specs) and
PseudoTAs.  The latter ones are statically linked right into OP-TEE
kernel and execute at S-EL1 level.
Real TAs are provided by the client. That means that the NW userspace
supplicant loads the TA into OP-TEE. OP-TEE checks the TA's signature
and then runs it in S-EL0.
So, I'm planning to allow a client to work with any real TA. I can't see
a real problem there.


Are the real TAs going to be shared between guests? Or will each guest have
their own one?

No, we don't plan this. At least not at this moment. Every guest will have
its own instance of a TA.

Will you allow every guest to load real TAs?

Yes, if a guest has access to the TEE, it can load a TA. Otherwise there is
no point in using the TEE. The OP-TEE core itself does not provide useful
services to clients.


In a previous e-mail you mentioned OP-TEE has limited memory. How will you
ensure that guest A will not use all the memory of OP-TEE and prevent guest
B from loading TAs?

There is no way to do this right now. Even on a bare-metal system, one client
can load a huge TA or eat up memory in another way to prevent other clients
from using OP-TEE. This is a known limitation. It can be mitigated by
enforcing quotas.

Yes, but those clients only serve one OS. Here you would serve multiple
OSes, clients from OS A could eat up the memory and prevent a client
from OS B from running.

This could be a serious issue depending on how important the clients for
the OS are.


So likely enforcing quotas will be needed.
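
To make the quota idea concrete, here is a minimal sketch of per-domain
accounting. It assumes the mediator (or OP-TEE itself) can attribute each
allocation to a domain ID. The TeeMemoryQuota class, its limit and the byte
counts are hypothetical; nothing like this exists in the series as posted:

```python
class QuotaExceeded(Exception):
    """Raised when a domain would go over its share of TEE memory."""


class TeeMemoryQuota:
    def __init__(self, per_domain_limit):
        self.limit = per_domain_limit  # bytes each domain may hold
        self.used = {}                 # domain id -> bytes currently charged

    def charge(self, domid, nbytes):
        """Account for an allocation, refusing it if over the limit."""
        current = self.used.get(domid, 0)
        if current + nbytes > self.limit:
            raise QuotaExceeded(f"domain {domid} is over its TEE quota")
        self.used[domid] = current + nbytes

    def release(self, domid, nbytes):
        """Give memory back, e.g. when a TA is unloaded."""
        self.used[domid] = max(0, self.used.get(domid, 0) - nbytes)
```

With a fixed per-domain limit, guest A exhausting its own share would no
longer stop guest B from loading TAs, as long as the limits sum to no more
than what OP-TEE actually has.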

[...]

Not really, the domain could block when issuing an SMC until the
mediator is up and running.

Do you mean, that if domain tries to execute SMC, and mediator is not
ready, then hypervisor should pause all domain's vCPUs? That can be
destructive for hw domain.


Xen is free to unschedule any vCPU at any time. So why would it be
destructive?

Suppose that mediator domain needs 0.5s to boot up and be ready to
serve calls. For half of a second HW domain will be blocked. I don't
like the idea, that it will not be able to serve IRQs and other
requests. IMHO, it is okay for DomU, but not for Dom0.

And yes, it seems obvious, but I want to say this explicitly: generic
TEE mediator framework should and will use XSM to control which domain
can work with TEE. So, if you don't trust your guest - don't let it
to call TEE at all.
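
A minimal sketch of that kind of check, with a default-deny policy table
standing in for XSM. This is not the XSM API; TEE_POLICY, tee_call_allowed()
and handle_smc() are names invented for the example:

```python
# Hypothetical policy table: domain id -> may this domain issue TEE calls?
TEE_POLICY = {0: True, 1: True, 2: False}


def tee_call_allowed(domid, policy=TEE_POLICY):
    # Default deny: a domain missing from the policy gets no TEE access.
    return policy.get(domid, False)


def handle_smc(domid, forward):
    """Run the policy check before anything is forwarded to the TEE."""
    if not tee_call_allowed(domid):
        return "denied"
    return forward()
```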


Correct me if I am wrong. The TEE could be used by an Android guest which
likely runs user apps... right? So are you saying you fully trust that guest
and obviously the user installing a rogue app?

I don't think that an app downloaded from the Play Market can access OP-TEE
directly. OP-TEE can be used by Android itself as a key storage or to access
an SE, for example. But a 3rd-party app that issues TEE calls... I don't
think so.


You didn't get my point here. That rogue app may be able to break into
kernel via an exploit or have enough privilege to break the guest. Who knows
what it will be able to do after...

Only what hypervisor and TEE will allow it to do. Look, OP-TEE was not designed
to rule the machine. There is ARM TF for that :) OP-TEE's task is to provide
some safer environment for sensitive data and code. This environment has
well-defined interfaces and is designed to be as safe as possible.

If a rogue app breaks into the kernel, then it can issue any SMC it wants.
But OP-TEE does not trust the NW. The hypervisor does not trust guests.
The mediator should be written in the same way.

So, what can a rogue kernel do? As far as I know, it can cause a DoS in
OP-TEE. This is a known issue. If there is a security bug in OP-TEE, it can
probably take over the whole system. But this is true for any system running
OP-TEE.


I agree that if you take over OP-TEE, you will take over any system. This is
not specific to hypervisor.

Yes. But it just occurred to me that mediator+OP-TEE *can* be more
secure than just OP-TEE. You see, the mediator should perform its own
security checks before forwarding a call to OP-TEE. So if OP-TEE misses
something, the mediator can back it up. I wouldn't rely on this. It is just
an interesting thought :-)

Baremetal OS taking down the platform will only harm itself. A guest OS
could harm the whole platform.

Can't argue with that. I think that this feature (shared TEE) is
not suitable for, say, VPSes. But it can work just fine on smartphones
or on other embedded devices, where the vendor defines the whole system.


I guess your use case is "vendor defines whole system". But I am struggling
to understand how this would be more suitable there.

Excuse me... "There" - it is where exactly?


"vendor defines whole system".

That guest OS may be "controlled" by the user. So how is that safer?

Can you please define what is "safe" and "unsafe" in this context?

Let's take a look at the whole picture. I can see the following attacks:

1) DoS attack. One domain spends all OP-TEE resources, other domains
    can't work with it. As I said earlier, this is a known limitation.

2) Mediator crash. Sort of DoS, if mediator can't restart properly.

3) OP-TEE crash. This crashes whole system.

4) Virtualization breach. Attacker gains control over mediator ->
    control over all TEE-enabled guests.

5) Virtualization breach. Attacker gains control over hypervisor ->
    control over all guests.

6) Virtualization breach. Attacker gains control over OP-TEE ->
    control over whole system, including firmware.

Now it would be great to give you a likelihood for every attack type. But,
obviously, I have no such numbers. I can only speculate about this.

Returning to your question... To what extent guest OS can be controlled
by user? Can user execute arbitrary code at EL1 for example? Or it can
install only apps prebuilt by system vendor?

What bad things will happen if user will compromise the whole system?

Which guests will also run on the same system? Which subset of them
will access OP-TEE?

If you can answer these questions, I can tell you whether it is safe
to use OP-TEE + virtualization on your system.

I don't make any end product and I have no idea what kind of guests
would be run on top.

So if I had to answer those questions, I would consider all the
guests potentially nasty and therefore make sure the attack surface is
limited and understood.

You really have to ask yourself what kinds of guests you will run on that
platform and assess the risk.

If you tell me you are going to run something safety critical in one VM and
Android in another, then I would be looking at limiting the attack
surface of the Android guest.

If you tell me that the user will only be able to install apps pre-built
by the system vendor, then I will have some trouble believing it is secure
given how complex an operating system is.

And I am not even mentioning that allowing the user to install apps
pre-built by the system vendor likely means having network/bluetooth access.
I am sure you have seen recent vulnerabilities...


For some "generic" system I can say that it is pretty "safe" (except
that problem with OP-TEE resources).

What I am not sure about yet, maybe because of my lack of knowledge around
OP-TEE, is who is going to prevent a TA from accessing all the NS memory?

TAs run in S-EL0. A TA can't control the MMU. Before every TA
invocation, OP-TEE sets up the MMU in such a way that the TA sees only
the shared memory arguments passed by the client for this particular
invocation.


Can you give a bit more details here? Particularly what is the life of that
mapped region? Is it just for a command? If not, who is going to unmap it
and when?

Yes, this map is created for every call. TA code and data are mapped always,
obviously.


Where does the TA code and data live? Is it in secure or non-secure memory?

But parameters are mapped every call and only needed ones.
Example: I have shared buffers A, B, C, D.

1) I call OpenSession(TA_UUID, A, B).
    TA sees only buffers A, B (okay, actually it sees whole page, because
    buffer is mapped from userspace).

2) I call InvokeCommand(Session, CMD_ID, B, C).
    TA sees only buffers B & C.

3) I call InvokeCommand(Session, CMD_ID, A, D).
    TA sees only buffers A & D.

Note that such buffers are not mapped into the OP-TEE address space at all.
They will be mapped only into the TA address space.
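
The same sequence as a toy model, only to show the visibility rule. The
TaAddressSpace class is invented for the illustration and says nothing about
how OP-TEE programs the MMU. In particular, dropping the mapping as soon as
the call returns is an assumption of this sketch:

```python
class TaAddressSpace:
    """Toy model of which shared buffers a TA can see during a call."""

    def __init__(self):
        self.mapped = set()

    def call(self, name, *buffers):
        # Map only the buffers passed as parameters of this call.
        self.mapped = set(buffers)
        visible = sorted(self.mapped)  # what the TA sees while it runs
        # Assumption: the mapping is dropped once the call returns.
        self.mapped = set()
        return name, visible


ta = TaAddressSpace()
print(ta.call("OpenSession", "A", "B"))
print(ta.call("InvokeCommand", "B", "C"))
print(ta.call("InvokeCommand", "A", "D"))
```

Buffers that are not parameters of the current call (C and D during
OpenSession, for instance) are never in the mapped set.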

To confirm, what you are saying is that as soon as any call is returned by
the TA, the region will be unmapped from the TA address space?


[...]

To be clear, this series doesn't look controversial, at least for OP-TEE.
What I am more concerned about is DomU support.

Your concern is that rogue DomU can compromise whole system, right?


Yes. You seem to assume that a DomU using the TEE will always be trusted. I
think this is the wrong approach if the user is able to interact directly
with those guests. See above.

No, I am not assuming that DomU that calls TEE should be trusted. Why do you
think so? It should be able to use TEE services, but this does not mean that
XEN should trust it.


In a previous answer you said: "So, if you don't trust your guest - don't
let it". For me, this clearly means you consider that DomU using TEE are
trusted.

So can you clarify what you mean by trust then?

Well... In the real world "trust" isn't a binary option. You don't want to
allow all domains to access the TEE. A breached TEE user domain doesn't
automatically mean that your whole system is compromised. But it
certainly increases the attack surface. So it is safer to give TEE access
only to those domains which really require it. You can call them
slightly more trusted than others.


Do you have an example of guest you would slightly trust more?

I have an example of a guest I would trust less: if I'm running a server,
and I'm selling virtual machines on that server, I don't want them
to access the TEE.


Makes sense.


I would trust my own guest slightly more.


I kind of agree if there is either no interaction with the user or the user
is not able to gain privileged permissions.

Okay, if the user can execute arbitrary code at EL1... Even then nothing bad
will happen. They must be able to hack the mediator/hypervisor/OP-TEE to
really gain privileges in the system.

My worry here is that you base the trust on OP-TEE and not only on the
hypervisor. At the moment we have to trust the hardware to do the right
thing, and the software is owned by Xen.

Now you are telling me we have this TEE running in EL3 and have to
trust it to do the isolation between guests. Until the last 2 e-mails,
it was not clear to me how OP-TEE could ensure this isolation.

I would advise explaining the design of OP-TEE a bit more in the cover
letter of your next version. This would help people to see how this can
work with the hypervisor and also to understand the consequences...


Cheers,

--
Julien Grall
