
Re: [Xen-devel] [RFD] OP-TEE (and probably other TEEs) support



On 28 November 2016 at 22:10, Julien Grall <julien.grall@xxxxxxx> wrote:
>
>
> On 28/11/16 18:09, Volodymyr Babchuk wrote:
>>
>> Hello,
>
>
> Hello Volodymyr,
>
>> On 28 November 2016 at 18:14, Julien Grall <julien.grall@xxxxxxx> wrote:
>>>
>>> On 24/11/16 21:10, Volodymyr Babchuk wrote:
>>>>
>>>> My name is Volodymyr Babchuk, I work at EPAM Systems with a bunch
>>>> of other people, such as Artem Mygaiev and Andrii Anisov. My
>>>> responsibility there is security in embedded systems.
>>>>
>>>> I would like to discuss approaches to OP-TEE support in Xen.
>>>
>>>
>>>
>>> Thank you for sharing this, I am CC-ing some people who showed interest
>>> in accessing trusted firmware from the guest.
>>>
>>> In the future, please try to CC the relevant people (in this case the
>>> ARM maintainers) to avoid any delay in getting an answer.
>>
>> Thanks. I have never worked with the Xen community before, so I don't
>> know who is who :)
>
>
> You can take a look at the MAINTAINERS file at the root of xen.git.
>
> [...]
>
>>>> You can find the patches at [1] if you are interested.
>>>> While working on this PoC I identified the main questions that
>>>> should be answered:
>>>>
>>>> On the Xen side:
>>>> 1. SMC handling in Xen. There are many different SMCs and only a
>>>> portion of them belongs to the TEE. We need an SMC dispatcher that
>>>> will route calls to the different subsystems: PSCI calls to the PSCI
>>>> subsystem, TEE calls to the TEE subsystem.
>>>
>>>
>>>
>>> So from my understanding of this paragraph, all TEE SMC calls should
>>> have a guest ID in the command. We don't expect commands affecting the
>>> TEE as a whole. Correct?
>>
>> Yes. The idea is to trap the SMC, alter it by adding a guest ID (into
>> r7, as the SMCCC says), and then do the real SMC to pass it to the TEE.
>>
>> But I didn't get this: "We don't expect commands affecting the TEE as a
>> whole." What did you mean?
>
>
> I mean, is there any command that will affect the trusted OS as a whole
> (e.g. reset it, or similar) and not only the session for a given guest?
Yes, there are such commands. For example, there is a command that
enables/disables caching for shared memory (we should disable this
caching, by the way). The SMC handler should manage commands like this.
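
To make item 1 a bit more concrete, here is a rough sketch of such a
dispatcher. The function names are illustrative, but the OEN split and
the client ID field come from the SMCCC (ARM DEN 0028): bits [29:24] of
the function ID hold the Owning Entity Number, where 4 is the Standard
Secure Service range (PSCI) and 50-63 are Trusted OS calls.

    /* Illustrative SMC router, called from the hypervisor trap handler. */
    #define SMC_OEN(fid)     (((fid) >> 24) & 0x3f)
    #define OEN_STD_SERVICE  4   /* Standard Secure Service calls: PSCI */
    #define OEN_TOS_FIRST    50  /* Trusted OS calls: 50..63 */
    #define OEN_TOS_LAST     63

    static void smc_dispatch(struct cpu_user_regs *regs)
    {
        uint32_t oen = SMC_OEN(regs->r0);

        if ( oen == OEN_STD_SERVICE )
            do_psci_call(regs);  /* route to the existing PSCI subsystem */
        else if ( oen >= OEN_TOS_FIRST && oen <= OEN_TOS_LAST )
        {
            /* Tag the call with the calling guest before the real SMC:
             * the SMCCC reserves r7 bits [15:0] for a client ID. */
            regs->r7 = (regs->r7 & ~0xffffU) |
                       (current->domain->domain_id & 0xffffU);
            tee_handle_call(current->domain, regs); /* TEE backend hook */
        }
        else
            regs->r0 = 0xffffffffU; /* SMCCC "Unknown Function Identifier" */
    }
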

>
>>
>>>
>>>>
>>>> 2. Support for different TEEs. There are OP-TEE, Google Trusty, TI
>>>> M-Shield... They all work through SMCs, but have different protocols.
>>>> Currently we are aiming only at OP-TEE, but we need some generic API
>>>> in Xen so that support for a new TEE can be added easily.
>>>
>>>
>>> For instance you
>>
>> Hm?
>>>
>>> Is there any generic way to detect which TEE is being used, and its
>>> version?
>>
>> Yes. According to the SMCCC, there is call number 0xBF00FF01 that
>> should return the Trusted OS UID.
>> OP-TEE supports this call, and I hope other TEEs also support it. In
>> this way we can determine which Trusted OS is running on the host.
>
>
> Looking at the SMCCC, this SMC call seems to be mandatory.
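
Right. For illustration, the probe could look roughly like this
(call_smc() and struct smc_result are made-up helpers; the UID values
are the OP-TEE API UID from optee_msg.h):

    /* SMCCC "Call UID" query for the Trusted OS range; the UID is
     * returned in r0-r3. */
    #define TRUSTED_OS_CALL_UID 0xbf00ff01

    static bool running_on_optee(void)
    {
        struct smc_result res = call_smc(TRUSTED_OS_CALL_UID, 0, 0, 0);

        /* OP-TEE API UID: 384fb3e0-e7f8-11e3-af63-0002a5d5c51b */
        return res.r0 == 0x384fb3e0 && res.r1 == 0xe7f811e3 &&
               res.r2 == 0xaf630002 && res.r3 == 0xa5d5c51b;
    }
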
>>
>>>>
>>>> 3. TEE services. The hypervisor should inform the TEE when a new
>>>> guest is created or destroyed, and it should tag SMCs to the TEE with
>>>> a guest ID, so the TEE can isolate guest data on its side.
>>>>
>>>> 4. SMC mangling. The RichOS communicates with the TEE using shared
>>>> buffers, by providing physical memory addresses. The hypervisor
>>>> should convert IPAs to PAs.
>>>
>>>
>>>
>>> I am actually concerned about this bit. From my understanding, the
>>> hypervisor would need some knowledge of the SMC.
>>
>> Yes, that was my first idea - a separate subsystem in the hypervisor
>> that handles SMC calls for the different TEEs. This subsystem has a
>> number of backends, one for each TEE.
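
For item 4, each such backend would rewrite the address-carrying
arguments of a call before forwarding it. A rough sketch, where
guest_ipa_to_pa() stands in for a stage-2 (p2m) lookup:

    /* Translate one guest-provided buffer address to a real PA. A real
     * implementation must also handle buffers spanning several, possibly
     * non-contiguous, pages. */
    static int translate_shm_arg(struct domain *d, uint64_t *arg)
    {
        paddr_t pa;

        if ( guest_ipa_to_pa(d, (paddr_t)*arg, &pa) )
            return -EINVAL; /* unmapped IPA: reject the whole call */

        *arg = pa;          /* the TEE sees the real physical address */
        return 0;
    }
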
>>
>>> So are the OP-TEE SMC calls fully standardized? By that I mean, will
>>> they not change across versions?
>>
>> No, they are not standardized and they can change in the future.
>> OP-TEE tries to be backward-compatible, though. So the hypervisor can
>> drop unknown capability flags in the GET_CAPABILITIES SMC call. In this
>> way it can ensure that a guest will use only the APIs that are known to
>> the hypervisor.
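
To show what I mean by dropping unknown flags, roughly (assuming the
capability word comes back in r1, as it does in OP-TEE; the flag names
are placeholders):

    /* Keep only the capability bits this hypervisor knows how to
     * mediate, so a guest cannot negotiate features behind our back. */
    #define KNOWN_SEC_CAPS (SEC_CAP_A | SEC_CAP_B) /* placeholder names */

    static void filter_tee_caps(struct cpu_user_regs *regs)
    {
        regs->r1 &= KNOWN_SEC_CAPS; /* drop anything we don't know */
    }
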
>>
>>> How about other TEE?
>>
>> I can't say for sure, but I think the situation is the same as with OP-TEE.
>
>
> By any chance, is there a TEE specification out somewhere?
Yes. There are the GlobalPlatform API specs; you can find them at [3].
You will probably be interested in "TEE System Architecture v1.0".

>
>>
>>> If not, then it might be worth considering a 3rd solution where the
>>> TEE SMC calls are forwarded to a specific domain handling the SMC on
>>> behalf of the guests. This would allow upgrading the TEE layer without
>>> having to upgrade the hypervisor.
>>
>> Yes, this is a good idea. How would this look? I imagine the following
>> flow: the hypervisor traps the SMC and uses an event channel to pass
>> the request to Dom0. Some userspace daemon receives it, maps the pages
>> with the request data, alters them (e.g. by replacing IPAs with PAs),
>> asks the hypervisor to issue the real SMC, then alters the response and
>> only then returns the data back to the guest.
>
>
> The event channel is only a way to notify (similar to an interrupt); you
> would need a shared memory page between the hypervisor and the client to
> communicate the SMC information.
>
> I was thinking of taking advantage of the VM event API for trapping the
> SMC, but I am not sure it is the best solution here. Stefano, do you
> have any opinions here?
>
>>
>> Is this even possible with the current APIs available to Dom0?
>
>
> It is always possible to extend the API if something is missing :).
Yes. On the other hand, I don't like the idea that some domain can map
any memory page of another domain to play with SMC calls. We can't use
grant references there, so the service domain would have to be able to
map any memory page it wants. This is insecure.
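
For completeness, though, the shared page Julien mentions could be as
simple as this (made-up names):

    /* One slot per pending call in a page shared between the hypervisor
     * and the service domain; the event channel only signals "look at
     * the page". */
    struct smc_request {
        uint16_t domid;   /* calling guest */
        uint16_t state;   /* FREE / PENDING / DONE */
        uint64_t regs[8]; /* r0-r7 as trapped; overwritten with results */
    };
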

>>
>> I can see only one benefit there - this code will not live in the
>> hypervisor. And there are a number of drawbacks:
>>
>> Stability: if this userspace daemon crashes or gets killed by, say, the
>> OOM killer, we will lose information about all opened sessions, mapped
>> shared buffers, etc. That would be a complete disaster.
>
>
> I disagree with your statement; you would gain in isolation. If your
> userspace daemon crashes (because of an emulation bug), you only lose
> access to the TEE for a bit. If the hypervisor crashes (because of an
> emulation bug), then you take down the whole platform. I agree that you
> lose information when the userspace app crashes, but your platform is
> still up. Isn't that the most important thing?
This is arguable and depends on what you consider more valuable: system
security or system stability. I am standing on the security side.

>
> Note that I think it would be "fairly easy" to implement code to reset
> everything or to keep a backup on the side.
Actually, I can imagine how the ARM Trusted Firmware could restart the
SPD (the TEE). But, again, this could compromise security.

By the way, is there any, let's say, official ARM position on how a
hypervisor should interact with the Secure World? Maybe you can ask
someone who designed TrustZone?

Also, Donlgi in a parallel thread rightly mentioned the TCB - the
Trusted Computing Base, which is the set of components that are critical
for security. It is mandatory to keep the TCB as minimal as possible,
and introducing a handler in a domain is not the way to minimize the TCB
footprint.

>> Performance: how big will the latency introduced by switching between
>> the hypervisor, Dom0 SVC and USR modes be? I have seen a use case where
>> the TEE is part of a video playback pipeline (it decodes DRM media).
>> There can also be questions about security, but Dom0 can in any case
>> access any memory of any guest.
>
>
> But those concerns would be the same in the hypervisor, right? If your
> emulation is buggy, then a guest would get access to all the memory.
Yes, but I hope that it is harder to compromise the hypervisor than to
compromise a guest domain.

>> But I really like the idea, because I don't want to mess with the
>> hypervisor when I don't need to. So, what do you think: how will it
>> affect performance?
>
>
> I can't tell here. I would recommend you to try a quick prototype (e.g.
> receiving and sending SMCs) and see what the overhead would be.
>
> When I wrote my previous e-mail, I mentioned a "specific domain" because
> I don't think it is strictly necessary to forward the SMC to Dom0. If
> you are concerned about overloading Dom0, you could have a separate
> service domain that would handle the TEE for you. You could have your
> "custom OS" handle TEE requests directly in kernel space (i.e. SVC).
Hmmm. I have heard something about unikernel domains. Is this what you
want to propose?

> This would be up to the developer of this TEE-layer to decide what to do.
>
>
>>
>>>
>>>> Currently I'm rewriting parts of OP-TEE to make it support arbitrary
>>>> buffers originating from the RichOS.
>>>>
>>>> 5. Events from the TEE. This is a hard topic. Sometimes OP-TEE needs
>>>> some services from the RichOS. For example, it wants Linux to service
>>>> a pending IRQ request, or allocate a portion of shared memory, or
>>>> lock the calling thread, etc. This is called an "RPC request". To do
>>>> an RPC request, OP-TEE initiates a return to the Normal World, but it
>>>> sets a special return code to indicate that Linux should do some job
>>>> for OP-TEE. When Linux finishes the work, it initiates another SMC
>>>> with a code like "I have finished the RPC request" and OP-TEE resumes
>>>> its execution.
>>>> OP-TEE mutexes create a problem there. We don't want to sleep in the
>>>> secure state, so when an OP-TEE thread gets blocked on a mutex, it
>>>> issues an RPC request that asks the calling thread to wait on a wait
>>>> queue. When the mutex owner unlocks it, that other thread also issues
>>>> an RPC to wake up the first thread.
>>>> This works perfectly when there is one OS (or one guest). But when
>>>> there are many, it is possible that a request from one guest blocks
>>>> another guest. That other guest will wait on the wait queue, but
>>>> there will be no one who can wake it up. So we need another mechanism
>>>> to wake up sleeping threads. The obvious candidate is an IPI. There
>>>> are 8 non-secure IPIs, which are all used by the Linux kernel. But
>>>> there are also 8 secure IPIs. I think OP-TEE can use one of those to
>>>> deliver events to the Normal World. But this will require changes to
>>>> OP-TEE, Xen and the Linux kernel.
>>>
>>>
>>>
>>> Before giving a suggestion here, I would like to confirm that by IPI
>>> you mean SGI, right?
>>
>> Yes. The kernel calls them IPIs, so I did too.
>> If I remember the ARM GIC TRM correctly, it recommends giving 8 SGIs to
>> the Normal World and 8 to the Secure World.
>
>
> You said "ARM GIC TRM", so I guess you are speaking about a specific GIC
> implementation. I looked at the GIC specific (ARM IHI 0048B.b) and can't
> find a such suggestion.

Page 2-25:

In any system that implements the ARM Security Extensions, to support
a consistent model for message passing between processors, ARM
strongly recommends that all processors reserve:
* ID0-ID7 for Non-secure interrupts
* ID8-ID15 for Secure interrupts.

>> The hypervisor can use one of the secure SGIs to deliver events to
>> guests. But actually this feature will be needed only in a virtualized
>> environment, so I think we can use Xen events there.
>> Jens, can you please comment on this? I can't imagine a use case where
>> OP-TEE needs to send an SGI to the Normal World when there is no
>> virtualization. But maybe I'm missing something?
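
To make the RPC flow from item 5 easier to follow: seen from the Normal
World, a standard call is essentially this loop (names are approximate,
not the exact OP-TEE ones):

    /* Keep re-entering OP-TEE until it returns a final result instead of
     * an RPC request; this is roughly what the Linux OP-TEE driver does. */
    res = smc_call(&args);         /* initial call into the TEE */
    while ( return_is_rpc(res) )
    {
        handle_rpc(&args);         /* service IRQ, alloc SHM, wait, ... */
        args.a0 = RETURN_FROM_RPC; /* "I have finished the RPC request" */
        res = smc_call(&args);     /* resume the secure thread */
    }
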
>>
>>>
>>>>
>>>> 6. Some mechanism to control which guests can work with the TEE. At
>>>> this time I have no idea how this should work.
>>>
>>>
>>>
>>> Probably a toolstack option "TEE enabled".
>>
>> So there will be some additional flag in guest descriptor structure?
>
>
> That's the idea.
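
Something like this in the SMC path would then enforce it (tee_enabled
is a hypothetical per-domain field that the toolstack would set at
domain creation):

    /* Refuse to forward TEE calls for guests created without the flag. */
    if ( !current->domain->arch.tee_enabled )
    {
        regs->r0 = 0xffffffffU; /* SMCCC "Unknown Function Identifier" */
        return;
    }
    tee_handle_call(current->domain, regs);
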
>
> Regards,
>
> --
> Julien Grall

[3] http://www.globalplatform.org/specificationsdevice.asp

-- 
WBR Volodymyr Babchuk aka lorc [+380976646013]
mailto: vlad.babchuk@xxxxxxxxx

