Xen project Mailing List

Re: [PING] Re: [PATCH] xen/arm: optee: Allocate anonymous domheap pages

To: Stefano Stabellini <sstabellini@xxxxxxxxxx>

From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>

Date: Thu, 7 Oct 2021 17:57:08 +0000

Accept-language: en-US

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=K8EISo5WwoNX9rPB/VLTqw0ZVyskXDLqo42UTwgybnk=; b=CIpApnB0pVwVwm71P69E2KSdvMjnpAeo4zDU3XhDHrJUM+Ds55fotKOtPVSoEz5RWBwp89uFdYj/Cr3pprpKoUdPzsA8iU4nwRrxwZalBBCQWVAIWkivy9OCwEN8e74LGPcr+TlxopJmyTXBY/486HrqrbJVcKGKNaH70zMQXZM3lfEbHh1W0R546RiunXiOdorCRWl/5t5Lqldf+KNI4V8iSJ9Q5IugDOk4A4hp1uw4oNoXTa5OwpOpGFgQOEDcs3R3vtdI0m/nk75BDYWeueF4GZpb+mTz3x6mSg/ljkPDoFmDm7LcapvKvCz71g9enyJfhwVZq2QR6zj8CeMPjA==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=KC35jQqNx3a4SgOKr89I5KUtiMvvNBWzyhGfcXipsighzB7j+yKZ2+Fv5jOAWXLjbOCAFMKvOperX/hJSae6lXmhWhwAtyqsRm5XPB8AVaiIxJfzuGSzTQ+yj7T9Y4vTgQEaP3krj+jPuduRlExwsDuSUE9yn3h/gxi3UAZaB2savJ5zsMP3AEo/Epz5ANeHHCfS0gZAvsUeAXIB4u/V6OSnJmXGmA6ecR6v84gLm8pMs5SbXN7bmn6jFySgapP1YtUdmgF3Mzu/HisJjeMSLaDG0/8iHEBAbykGFx1SxOgJ+Q8pZE3tOLCfucftCUBSkzYWUnhtjraJQ6lBRmrdLQ==

Authentication-results: kernel.org; dkim=none (message not signed) header.d=none;kernel.org; dmarc=none action=none header.from=epam.com;

Cc: Oleksandr <olekstysh@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Julien Grall <jgrall@xxxxxxxxxx>

Delivery-date: Thu, 07 Oct 2021 17:57:25 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-index: AQHXoyUNSkY32SPluU6N5DMbFmm656uyJ0MAgAAJFoCAE/PzgIAAsCsAgAESVAA=

Thread-topic: [PING] Re: [PATCH] xen/arm: optee: Allocate anonymous domheap pages

Hi Stefano, Stefano Stabellini <sstabellini@xxxxxxxxxx> writes: > On Wed, 6 Oct 2021, Oleksandr wrote: >> Hello all >> >> Gentle reminder. > > Many thanks for the ping, this patch fell off my radar. > > > >> On 23.09.21 23:57, Volodymyr Babchuk wrote: >> > Hi Stefano, >> > >> > Stefano Stabellini <sstabellini@xxxxxxxxxx> writes: >> > >> > > On Mon, 6 Sep 2021, Oleksandr Tyshchenko wrote: >> > > > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx> >> > > > >> > > > Allocate anonymous domheap pages as there is no strict need to >> > > > account them to a particular domain. >> > > > >> > > > Since XSA-383 "xen/arm: Restrict the amount of memory that dom0less >> > > > domU and dom0 can allocate" the dom0 cannot allocate memory outside >> > > > of the pre-allocated region. This means if we try to allocate >> > > > non-anonymous page to be accounted to dom0 we will get an >> > > > over-allocation issue when assigning that page to the domain. >> > > > The anonymous page, in turn, is not assigned to any domain. >> > > > >> > > > CC: Julien Grall <jgrall@xxxxxxxxxx> >> > > > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx> >> > > > Acked-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx> >> > > Only one question, which is more architectural: given that these pages >> > > are "unlimited", could the guest exploit the interface somehow to force >> > > Xen to allocate an very high number of anonymous pages? >> > > >> > > E.g. could a domain call OPTEE_SMC_RPC_FUNC_ALLOC in a loop to force Xen >> > > to exaust all memory pages? >> > Generally, OP-TEE mediator tracks all resources allocated and imposes >> > limits on them. >> > >> > OPTEE_SMC_RPC_FUNC_ALLOC case is a bit different, because it is issued >> > not by domain, but by OP-TEE itself. As OP-TEE is more trusted piece of >> > system we allow it to request as many buffers as it wants. Also, we know >> > that OP-TEE asks only for one such buffer per every standard call. And >> > number of simultaneous calls is limited by number of OP-TEE threads, >> > which is quite low: typically only two. > > So let me repeat it differently to see if I understood correctly: > > - OPTEE_SMC_RPC_FUNC_ALLOC is only called by OP-TEE, not by the domain > - OPTEE is trusted and only call it twice anyway Correct. > I am OK with this argument, but do we have a check to make sure a domU > cannot issue OPTEE_SMC_RPC_FUNC_ALLOC? domU can't issue any RPC, because all RPCs are issued from OP-TEE side. This is the nature of RPC - OP-TEE requests Normal World for some service. But of course, Normal World can perform certain actions that will make OP-TEE to issue a RPC. I discuss this in depth below. > > Looking at the patch, there are other two places, in addition to > OPTEE_SMC_RPC_FUNC_ALLOC, where the anonymous memory pages can be > allocated: > > 1) copy_std_request > 2) translate_noncontig > > We need to prove that neither 1) or 2) can result in a domU exausting > Xen memory. > > In the case of 1), it looks like the memory is freed before returning to > the DomU, right? If so, it should be no problem? Yes, mediator makes shadow copy of every request buffer to hide translated addresses from the guest. Number of requests is limited by number of OP-TEE threads. > In the case of 2), it looks like the memory could outlive the call where > it is allocated. Is there any kind of protection against issuing > something like OPTEE_MSG_ATTR_TYPE_TMEM_INOUT in a loop? Is it OP-TEE > itself that would refuse the attempt? Thus, the idea is that > do_call_with_arg will return error and we'll just free the memory there? Well, translate_noncontig() calls allocate_optee_shm_buf() which counts all allocated buffers. So you can't call it more than MAX_SHM_BUFFER_COUNT times, without de-allocating previous memory. But, thanks to your question, I have found a bug there: memory is not freed if allocate_optee_shm_buf() fails. I'll prepare patch later today. > I cannot see a check for errors returned by do_call_with_arg and memory > freeing done because of that. Sorry I am not super familiar with the > code, I am just trying to make sure we are not offering to DomUs an easy > way to crash the system. I tried to eliminate all possibilities for a guest to crash the system. Of course, this does not mean that there are none of them. And yes, code is a bit hard to understand, because calls to OP-TEE are stateful and you need to account for that state. From NW and SW this looks quite fine, because state is handled naturally. But mediator sits in a middle, so it's implementation is a bit messy. I'll try to explain what is going on, so you it will be easier to understand logic in the mediator. There are two types of OP-TEE calls: fast calls and standard calls. Fast call is simple: call SMC and get result. It does not allocate thread context in OP-TEE and is non-preemptive. So yes, it should be fast. It is used for simple things like "get OP-TEE version" or "exchange capabilities". It is easy to handle them in mediator: just forward the call, check result, return it back to a guest. Standard calls are stateful. OP-TEE allocates thread for each call. This call can be preempted either by IRQ or by RPC. For consistency IRQ return is also considered as special type of RPC. So, in general one standard call can consist of series of SMCs: --> SMC with request <-- RPC return (like IRQ) --> SMC "resume call" <-- RPC return (like "read disk") --> SMC "resume call" <-- RPC return (like "send network packet") --> SMC "resume call" ... <-- Final return There are many types of RPCs: "handle IRQ", additional shared buffer allocation/de-allocation, RPMB access, disks access, network access, synchronization primitives (when OP-TEE thread is gets blocked on a mutex), etc. Two more things that makes all this worse: Normal World can register shared buffer with OP-TEE. Such buffer can live indefinitely long. Also, Normal World decides when to resume call. For example, calling process can be preempted and then resumed seconds later. Misbehaving guest can decide to not resume call at all. As I said, I tried to take all this things into account. There are basically 3 types of objects that can lead to memory allocation on Xen side: 1. Standard call context. Besides memory space for struct optee_std_call itself it allocates page for a shadow buffer, where argument addresses are translated by Xen. Number of this objects is limited by number of OP-TEE threads: count = atomic_add_unless(&ctx->call_count, 1, max_optee_threads); if ( count == max_optee_threads ) return ERR_PTR(-ENOSPC); 2. Shared buffer. This is a buffer shared by guest with OP-TEE. It can be either temporary buffer which is shared for one standard call duration, or registered shared buffer, which is remains active until it is de-registered. This is where translate_noncontig() comes into play. Number of this buffers is limited in allocate_optee_shm_buf(): count = atomic_add_unless(&ctx->optee_shm_buf_count, 1, MAX_SHM_BUFFER_COUNT); if ( count == MAX_SHM_BUFFER_COUNT ) return ERR_PTR(-ENOMEM); 3. Shared RPC buffer. This is very special kind of buffer. Basically, OP-TEE needs some shared memory to provide RPC call parameters. So it requests buffer from Normal World. There is no hard limit on this from mediator side, because, as I told earlier, OP-TEE itself limits number of this buffers. There is no cases when more that one buffer will be allocated per OP-TEE thread. This type of buffer is used only to process RPC requests themselves. OP-TEE can request more buffers via RPC, but they will fall to p.2: NW uses separate request to register buffer and then returns its handle in the preempted call. Apart from those two limits, there is a limit on total number of pages which is shared between guest and OP-TEE: MAX_TOTAL_SMH_BUF_PG. This limit is for a case when guest tries to allocate few really BIG buffers. > It looks like they could be called from one of the OPTEE operations that > a domU could request? Is there a limit for them? Yes, there are limits, as I described above. Also, bear in mind that resources available to OP-TEE are also quite limited. So, in case of some breach in mediator, OP-TEE will give up first. This of course is not an excuse to have bugs in the mediator... -- Volodymyr Babchuk at EPAM

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.