Re: [PING] Re: [PATCH] xen/arm: optee: Allocate anonymous domheap pages
On Thu, 7 Oct 2021, Volodymyr Babchuk wrote:
> > On Thu, 7 Oct 2021, Volodymyr Babchuk wrote:
> >> Hi Stefano,
> >>
> >> Stefano Stabellini <sstabellini@xxxxxxxxxx> writes:
> >>
> >> > On Wed, 6 Oct 2021, Oleksandr wrote:
> >> >> Hello all
> >> >>
> >> >> Gentle reminder.
> >> >
> >> > Many thanks for the ping, this patch fell off my radar.
> >> >
> >> >> On 23.09.21 23:57, Volodymyr Babchuk wrote:
> >> >> > Hi Stefano,
> >> >> >
> >> >> > Stefano Stabellini <sstabellini@xxxxxxxxxx> writes:
> >> >> >
> >> >> > > On Mon, 6 Sep 2021, Oleksandr Tyshchenko wrote:
> >> >> > > > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
> >> >> > > >
> >> >> > > > Allocate anonymous domheap pages as there is no strict need to
> >> >> > > > account them to a particular domain.
> >> >> > > >
> >> >> > > > Since XSA-383 "xen/arm: Restrict the amount of memory that dom0less
> >> >> > > > domU and dom0 can allocate" the dom0 cannot allocate memory outside
> >> >> > > > of the pre-allocated region. This means that if we try to allocate
> >> >> > > > a non-anonymous page to be accounted to dom0, we will get an
> >> >> > > > over-allocation issue when assigning that page to the domain.
> >> >> > > > The anonymous page, in turn, is not assigned to any domain.
> >> >> > > >
> >> >> > > > CC: Julien Grall <jgrall@xxxxxxxxxx>
> >> >> > > > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
> >> >> > > > Acked-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx>
> >> >> > >
> >> >> > > Only one question, which is more architectural: given that these
> >> >> > > pages are "unlimited", could the guest exploit the interface somehow
> >> >> > > to force Xen to allocate a very high number of anonymous pages?
> >> >> > >
> >> >> > > E.g. could a domain call OPTEE_SMC_RPC_FUNC_ALLOC in a loop to force
> >> >> > > Xen to exhaust all memory pages?
> >> >> >
> >> >> > Generally, the OP-TEE mediator tracks all allocated resources and
> >> >> > imposes limits on them.
> >> >> >
> >> >> > The OPTEE_SMC_RPC_FUNC_ALLOC case is a bit different, because it is
> >> >> > issued not by the domain, but by OP-TEE itself. As OP-TEE is a more
> >> >> > trusted piece of the system, we allow it to request as many buffers
> >> >> > as it wants. Also, we know that OP-TEE asks for only one such buffer
> >> >> > per standard call, and the number of simultaneous calls is limited by
> >> >> > the number of OP-TEE threads, which is quite low: typically only two.
> >> >
> >> > So let me repeat it differently to see if I understood correctly:
> >> >
> >> > - OPTEE_SMC_RPC_FUNC_ALLOC is only called by OP-TEE, not by the domain
> >> > - OP-TEE is trusted and only calls it twice anyway
> >>
> >> Correct.
> >>
> >> > I am OK with this argument, but do we have a check to make sure a domU
> >> > cannot issue OPTEE_SMC_RPC_FUNC_ALLOC?
> >>
> >> A domU can't issue any RPC, because all RPCs are issued from the OP-TEE
> >> side. This is the nature of RPC - OP-TEE requests some service from the
> >> Normal World. But of course, the Normal World can perform certain
> >> actions that will make OP-TEE issue an RPC. I discuss this in depth
> >> below.
> >>
> >> > Looking at the patch, there are two other places, in addition to
> >> > OPTEE_SMC_RPC_FUNC_ALLOC, where the anonymous memory pages can be
> >> > allocated:
> >> >
> >> > 1) copy_std_request
> >> > 2) translate_noncontig
> >> >
> >> > We need to prove that neither 1) nor 2) can result in a domU
> >> > exhausting Xen memory.
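For context, the change under discussion boils down to the allocation
pattern below. This is a simplified sketch, not the exact optee.c code;
the real call sites are copy_std_request() and translate_noncontig():

    #include <xen/mm.h>
    #include <xen/sched.h>

    static struct page_info *alloc_shadow_page(void)
    {
        /*
         * Before the patch the page was accounted to the calling
         * domain, roughly alloc_domheap_page(current->domain, 0).
         * Since XSA-383 that fails with an over-allocation error once
         * the domain already sits at its max_pages limit.
         *
         * After the patch the page is anonymous: owned by Xen and not
         * assigned to any domain, so no per-domain accounting applies.
         */
        return alloc_domheap_page(NULL, 0);
    }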
> >> >
> >> > In the case of 1), it looks like the memory is freed before returning
> >> > to the DomU, right? If so, it should be no problem?
> >>
> >> Yes, the mediator makes a shadow copy of every request buffer to hide
> >> the translated addresses from the guest. The number of requests is
> >> limited by the number of OP-TEE threads.
> >>
> >> > In the case of 2), it looks like the memory could outlive the call
> >> > where it is allocated. Is there any kind of protection against issuing
> >> > something like OPTEE_MSG_ATTR_TYPE_TMEM_INOUT in a loop? Is it OP-TEE
> >> > itself that would refuse the attempt? Thus, the idea is that
> >> > do_call_with_arg will return an error and we'll just free the memory
> >> > there?
> >>
> >> Well, translate_noncontig() calls allocate_optee_shm_buf(), which counts
> >> all allocated buffers. So you can't call it more than
> >> MAX_SHM_BUFFER_COUNT times without de-allocating previous memory. But,
> >> thanks to your question, I have found a bug there: memory is not freed
> >> if allocate_optee_shm_buf() fails. I'll prepare a patch later today.
> >>
> >> > I cannot see a check for errors returned by do_call_with_arg and memory
> >> > freeing done because of that. Sorry, I am not super familiar with the
> >> > code, I am just trying to make sure we are not offering DomUs an easy
> >> > way to crash the system.
> >>
> >> I tried to eliminate all possibilities for a guest to crash the
> >> system. Of course, this does not mean that there are none.
> >>
> >> And yes, the code is a bit hard to understand, because calls to OP-TEE
> >> are stateful and you need to account for that state. From the NW and SW
> >> sides this looks quite fine, because the state is handled naturally.
> >> But the mediator sits in the middle, so its implementation is a bit
> >> messy.
> >>
> >> I'll try to explain what is going on, so it will be easier for you to
> >> understand the logic in the mediator.
> >>
> >> There are two types of OP-TEE calls: fast calls and standard calls. A
> >> fast call is simple: issue an SMC and get the result. It does not
> >> allocate a thread context in OP-TEE and is non-preemptive. So yes, it
> >> should be fast. It is used for simple things like "get OP-TEE version"
> >> or "exchange capabilities". It is easy to handle them in the mediator:
> >> just forward the call, check the result, and return it to the guest.
> >>
> >> Standard calls are stateful. OP-TEE allocates a thread for each call.
> >> The call can be preempted either by an IRQ or by an RPC. For
> >> consistency, an IRQ return is also considered a special type of RPC.
> >> So, in general, one standard call can consist of a series of SMCs:
> >>
> >> --> SMC with request
> >> <-- RPC return (like IRQ)
> >> --> SMC "resume call"
> >> <-- RPC return (like "read disk")
> >> --> SMC "resume call"
> >> <-- RPC return (like "send network packet")
> >> --> SMC "resume call"
> >> ...
> >> <-- Final return
> >>
> >> There are many types of RPCs: "handle IRQ", additional shared buffer
> >> allocation/de-allocation, RPMB access, disk access, network access,
> >> synchronization primitives (when an OP-TEE thread gets blocked on a
> >> mutex), etc.
> >>
> >> Two more things make all this worse: the Normal World can register a
> >> shared buffer with OP-TEE, and such a buffer can live indefinitely.
> >> Also, the Normal World decides when to resume a call. For example, the
> >> calling process can be preempted and then resumed seconds later. A
> >> misbehaving guest can decide not to resume the call at all.
> >>
> >> As I said, I tried to take all these things into account.
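To make the SMC/RPC ping-pong above concrete, the Normal World side of a
standard call is conceptually the loop sketched below. The names are
illustrative (cf. OPTEE_SMC_CALL_WITH_ARG, OPTEE_SMC_CALL_RETURN_FROM_RPC
and OPTEE_SMC_RETURN_IS_RPC() in the real optee_smc.h); this is not the
actual driver code:

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative function IDs and helpers, not the real ABI values. */
    enum { SMC_CALL_WITH_ARG, SMC_RETURN_FROM_RPC };

    struct smc_result { uint64_t a0, a1, a2, a3; };

    extern struct smc_result issue_smc(unsigned int func_id, uint64_t arg);
    extern void handle_rpc(struct smc_result *res);
    extern bool return_is_rpc(uint64_t a0);

    static uint64_t do_std_call(uint64_t arg)
    {
        /* Initial entry; OP-TEE allocates a thread for the call. */
        struct smc_result res = issue_smc(SMC_CALL_WITH_ARG, arg);

        /* The call may be preempted any number of times by RPCs. */
        while ( return_is_rpc(res.a0) )
        {
            handle_rpc(&res);  /* IRQ, disk access, network access, ... */
            /* Resume the preempted call; a misbehaving guest may never
             * get here again and leave the OP-TEE thread occupied. */
            res = issue_smc(SMC_RETURN_FROM_RPC, arg);
        }

        return res.a0;         /* final return */
    }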
> >> There are basically 3 types of objects that can lead to memory
> >> allocation on the Xen side:
> >>
> >> 1. Standard call context. Besides the memory for struct optee_std_call
> >> itself, it allocates a page for a shadow buffer, where argument
> >> addresses are translated by Xen. The number of these objects is
> >> limited by the number of OP-TEE threads:
> >>
> >>     count = atomic_add_unless(&ctx->call_count, 1, max_optee_threads);
> >>     if ( count == max_optee_threads )
> >>         return ERR_PTR(-ENOSPC);
> >>
> >> 2. Shared buffer. This is a buffer shared by a guest with OP-TEE. It
> >> can be either a temporary buffer, which is shared for the duration of
> >> one standard call, or a registered shared buffer, which remains active
> >> until it is de-registered. This is where translate_noncontig() comes
> >> into play. The number of these buffers is limited in
> >> allocate_optee_shm_buf():
> >>
> >>     count = atomic_add_unless(&ctx->optee_shm_buf_count, 1,
> >>                               MAX_SHM_BUFFER_COUNT);
> >>     if ( count == MAX_SHM_BUFFER_COUNT )
> >>         return ERR_PTR(-ENOMEM);
> >>
> >> 3. Shared RPC buffer. This is a very special kind of buffer.
> >> Basically, OP-TEE needs some shared memory to provide RPC call
> >> parameters, so it requests a buffer from the Normal World. There is no
> >> hard limit on this from the mediator side because, as I said earlier,
> >> OP-TEE itself limits the number of these buffers. There is no case
> >> where more than one buffer is allocated per OP-TEE thread. This type
> >> of buffer is used only to process the RPC requests themselves. OP-TEE
> >> can request more buffers via RPC, but those fall under item 2: the NW
> >> uses a separate request to register a buffer and then returns its
> >> handle in the preempted call.
> >>
> >> Apart from those limits, there is a limit on the total number of pages
> >> shared between a guest and OP-TEE: MAX_TOTAL_SMH_BUF_PG. This limit is
> >> for the case when a guest tries to allocate a few really BIG buffers.
> >>
> >> > It looks like they could be called from one of the OP-TEE operations
> >> > that a domU could request? Is there a limit for them?
> >>
> >> Yes, there are limits, as I described above.
> >>
> >> Also, bear in mind that the resources available to OP-TEE are also
> >> quite limited. So, in the case of some breach in the mediator, OP-TEE
> >> will give up first. This of course is not an excuse to have bugs in
> >> the mediator...
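Schematically, that global page limit amounts to the accounting sketched
below, with illustrative names and values, and without the atomics and
locking of the real code; the actual check is part of the shared-buffer
handling in xen/arch/arm/tee/optee.c:

    #include <errno.h>

    #define MAX_TOTAL_SMH_BUF_PG  16384   /* illustrative value */

    struct optee_domain_ctx {
        unsigned int shm_buf_pages;  /* pages currently shared with OP-TEE */
    };

    /*
     * Returns 0 on success, -ENOMEM if the guest would exceed the cap.
     * The cap is on the sum of all shared pages, so a guest cannot
     * sidestep MAX_SHM_BUFFER_COUNT by registering a few huge buffers.
     */
    static int account_shm_pages(struct optee_domain_ctx *ctx,
                                 unsigned int page_cnt)
    {
        if ( ctx->shm_buf_pages + page_cnt > MAX_TOTAL_SMH_BUF_PG )
            return -ENOMEM;
        ctx->shm_buf_pages += page_cnt;
        return 0;
    }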
> >
> > OK, thanks for the explanation. The reason for my questions is that if
> > the allocations are using the memory of a DomU, then at worst the DomU
> > can run out of memory. But if the allocations are using anonymous
> > memory, then the whole platform might run out of memory. We have issued
> > XSAs for things like that in the past.
> >
> > This is why I am worried about this patch: if we apply it, we really
> > become reliant on these limits being implemented correctly. A bug can
> > have much more severe consequences.
>
> Agree.
>
> > As you are the maintainer for this code, and this code is not security
> > supported, I'll leave it up to you (also see the other email about
> > moving optee to "supported, not security supported").
>
> Yes, I've seen this email. Just didn't have time to write a followup.
>
> > However, maybe a different solution would be to increase max_pages for
> > a domain when OP-TEE is enabled? Maybe just by a few pages (as many as
> > needed by the OP-TEE mediator)? Because if we did that, we wouldn't
> > risk exposing DoS attack vectors for every bug in the mediator's limit
> > checks.
> >
> > The below adds a 10-page slack.
>
> Well, I didn't know that such an option is available. If this is a
> valid approach and there are no objections from other maintainers, I'd
> rather use it.

I think it is a valid approach, and it is "more secure" than the other
patch. I suggest that you send a patch for it so that people can voice
their objections, if any.

> Only one comment there, about the number of pages. The maximal number
> of domheap pages used per request is 6: one for the request itself, one
> for the RPC buffer, and at most 4 for the request arguments. I checked
> the OP-TEE configuration; it looks like some platforms allow up to 16
> threads. This yields 6 * 16 = 96 pages in total. If this is acceptable,
> I'd set TEE_SLACK to 96.

OK. Please add a good in-code comment explaining how you got to 96.

> > diff --git a/xen/arch/arm/tee/tee.c b/xen/arch/arm/tee/tee.c
> > index 3964a8a5cd..a3105f1a9a 100644
> > --- a/xen/arch/arm/tee/tee.c
> > +++ b/xen/arch/arm/tee/tee.c
> > @@ -38,8 +38,11 @@ bool tee_handle_call(struct cpu_user_regs *regs)
> >      return cur_mediator->ops->handle_call(regs);
> >  }
> >
> > +#define TEE_SLACK (10)
> >  int tee_domain_init(struct domain *d, uint16_t tee_type)
> >  {
> > +    int ret;
> > +
> >      if ( tee_type == XEN_DOMCTL_CONFIG_TEE_NONE )
> >          return 0;
> >
> > @@ -49,7 +52,17 @@ int tee_domain_init(struct domain *d, uint16_t tee_type)
> >      if ( cur_mediator->tee_type != tee_type )
> >          return -EINVAL;
> >
> > -    return cur_mediator->ops->domain_init(d);
> > +    ret = cur_mediator->ops->domain_init(d);
> > +    if ( ret < 0 )
> > +        return ret;
> > +
> > +    /*
> > +     * Increase maxmem for domains with TEE; the extra pages are used
> > +     * by the mediator.
> > +     */
> > +    d->max_pages += TEE_SLACK;
> > +
> > +    return 0;
> >  }
> >
> >  int tee_relinquish_resources(struct domain *d)
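For the follow-up patch, the agreed in-code comment could capture the
arithmetic along these lines (a sketch of what was discussed above, not
a committed patch):

    /*
     * The OP-TEE mediator allocates at most 6 domheap pages per
     * in-flight standard call: 1 for the shadow copy of the request
     * buffer, 1 for the RPC buffer and up to 4 for translated request
     * arguments.  Some OP-TEE configurations allow up to 16 threads,
     * i.e. 16 simultaneous calls, hence 6 * 16 = 96 pages of slack.
     */
    #define TEE_SLACK (6 * 16)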