
Re: [RFC XEN PATCH] xen/arm: ffa: reclaim shared memory on guest destroy



Hi,

On Tue, Dec 5, 2023 at 9:14 AM Bertrand Marquis
<Bertrand.Marquis@xxxxxxx> wrote:
>
> Hi Julien,
>
> Thanks a lot for your review and comment, this is very helpful.
>
> > On 4 Dec 2023, at 20:24, Julien Grall <julien@xxxxxxx> wrote:
> >
> > Hi Jens,
> >
> > On 04/12/2023 07:55, Jens Wiklander wrote:
> >> When an FF-A enabled guest is destroyed it may leave behind memory
> >> shared with SPs. This memory must be reclaimed before it's reused or an
> >> SP may make changes to memory used by a new unrelated guest. So when the
> >> domain is torn down, add FF-A requests to reclaim all remaining shared
> >> memory.
> >> SPs in the secure world are notified using VM_DESTROYED that a guest has
> >> been destroyed. An SP is supposed to relinquish all shared memory to allow
> >> reclaiming the memory. The relinquish operation may need to be delayed if
> >> the shared memory is for instance part of a DMA operation.
> >> If the FF-A memory reclaim request fails, return -ERESTART to retry.
> >> This will effectively block the destruction of the guest until
> >> all memory has been reclaimed.
> >> Signed-off-by: Jens Wiklander <jens.wiklander@xxxxxxxxxx>
> >> ---
> >> Hi,
> >> This patch is a bit crude, but gets the job done. In a well-designed
> >> system this might even be good enough, since the SP or the secure world
> >> will let the memory be reclaimed and we can move on. But if, for some
> >> reason, reclaiming the memory is refused, it must not be possible to
> >> reuse the memory.
> >
> > IIUC, we are trying to harden against a buggy SP. Is that correct?
>
> This is not hardening, as this is a possible scenario with a correctly
> implemented SP.
> It is valid for the SP to not be able to relinquish the memory immediately,
> so we have to take this possibility into account and retry.
>
> What is not expected is for the SP to never release the memory, hence the
> possible "todo" at the end of the code where I think we might have to
> implement a counter to bound the possible number of loops, but as always
> the question is how many...
>
> In this case the only solution would be to park the memory as suggested
> below, but we are not completely sure where, hence the RFC.
>
> >
> >> These shared memory ranges are typically quite small compared to the
> >> total memory usage of a guest, so it would be an improvement if only the
> >> refused shared memory ranges were set aside from future reuse while the
> >> guest was destroyed and the other resources were made available for reuse.
> >> This could be done by, for instance, assigning the refused shared memory
> >> ranges to a dummy VM like DOMID_IO.
> >
> > I like the idea of using a dummy VM, but I don't think DOMID_IO is right.
> > Once teardown has completed, the domain will stay around until the last
> > reference on all pages is dropped. At this point, the amount of memory
> > left over is minimal (this is mostly bookkeeping in Xen).
> >
> > From the userland PoV, the domain will still show up in the list, but tools
> > like "xl list" will show "(null)". They are called zombie domains.
> >
> > So I would consider keeping the same domain around. The advantage is that
> > you can call "xl destroy" again to retry the operation.
>
> In this scenario the "restart" implementation here is right, but how could
> we park the VM as a "zombie" instead of busy-looping in the "kill" loop of
> userland?
>
> Also, we need to release all the memory of the VM except the memory shared
> with the SP.
>
> I will let Jens answer the more implementation-related questions below and
> try to help on the more "system"-level ones.
>
> >
> >> Thanks,
> >> Jens
> >> ---
> >>  xen/arch/arm/tee/ffa.c | 36 ++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 36 insertions(+)
> >> diff --git a/xen/arch/arm/tee/ffa.c b/xen/arch/arm/tee/ffa.c
> >> index 183528d13388..9c596462a8a2 100644
> >> --- a/xen/arch/arm/tee/ffa.c
> >> +++ b/xen/arch/arm/tee/ffa.c
> >> @@ -1539,6 +1539,7 @@ static bool is_in_subscr_list(const uint16_t *subscr, uint16_t start,
> >>  static int ffa_domain_teardown(struct domain *d)
> >>  {
> >>      struct ffa_ctx *ctx = d->arch.tee;
> >> +    struct ffa_shm_mem *shm, *tmp;
> >>      unsigned int n;
> >>      int32_t res;
> >> @@ -1564,10 +1565,45 @@ static int ffa_domain_teardown(struct domain *d)
> >>              printk(XENLOG_ERR "ffa: Failed to report destruction of vm_id %u to  %u: res %d\n",
> >>                     get_vm_id(d), subscr_vm_destroyed[n], res);
> >>      }
> >> +    /*
> >> +     * If this function is called again due to -ERESTART below, make sure
> >> +     * not to send the FFA_MSG_SEND_VM_DESTROYED's.
> >> +     */
> >> +    subscr_vm_destroyed_count = 0;
> >
> > AFAICT, this variable is global. So wouldn't you effectively break other
> > domains if, let's say, the unmapping error is temporary?
> >
> >>        if ( ctx->rx )
> >>          rxtx_unmap(ctx);
> >>  +
> >> +    list_for_each_entry_safe(shm, tmp, &ctx->shm_list, list)
> >> +    {
> >> +        register_t handle_hi;
> >> +        register_t handle_lo;
> >> +
> >> +        uint64_to_regpair(&handle_hi, &handle_lo, shm->handle);
> >> +        res = ffa_mem_reclaim(handle_lo, handle_hi, 0);
> >
> > Is this call expensive? If so, we may need to handle continuation here.
>
> This call should not be expensive in the normal case: the memory is
> reclaimable, so no processing is required in the SP and everything is done
> in the SPMC, which should basically just return yes or no depending on the
> state of the handle.

I agree, this should only be a thing between the hypervisor and the
SPMC in the secure world.
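
That said, if the list of handles ever turned out to be long, we could bound
the work done per teardown pass and lean on -ERESTART as the continuation.
Very rough sketch, not against the actual tree (FFA_MAX_RECLAIM_PER_CALL is
made up, the rest reuses the helpers already used in the patch):

#define FFA_MAX_RECLAIM_PER_CALL 64

static int reclaim_shm_batch(struct ffa_ctx *ctx)
{
    struct ffa_shm_mem *shm, *tmp;
    unsigned int done = 0;

    list_for_each_entry_safe(shm, tmp, &ctx->shm_list, list)
    {
        register_t handle_hi, handle_lo;

        /* Bound the work done in one pass, finish on a later -ERESTART pass */
        if ( done++ == FFA_MAX_RECLAIM_PER_CALL )
            return -ERESTART;

        uint64_to_regpair(&handle_hi, &handle_lo, shm->handle);
        if ( ffa_mem_reclaim(handle_lo, handle_hi, 0) )
            continue;   /* Not reclaimable yet, keep it and retry later */

        ctx->shm_count--;
        list_del(&shm->list);
        /* Freeing of shm itself is left out of the sketch */
    }

    return list_empty(&ctx->shm_list) ? 0 : -ERESTART;
}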

>
> So I think this is the best trade-off.
>
> @Jens: One thing to consider is that a Destroy might get a retry or busy
> answer, in which case we will have to issue it again, and this is not
> considered in the current implementation.

You're right, we'll need to keep track of which SPs we've been able to
send a VM_DESTROY message to.
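
Something per domain along these lines could do it (only a sketch: the bitmap
field and the send helper below are made up for illustration, the other names
come from the existing code):

struct ffa_ctx {
    /* ... existing fields ... */
    unsigned long *vm_destroy_bitmap;   /* One bit per subscriber still to notify */
};

static void notify_vm_destroyed(struct domain *d)
{
    struct ffa_ctx *ctx = d->arch.tee;
    unsigned int n;
    int32_t res;

    for ( n = 0; n < subscr_vm_destroyed_count; n++ )
    {
        if ( !test_bit(n, ctx->vm_destroy_bitmap) )
            continue;

        /* Hypothetical wrapper around the direct request we already send */
        res = ffa_send_vm_destroyed(get_vm_id(d), subscr_vm_destroyed[n]);
        if ( res )
        {
            printk(XENLOG_G_ERR "ffa: %pd: VM_DESTROYED to %u failed: %d, will retry\n",
                   d, subscr_vm_destroyed[n], res);
            continue;   /* Keep the bit set so this SP is retried later */
        }

        clear_bit(n, ctx->vm_destroy_bitmap);
    }
}

This also keeps the global subscr_vm_destroyed_count untouched, which should
address Julien's point above about breaking other domains.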

>
> After discussing the subject internally, we could in fact consider that if
> an SP cannot release some memory shared with the destroyed VM, it should
> say so by returning "retry" to the message.
> Here that could simplify things with a strategy where:
> - we retry the VM_DESTROY message if required

We should keep a record of which SPs remain to be signaled with
VM_DESTROY. An SP may have other reasons to return an error so this
call can be retried later.
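
With the bitmap from the sketch above, the retry falls out naturally: each
-ERESTART pass of teardown just calls the helper again and only the SPs that
answered busy/retry are contacted. Roughly (the bitmap_empty() check is an
assumption, like the rest of the sketch):

    /* In ffa_domain_teardown() */
    notify_vm_destroyed(d);

    if ( !bitmap_empty(ctx->vm_destroy_bitmap, subscr_vm_destroyed_count) )
        return -ERESTART;   /* Some SPs still pending, come back later */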

> - if some memory is not reclaimable, we check whether we could park it and
> make the VM a zombie.
> What do you think?

The zombie option sounds like a good fallback when automatic reclaim
has failed (reasonable timeouts have expired, etc.).
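
For the bound itself, a per-domain retry counter would be one way to express
that timeout before we stop blocking the kill loop; the field and limit below
are made up, and what "park" means exactly is the open question of this
thread:

    /* At the end of ffa_domain_teardown(), replacing the TODO */
    if ( !list_empty(&ctx->shm_list) )
    {
        if ( ctx->teardown_retries++ < FFA_MAX_TEARDOWN_RETRIES )
            return -ERESTART;

        printk(XENLOG_G_ERR "ffa: %pd: giving up on %u unreclaimed handles\n",
               d, ctx->shm_count);
        /*
         * Here the leftover ranges would have to be parked (kept
         * referenced so the memory is never reused) and the domain left
         * as a zombie instead of looping forever.
         */
    }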

Thanks,
Jens

>
>
> >
> >> +        if ( res )
> >> +        {
> >> +            printk(XENLOG_INFO, "ffa: Failed to reclaim handle %#lx : %d\n",
> >> +                   shm->handle, res);
> >
> > I think you want to use XENLOG_G_INFO to use the guest ratelimit. Also, I
> > would suggest printing the domain ID in the logs (see '%pd').
> >
> >
> >> +        }
> >> +        else
> >> +        {
> >> +            printk(XENLOG_DEBUG, "ffa: Reclaimed handle %#lx\n", shm->handle);
> >
> > Same here. You want to use XENLOG_G_DEBUG and print the domain ID.
> >
> >> +            ctx->shm_count--;
> >> +            list_del(&shm->list);
> >> +        }
> >> +    }
> >
> > NIT: New line here please for clarity.
> >
> >> +    if ( !list_empty(&ctx->shm_list) )
> >> +    {
> >> +        printk(XENLOG_INFO, "ffa: Remaining unclaimed handles, retrying\n");
> >
> > Same as the other printks.
> >
> >> +        /*
> >> +         * TODO: add a timeout where we either panic or let the guest be
> >> +         * fully destroyed.
> >> +         */
> > A timeout with proper handling would be a solution. I am not sure about
> > panicking. Do you think the TEE would be in a bad state if we can't
> > release the memory?
> >
> >> +        return -ERESTART;
> >> +    }
> >> +
> >>      XFREE(d->arch.tee);
> >>        return 0;
> >
> > Cheers,
> >
>
> Cheers
> Bertrand
>
> > --
> > Julien Grall
>
>



 

