Re: [RFC XEN PATCH] xen/arm: ffa: reclaim shared memory on guest destroy
Hi Andrew,

On Tue, Dec 5, 2023 at 11:53 AM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
> On 05/12/2023 8:14 am, Bertrand Marquis wrote:
> > Hi Julien,
> >
> > Thanks a lot for your review and comment, this is very helpful.
> >
> >> On 4 Dec 2023, at 20:24, Julien Grall <julien@xxxxxxx> wrote:
> >>
> >> Hi Jens,
> >>
> >> On 04/12/2023 07:55, Jens Wiklander wrote:
> >>>     if ( ctx->rx )
> >>>         rxtx_unmap(ctx);
> >>> +
> >>> +    list_for_each_entry_safe(shm, tmp, &ctx->shm_list, list)
> >>> +    {
> >>> +        register_t handle_hi;
> >>> +        register_t handle_lo;
> >>> +
> >>> +        uint64_to_regpair(&handle_hi, &handle_lo, shm->handle);
> >>> +        res = ffa_mem_reclaim(handle_lo, handle_hi, 0);
> >>
> >> Is this call expensive? If so, we may need to handle continuation here.
> >
> > This call should not be expensive in the normal case: the memory is
> > reclaimable, so there is no processing required in the SP and everything
> > is done in the SPMC, which should basically just return a yes or no
> > depending on the state of the handle.
> >
> > So I think this is the best trade-off.
> >
> > @Jens: One thing to consider is that a Destroy might get a retry or busy
> > answer, in which case we will have to issue it again; this is not handled
> > in the current implementation.
> >
> > After discussing the subject internally, we could in fact consider that
> > if an SP cannot release some memory shared with the destroyed VM, it
> > should say so by returning "retry" to the message.
> > That could simplify things here with a strategy where:
> > - we retry on the VM_DESTROY message if required
> > - if some memory is not reclaimable, we check whether we can park it and
> >   make the VM a zombie.
> > What do you think?
>
> This is the cleanup issue discussed at XenSummit, isn't it?
>
> You cannot feasibly implement this cleanup by having
> ffa_domain_teardown() return -ERESTART.
>
> Yes, it will probably function - but now you're bouncing in/out of Xen as
> fast as the CPU will allow, rechecking a condition which will take an
> unbounded quantity of time. Meanwhile, you've tied up a userspace thread
> (the invocation of `xl destroy`) to do so, and one of dom0's vCPUs for
> however long the scheduler is willing to schedule the destroy invocation,
> which will be 100% of the time as it's always busy in the hypervisor.
>
> The teardown/kill infrastructure is intended and expected to always make
> forward progress.

OK

>
> The closest thing to this patch which will work sanely is this:
>
> Hold a single domain reference for any non-zero amount of magic memory
> held. See domain_adjust_tot_pages() and how it interacts with
> {get,put}_domain(), and copy it. Importantly, this prevents the domain
> being freed until the final piece of magic memory has been released.
>
> Have some way (it can be early on the teardown/kill path, or a separate
> hypercall - assuming the VM can't ever be scheduled again) to kick Xen
> into being responsible for trying to reclaim the memory. (Start a timer,
> or reclaim in the idle loop, whatever.)
>
> This way, you can `xl destroy` a VM in an arbitrary state, *and* the
> invocation will terminate when Xen has nothing deterministic left to do,
> *and* in the case that the secure world or Xen has an issue, the VM will
> stay around as a zombie holding minimal resources.

Thanks for the pointers, very helpful, and now I at least know where to
start looking.

Cheers,
Jens

>
> ~Andrew
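[Editorial note] To illustrate the retry/busy concern Bertrand raises, here is a minimal sketch (not part of the posted series) of how the teardown loop from the patch could bound its retries instead of spinning. struct ffa_ctx, struct ffa_shm_mem, ffa_mem_reclaim() and uint64_to_regpair() are taken from the patch under discussion; FFA_RET_RETRY and FFA_RET_BUSY are the FF-A error codes as one would expect them to be named in Xen's FF-A mediator; the retry cap, the -EBUSY return and the freeing of the bookkeeping entry are invented for the example.

/*
 * Sketch only: reclaim every handle still shared by the dying guest,
 * but give up after a bounded number of RETRY/BUSY answers rather than
 * looping forever.  Handles that could not be reclaimed stay on
 * ctx->shm_list so a later pass can try again.
 */
static int ffa_reclaim_shms(struct ffa_ctx *ctx)
{
    struct ffa_shm_mem *shm, *tmp;
    int ret = 0;

    list_for_each_entry_safe(shm, tmp, &ctx->shm_list, list)
    {
        register_t handle_hi, handle_lo;
        unsigned int tries;
        int32_t res = FFA_RET_BUSY;

        uint64_to_regpair(&handle_hi, &handle_lo, shm->handle);

        for ( tries = 0; tries < 3; tries++ )      /* arbitrary cap */
        {
            res = ffa_mem_reclaim(handle_lo, handle_hi, 0);
            if ( res != FFA_RET_RETRY && res != FFA_RET_BUSY )
                break;
        }

        if ( res )
        {
            /* Not reclaimable yet: keep the handle for a later attempt. */
            ret = -EBUSY;
            continue;
        }

        /* Reclaimed: drop the bookkeeping entry (page cleanup not shown). */
        list_del(&shm->list);
        xfree(shm);
    }

    return ret;
}

A caller on the teardown path could run this once and, if it gets -EBUSY back, hand the remaining handles over to whatever deferred mechanism (timer, idle loop) ends up being chosen, rather than returning -ERESTART.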
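[Editorial note] And a rough sketch of the reference-counting idea Andrew describes: a single domain reference is held while any FF-A shared memory is outstanding, so the domain lingers as a zombie until the last handle is reclaimed. get_knownalive_domain() and put_domain() are the existing Xen helpers the page allocator uses in an analogous way; the shm_count field, the helpers themselves and their locking assumption are hypothetical.

/*
 * Sketch only: pin the domain while FF-A shared memory is outstanding,
 * mirroring how the page allocator keeps a reference while a domain
 * still owns pages.  The last successful reclaim drops the reference
 * and lets the (by now zombie) domain be freed.
 *
 * Callers are assumed to hold whatever lock protects ctx.
 */
static void ffa_shm_get(struct domain *d, struct ffa_ctx *ctx)
{
    /* The first outstanding share takes the single domain reference. */
    if ( ctx->shm_count++ == 0 )
        get_knownalive_domain(d);
}

static void ffa_shm_put(struct domain *d, struct ffa_ctx *ctx)
{
    ASSERT(ctx->shm_count);

    /* The last reclaim releases it; nothing FF-A related pins d any more. */
    if ( --ctx->shm_count == 0 )
        put_domain(d);
}

The remaining piece, re-arming a timer or using the idle loop to keep calling the reclaim routine until shm_count drops to zero, is deliberately left out here, since that is exactly the design question still being discussed in the thread.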