
Re: [PATCH v3 4/8] x86/mem-sharing: copy GADDR based shared guest areas

On Thu, May 4, 2023 at 3:44 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 03.05.2023 19:14, Tamas K Lengyel wrote:
> >> @@ -1974,7 +2046,10 @@ int mem_sharing_fork_reset(struct domain
> >>
> >>   state:
> >>      if ( reset_state )
> >> +    {
> >>          rc = copy_settings(d, pd);
> >> +        /* TBD: What to do here with -ERESTART? */
> >
> > Ideally we could avoid hitting code-paths that are restartable during fork
> > reset since it gets called from vm_event replies that have no concept of
> > handling errors. If we start having errors like this we would just have to
> > drop the vm_event reply optimization and issue a standalone fork reset
> > hypercall every time which isn't a big deal, it's just slower.
>
> I'm afraid I don't follow: We are in the process of fork-reset here. How
> would issuing "a standalone fork reset hypercall every time" make this
> any different? The possible need for a continuation here comes from a
> failed spin_trylock() in map_guest_area(). That won't change the next
> time round.

Why not? Who is holding the lock, and why wouldn't it ever relinquish it? If that's really true, then there is a larger issue than just not being able to report the error back to the user on the vm_event_resume path, and we need to devise a way of copying this from the parent that bypasses the lock. The parent is paused and its state should not be changing while forks are active, so turning the lock into an rwlock of some sort, letting us acquire the read lock, would perhaps be a way out of this.
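
To illustrate the direction I mean, here is a rough standalone sketch (plain pthreads rather than Xen's locking primitives; parent_state, fork_state and the two helpers are made-up names -- this only shows the reader/writer split, it is not actual Xen code):

/* Hypothetical illustration only -- not Xen code.  The parent's
 * (unchanging-while-forks-exist) state is guarded by an rwlock so that
 * fork resets can take the read side concurrently, instead of a trylock
 * that may force a continuation. */
#include <pthread.h>
#include <string.h>
#include <stdio.h>

struct parent_state {              /* stand-in for the parent's shared-area state */
    pthread_rwlock_t lock;
    unsigned char settings[64];
};

struct fork_state {                /* stand-in for a fork's copy of that state */
    unsigned char settings[64];
};

/* Anything that modifies the parent takes the write lock... */
static void parent_update(struct parent_state *p, const unsigned char *src)
{
    pthread_rwlock_wrlock(&p->lock);
    memcpy(p->settings, src, sizeof(p->settings));
    pthread_rwlock_unlock(&p->lock);
}

/* ...while any number of fork resets copy under the read lock, so one
 * reset never blocks another the way a single spinlock can. */
static void fork_reset_copy(struct fork_state *f, struct parent_state *p)
{
    pthread_rwlock_rdlock(&p->lock);
    memcpy(f->settings, p->settings, sizeof(f->settings));
    pthread_rwlock_unlock(&p->lock);
}

int main(void)
{
    struct parent_state p;
    struct fork_state f;
    unsigned char init[64] = { 1, 2, 3 };

    pthread_rwlock_init(&p.lock, NULL);
    parent_update(&p, init);
    fork_reset_copy(&f, &p);
    printf("copied first byte: %d\n", f.settings[0]);
    pthread_rwlock_destroy(&p.lock);
    return 0;
}

The point is simply that a reset only ever needs the read side, so with the parent quiescent it should not have to bail out just because someone else is reading at the same time.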

>
> But perhaps I should say that till now I didn't even pay much attention
> to the 2nd use of the function by vm_event_resume(); I was mainly
> focused on the one from XENMEM_sharing_op_fork_reset, where no
> continuation handling exists. Yet perhaps your comment is mainly
> related to that use?
>
> I actually notice that the comment ahead of the function already has a
> continuation-related TODO, just that the thought there is only of the larger
> memory footprint.

With XENMEM_sharing_op_fork_reset the caller actually receives the error code and can decide what to do next. With vm_event_resume there is currently no path to notify the agent of an error. We could generate another vm_event to send such an error, but the expectation with fork_reset is that it will always work because the parent is paused, so the lack of a path for an error to get back to the agent isn't a big deal.

Now, if it becomes the case that due to this locking we can get an error even while the parent is paused, that will render the vm_event_resume path unreliable, so we would just switch to using XENMEM_sharing_op_fork_reset so that the agent can at least retry in case of an issue. Of course, that only helps if reissuing the hypercall has any reasonable chance of succeeding.
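
If we do go that way, the agent side could be as simple as the sketch below -- purely hypothetical: fork_reset_once() is a made-up stand-in for whatever issues XENMEM_sharing_op_fork_reset (e.g. via the toolstack wrapper), assumed to return 0 on success and a negative errno value when a retry makes sense:

/* Hypothetical agent-side retry sketch -- not existing toolstack code. */
#include <errno.h>
#include <stdio.h>

#ifndef ERESTART
#define ERESTART 85             /* not exposed by every libc's errno.h */
#endif

/* Placeholder implementation -- stands in for the real hypercall wrapper. */
static int fork_reset_once(unsigned int fork_domid)
{
    (void)fork_domid;
    return 0;
}

static int fork_reset_with_retry(unsigned int fork_domid, unsigned int max_tries)
{
    int rc = -EAGAIN;

    for ( unsigned int i = 0; i < max_tries; i++ )
    {
        rc = fork_reset_once(fork_domid);
        if ( rc != -ERESTART && rc != -EAGAIN )
            break;              /* success, or a non-transient error */
    }

    if ( rc )
        fprintf(stderr, "fork reset of domain %u failed: %d\n", fork_domid, rc);

    return rc;
}

int main(void)
{
    return fork_reset_with_retry(1 /* example fork domid */, 5) ? 1 : 0;
}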

>
> > My
> > preference would actually be that after the initial forking is performed a
> > local copy of the parent's state is maintained for the fork to reset to so
> > there would be no case of hitting an error like this. It would also allow
> > us in the future to unpause the parent for example..
>
> Oh, I guess I didn't realize the parent was kept paused. Such state
> copying / caching may then indeed be a possibility, but that's nothing
> I can see myself deal with, even less so in the context of this series.
> I need a solution to the problem at hand within the scope of what is
> there right now (or based on what could be provided e.g. by you within
> the foreseeable future). Bubbling up the need for continuation from the
> XENMEM_sharing_op_fork_reset path is the most I could see me handle
> myself ... For vm_event_resume() bubbling state up the domctl path
> _may_ also be doable, but mem_sharing_notification() and friends don't
> even check the function's return value.

Sure, I wasn't expecting that work to be done as part of this series; it's just something I would like to get to in the future.
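
For reference, the rough shape I have in mind for that future work -- again only an illustrative sketch, none of these structures or helpers exist in Xen today:

/* Hypothetical sketch of caching the parent's state at fork time, so a
 * reset copies from a private snapshot instead of re-reading the
 * (locked) parent. */
#include <stdlib.h>
#include <string.h>

struct vcpu_regs_snapshot {           /* placeholder for per-vCPU state */
    unsigned long gprs[16];
};

struct fork_ctx {
    struct vcpu_regs_snapshot *snap;  /* taken once, when the fork is created */
    struct vcpu_regs_snapshot *live;  /* the fork's current, mutable state */
    unsigned int nr_vcpus;
};

/* Snapshot the parent exactly once, while both domains are quiescent. */
static int fork_snapshot_parent(struct fork_ctx *f,
                                const struct vcpu_regs_snapshot *parent,
                                unsigned int nr_vcpus)
{
    f->snap = malloc(nr_vcpus * sizeof(*f->snap));
    f->live = malloc(nr_vcpus * sizeof(*f->live));
    if ( !f->snap || !f->live )
        return -1;

    memcpy(f->snap, parent, nr_vcpus * sizeof(*f->snap));
    memcpy(f->live, parent, nr_vcpus * sizeof(*f->live));
    f->nr_vcpus = nr_vcpus;
    return 0;
}

/* Reset never touches the parent, so it cannot fail on a parent-side
 * lock, and the parent could even be unpaused later on. */
static void fork_reset_from_snapshot(struct fork_ctx *f)
{
    memcpy(f->live, f->snap, f->nr_vcpus * sizeof(*f->live));
}

int main(void)
{
    struct vcpu_regs_snapshot parent[2] = { { { 1, 2 } }, { { 3, 4 } } };
    struct fork_ctx ctx = { 0 };

    if ( fork_snapshot_parent(&ctx, parent, 2) )
        return 1;

    ctx.live[0].gprs[0] = 42;        /* fork diverges... */
    fork_reset_from_snapshot(&ctx);  /* ...and is reset from the cache */

    free(ctx.snap);
    free(ctx.live);
    return 0;
}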

Tamas
