[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC PATCH 2/3] xen: Refactor migration

On Wed, 2018-04-11 at 13:25 +0100, George Dunlap wrote:
> The current sequence to initiate vcpu migration is inefficent and
> error-prone:
> - The initiator sets VPF_migraging with the lock held, then drops the
>   lock and calls vcpu_sleep_nosync(), which immediately grabs the
> lock
>   again
> - A number of places unnecessarily check for v->pause_flags in
> between
>   those two
> - Every call to vcpu_migrate() must be prefaced with
>   vcpu_sleep_nosync() or introduce a race condition; this code
>   duplication is error-prone
It's not only error prone, it's the cause of actual bugs! :-P

That's why I would also change the subject of the series into something
like "fix races happening during vcpu migration"

> - In the event that v->is_running is true at the beginning of
>   vcpu_migrate(), it's almost certain that vcpu_migrate() will end up
>   being called in context_switch() as well; we might as well simply
>   let it run there and save the duplicated effort (which will be
>   non-negligible).
> Instead, introduce vcpu_migrate_start() to initiate the process.
> vcpu_migrate_start() is called with the scheduling lock held.  It not
> only sets VPF_migrating, but also calls vcpu_sleep_nosync_locked()
> (which will automatically do nothing if there's nothing to do).
> Rename vcpu_migrate() to vcpu_migrate_finish().  Check for v-
> >is_running and
> pause_flags & VPF_migrating at the top and return if appropriate.
> Then the way to initiate migration is consistently:
> * Grab lock
> * vcpu_migrate_start()
> * Release lock
> * vcpu_migrate_finish()
> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxx>

> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -593,13 +593,33 @@ static void vcpu_move_nosched(struct vcpu *v,
> unsigned int new_cpu)
>      sched_move_irqs(v);
>  }
> -static void vcpu_migrate(struct vcpu *v)
> +/*
> + * Initiating migration
> + * 
> + * In order to migrate, we need the vcpu in question to have stopped
> + * running and be taken off the runqueues; we also need to hold the
> lock 
> + */
Perhaps "and SCHED_OP(sleep) to have been called on it", instead of
"and be taken off the runqueues", if we don't want to even mention
runqueues in schedule.c.

And there are a couple of instances of whitespace damaging.


Reviewed-by: Dario Faggioli <dfaggioli@xxxxxxxx>

And I guess we can add a 'Tested-by: Olaf Hering', as he actually did
that, what do you say Olaf?

<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

Attachment: signature.asc
Description: This is a digitally signed message part

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.