[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [RFC PATCH 2/3] xen: Refactor migration
On Wed, 2018-04-11 at 13:25 +0100, George Dunlap wrote: > The current sequence to initiate vcpu migration is inefficent and > error-prone: > > - The initiator sets VPF_migraging with the lock held, then drops the > lock and calls vcpu_sleep_nosync(), which immediately grabs the > lock > again > > - A number of places unnecessarily check for v->pause_flags in > between > those two > > - Every call to vcpu_migrate() must be prefaced with > vcpu_sleep_nosync() or introduce a race condition; this code > duplication is error-prone > It's not only error prone, it's the cause of actual bugs! :-P That's why I would also change the subject of the series into something like "fix races happening during vcpu migration" > - In the event that v->is_running is true at the beginning of > vcpu_migrate(), it's almost certain that vcpu_migrate() will end up > being called in context_switch() as well; we might as well simply > let it run there and save the duplicated effort (which will be > non-negligible). > > Instead, introduce vcpu_migrate_start() to initiate the process. > vcpu_migrate_start() is called with the scheduling lock held. It not > only sets VPF_migrating, but also calls vcpu_sleep_nosync_locked() > (which will automatically do nothing if there's nothing to do). > > Rename vcpu_migrate() to vcpu_migrate_finish(). Check for v- > >is_running and > pause_flags & VPF_migrating at the top and return if appropriate. > > Then the way to initiate migration is consistently: > > * Grab lock > * vcpu_migrate_start() > * Release lock > * vcpu_migrate_finish() > > Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxx> > --- a/xen/common/schedule.c > +++ b/xen/common/schedule.c > @@ -593,13 +593,33 @@ static void vcpu_move_nosched(struct vcpu *v, > unsigned int new_cpu) > sched_move_irqs(v); > } > > -static void vcpu_migrate(struct vcpu *v) > +/* > + * Initiating migration > + * > + * In order to migrate, we need the vcpu in question to have stopped > + * running and be taken off the runqueues; we also need to hold the > lock > + */ > Perhaps "and SCHED_OP(sleep) to have been called on it", instead of "and be taken off the runqueues", if we don't want to even mention runqueues in schedule.c. And there are a couple of instances of whitespace damaging. Nevertheless, Reviewed-by: Dario Faggioli <dfaggioli@xxxxxxxx> And I guess we can add a 'Tested-by: Olaf Hering', as he actually did that, what do you say Olaf? Regards, Dario -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Software Engineer @ SUSE https://www.suse.com/ Attachment:
signature.asc _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |