[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list
On Tue, 2016-05-24 at 10:07 +0000, Wu, Feng wrote: > > See, for instance, cpu_disable_scheduler() in schedule.c. What we > > do is > > go over all the vcpus of all domains of either the system or the > > cpupool, and force the ones that we found with v->processor set to > > the > > pCPU that is going down, to perform migration (system_state will be > > different than SYS_STATE_suspend, and we hence take the 'else' > > branch). > > > > Considering that the pCPU is no longer part of the relevant > > bitmask-s > > during the migration, the vCPUs will figure out that they just > > can't > > stay there, and move somewhere else. > > Thanks a lot for the elaboration, it is really helpful. > NP :-) > > Note, however, that this happens for running and runnable vCPUs. > > I don't quite understand this, do you mean cpu_disable_scheduler() > only handle running and runnable vCPUs, I tried to find some hints > from the code, but I didn't get it. Could you please give some more > information about this? > It goes through all the vcpus of all domains, and does not check or care whether they are running, runnable or blocked. Let's look at this in some more details. So, let's assume that processor 5 is going away, and that you have the following vcpus around: d0v0 : v->processor = 5, running on cpu 5 d0v1 : v->processor = 4, running on cpu 4 d1v0 : v->processor = 5, runnable but not running d2v3 : v->processor = 5, blocked for d0v0, we do: cpu_disable_scheduler(5) set_bit(_VPF_migrating, d0v0->pause_flags); vcpu_sleep_nosync(d0v0); SCHED_OP(sleep, d0v0); csched_vcpu_sleep(d0v0) cpu_raise_softirq(5, SCHEDULE_SOFTIRQ); vcpu_migrate(d0v0); if ( v->is_running || ...) // assume v->is_running is true return ... ... <--- scheduling occurs on processor 5 ... context_saved(d0v0) vcpu_migrate(d0v0); //is_running is 0, so _VPF_migrating gets cleared vcpu_move_locked(d0v0, new_cpu); vcpu_wake(d0v0); SCHED_OP(wake, d0v0); csched_vcpu_wake(d0v0) __runq_insert(d0v0); __runq_tickle(d0v0); for d0v1, we do: cpu_disable_scheduler(5) if ( d0v1->processor != 5 ) continue for d1v0, we do: cpu_disable_scheduler(5) set_bit(_VPF_migrating, d1v0->pause_flags); vcpu_sleep_nosync(d1v0); SCHED_OP(sleep, d1v0); csched_vcpu_sleep(d1v0) __runq_remove(d1v0); vcpu_migrate(d1v0); if ( d1v0->is_running || !test_and_clear_bit(_VPF_migrating, d1v0->pause_flags) // false, but clears the _VPF_migrating flag vcpu_move_locked(d1v0, new_cpu); vcpu_wake(v); SCHED_OP(wake, d1v0); csched_vcpu_wake(d1v0) __runq_insert(d1v0); __runq_tickle(d1v0); for d2v3, we do: cpu_disable_scheduler(5) set_bit(_VPF_migrating, d2v3- >pause_flags); vcpu_sleep_nosync(d2v3); SCHED_OP(sleep, d2v3); csched_vcpu_sleep(d2v3) [1] // Nothing! vcpu_migrate(d2v3); if ( d2v3->is_running || !test_and_clear_bit(_VPF_migrating, d2v3->pause_flags) // false, but clears the _VPF_migrating flag [*] vcpu_move_locked(d2v3, new_cpu); vcpu_wake(d2v3); [2] // Nothing! > > If a > > vCPU is blocker, there is nothing to do, and in fact, nothing > > happens > > (as vcpu_sleep_nosync() and vcpu_wake() are NOP in that case). > > What do you mean by saying ' vcpu_sleep_nosync() and vcpu_wake() > are NOP '? > See [1] and [2] above. > > For > > those vCPUs, as soon as they wake up, they'll figure out that their > > own > > v->processor is not there any longer, and will move somewhere else. > > So basically, when vCPU is blocking, it has no impact to the blocking > vcpu > when 'v->processor' is removed. When the vCPU is waken up, it will > find > another pCPU to run, since the original 'v->processor' is down and no > longer in the cpu bitmask, right? > Yes, that was my point. _However_, as you can see at [*] above, it must be noted that even those vcpus that blocked while running on a certain processor (5 in the example), indeed have a chance to have their v->processor changed to something that is still online (something different than 5), as a consequence of that processor going away. Whether this is useful/enough for you, I can't tell right now, out of the top of my head. > > > > But this is not an issue in non pCPU hotplug scenario. > > > > > > It's probably an issue even if you remove a cpu from a cpupool (and > > even a more "interesting" one, if you also manage to add it to > > another > > pool, in the meantime) isn't it? > > Yes, things become more complex in that case .... > Well, but can you confirm that we also have an issue there, and test and report what happens if you move a cpu from pool A to pool B, while it still has vcpus from a domain that stays in pool A. If there's transient suboptimal behavior, well, we probably can live with that (depending on the specific characteristics of the transitory, I'd say). If things crash, we certainly want a fix! Regards, Dario -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) Attachment:
signature.asc _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |