[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] move domain to cpupool0 before destroying it
>>> On 19.05.14 at 16:57, <dunlapg@xxxxxxxxx> wrote: > On Thu, May 15, 2014 at 5:59 AM, Juergen Gross > <juergen.gross@xxxxxxxxxxxxxx> wrote: >> Currently when a domain is destroyed it is removed from the domain_list >> before all of it's resources, including the cpupool membership, are freed. >> This can lead to a situation where the domain is still member of a cpupool >> without for_each_domain_in_cpupool() (or even for_each_domain()) being >> able to find it any more. This in turn can result in rejection of removing >> the last cpu from a cpupool, because there seems to be still a domain in >> the cpupool, even if it can't be found by scanning through all domains. >> >> This situation can be avoided by moving the domain to be destroyed to >> cpupool0 first and then remove it from this cpupool BEFORE deleting it from >> the domain_list. As cpupool0 is always active and a domain without any > cpupool >> membership is implicitly regarded as belonging to cpupool0, this poses no >> problem. > > I'm a bit unclear why we're doing *both* a sched_move_domain(), *and* > moving the "cpupool_rm_domain()". > > The sched_move_domain() only happens in domain_kill(), which is only > initiated (AFAICT) by hypercall: does that mean if a VM dies for some > other reason (i.e., crashes), that you may still have the same race? > If not, then just this change alone should be sufficent. If it does, > then this change is redundant. No, a crashed domain is merely being reported as crashed to the tool stack. It's the tool stack to then actually invoke the killing of it (or else e.g. "on_crash=preserve" would be rather hard to handle). > Moving the cpupool_rm_domain() will change things so that there is now > a period of time where the VM is not being listed as being in > cpupool0's pool, but may still be in that pool's scheduler's list of > domains. Is that OK? If it is OK, it seems like that change alone > should be sufficient. Moving this earlier was a requirement to avoid the race that the earlier (much different) patch tried to address. Also I think the patch's description already addresses that question (see the last sentence of the quoted original mail contents above). > I've been trying to trace through the twisty little passages of domain > destruction, and I'm still not quite sure: would it be OK if we just > called sched_move_domain() in domain_destroy() instead of calling > cpupool_rm_domain()? No, it would not, because then again we wouldn't be able to deal with potential failure, needing re-invocation of the function. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |