[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] [PATCH 1/2] xen: fix a (latent) cpupool-related race during domain destroy
So, during domain destruction, we do: cpupool_rm_domain() [ in domain_destroy() ] sched_destroy_domain() [ in complete_domain_destroy() ] Therefore, there's a window during which, from the scheduler's point of view, a domain is still there, but without it being part of any cpupool. In fact, cpupool_rm_domain() does d->cpupool=NULL, and we don't allow anything like that to hold, for any domain with the only exception of the idle one. And if we stuble upon such a domain, there are ASSERT()s and BUG_ON()s that do trigger. This never happens, right now, but only because none of the functions containing one of those sanity checks are called during the above described window. However, Credit2 goes (during load balancing) through the list of domains assigned to a certain runqueue, and not only the ones that are running or runnable. If one of those domains had cpupool_rm_domain() called upon itself already, and we call one of those functions which checks for d->cpupool!=NULL... Boom! For example, while doing Credit2 development, calling something similar to __vcpu_has_soft_affinity() from balance_load(), makes `xl shutdown <domid>' reliably crash (this is how I discovered this). On the other hand, cpupool_rm_domain() "only" does cpupool related bookkeeping, and there's no harm postponing it a little bit. Finally, considering that, during domain initialization, we do: cpupool_add_domain() sched_init_domain() It looks like it makes much more sense for the domain destroy path to look like the opposite of it, i.e.: sched_destroy_domain() cpupool_rm_domain() This patch does that, and it's considered worth, as it fixes a bug, even if only a latent one. Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx> --- Cc: Juergen Gross <jgross@xxxxxxxx> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx> Cc: Jan Beulich <jbeulich@xxxxxxxx> --- xen/common/domain.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/xen/common/domain.c b/xen/common/domain.c index 42c07ee..f8096d3 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -811,6 +811,8 @@ static void complete_domain_destroy(struct rcu_head *head) destroy_waitqueue_vcpu(v); } + cpupool_rm_domain(d); + grant_table_destroy(d); arch_domain_destroy(d); @@ -868,8 +870,6 @@ void domain_destroy(struct domain *d) TRACE_1D(TRC_DOM0_DOM_REM, d->domain_id); - cpupool_rm_domain(d); - /* Delete from task list and task hashtable. */ spin_lock(&domlist_update_lock); pd = &domain_list; _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |