[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] cpupools: retry cpupool-destroy if domain in cpupool is dying
On 05/14/2014 11:15 AM, Jan Beulich wrote: On 14.05.14 at 11:56, <juergen.gross@xxxxxxxxxxxxxx> wrote:On 14.05.2014 11:16, George Dunlap wrote:On Mon, May 12, 2014 at 12:49 PM, Juergen Gross <juergen.gross@xxxxxxxxxxxxxx> wrote:When a cpupool is destroyed just after the last domain has been stopped the domain might already be removed from the domain list without being removed from the cpupool. It is easy to detect this situation and to return EAGAIN in this case which is already handled in libxc by doing a retry.OK, I hate to be picky over two lines, but it still seems to me like this is papering over issues instead of dealing with them properly. The real problem here is that "for_each_domain_in_cpupool()" doesn't actually go over every domain in the cpupool. Instead of making it so that it actually does, you're compensating for that fact in an ad-hoc fashion. Now as it happens, it looks like all the other current uses of for_each_domain_in_cpupool() work just fine if there are domains in the pool it doesn't see, as long as they're about to disappear. But we've already seen a bug caused because of a situation where "don't see domains that are about to disappear" *does* actually cause a problem; working around it is just setting a trap for future developers to fall into. (And who knows, there may already be a bug we haven't discovered in the other invocations of for_each_domain_in_cpupool()).This isn't unique to for_each_domain_in_cpupool(). It is a problem for all uses of for_each_domain() which are related to resources freed only in complete_domain_destroy().Of which there shouldn't be that many, if any at all. What prevents cpupool_rm_domain() getting moved from complete_domain_destroy() to domain_destroy(), before the domain gets taken off the list? I actually assume that there are more things here that may not really need deferring until the last possible moment... Well the domain needs to be taken out of the scheduler for one, which is done by sched_destroy_domain(). That's where I stopped the "dependency tracking", as it seemed somewhat likely that somewhere there would be things that depended on having a valid scheduler. (That may not be correct, but it would take a careful reading to convince yourself it was true.) -George _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |