Re: [Xen-devel] [PATCH] cpupools: retry cpupool-destroy if domain in cpupool is dying
At 10:50 +0100 on 14 May (1400061034), George Dunlap wrote:
> On Wed, May 14, 2014 at 10:48 AM, George Dunlap
> <George.Dunlap@xxxxxxxxxxxxx> wrote:
> > On Wed, May 14, 2014 at 10:16 AM, George Dunlap
> > <George.Dunlap@xxxxxxxxxxxxx> wrote:
> >> On Mon, May 12, 2014 at 12:49 PM, Juergen Gross
> >> <juergen.gross@xxxxxxxxxxxxxx> wrote:
> >>> When a cpupool is destroyed just after the last domain has been stopped
> >>> the domain might already be removed from the domain list without being
> >>> removed from the cpupool.
> >>> It is easy to detect this situation and to return EAGAIN in this case
> >>> which is already handled in libxc by doing a retry.
> >>
> >> OK, I hate to be picky over two lines, but it still seems to me like
> >> this is papering over issues instead of dealing with them properly.
> >> The real problem here is that "for_each_domain_in_cpupool()" doesn't
> >> actually go over every domain in the cpupool. Instead of making it so
> >> that it actually does, you're compensating for that fact in an ad-hoc
> >> fashion.
> >>
> >> Now as it happens, it looks like all the other current uses of
> >> for_each_domain_in_cpupool() work just fine if there are domains in
> >> the pool it doesn't see, as long as they're about to disappear. But
> >> we've already seen a bug caused because of a situation where "don't
> >> see domains that are about to disappear" *does* actually cause a
> >> problem; working around it is just setting a trap for future
> >> developers to fall into. (And who knows, there may already be a bug
> >> we haven't discovered in the other invocations of
> >> for_each_domain_in_cpupool()).
> >
> > Really this seems like a race in our rcu implementation wrt the domain
> > list. It seems like ideally, if you grab the domlist_read_lock, you
> > should either get the domain on the list, or the domain off the list
> > *and* complete_domain_destroy() completed...

I don't think that's something that can be done with RCU.
The guarantee you get as a reader is that if you _do_ see the domain on
the list, complete_domain_destroy() _hasn't_ been called (and in
particular the domain struct hasn't been freed).

To guarantee that any domain you _don't_ see _has_ been destroyed would
need a full mutex that the caller of complete_domain_destroy() could
hold to exclude you.

Tim.
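
To make the mutex alternative concrete, here is a minimal user-space
sketch in C with pthreads. It is not Xen code: struct toy_domain,
struct toy_pool and every toy_* function are hypothetical names made up
for illustration, and the "teardown" is only a flag. The assumption it
illustrates is that unlinking a domain and completing its destruction
happen inside one critical section, so a reader holding the same lock
can never observe a domain that is off the list but not yet destroyed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a domain; not Xen's struct domain. */
struct toy_domain {
    int id;
    bool on_list;    /* still linked into the pool's list */
    bool destroyed;  /* teardown has fully completed */
    struct toy_domain *next;
};

/* Hypothetical pool: a singly linked list guarded by a full mutex. */
struct toy_pool {
    pthread_mutex_t lock;
    struct toy_domain *head;
};

void toy_pool_init(struct toy_pool *p)
{
    pthread_mutex_init(&p->lock, NULL);
    p->head = NULL;
}

void toy_pool_add(struct toy_pool *p, struct toy_domain *d)
{
    pthread_mutex_lock(&p->lock);
    d->next = p->head;
    p->head = d;
    d->on_list = true;
    d->destroyed = false;
    pthread_mutex_unlock(&p->lock);
}

/*
 * Unlink the domain and finish its teardown while holding the lock.
 * Anyone who takes the lock afterwards and does not find the domain
 * on the list knows that destruction has already completed.
 */
void toy_domain_destroy(struct toy_pool *p, struct toy_domain *d)
{
    assert(p != NULL && d != NULL);
    pthread_mutex_lock(&p->lock);
    struct toy_domain **pp = &p->head;
    while (*pp != NULL && *pp != d)
        pp = &(*pp)->next;
    if (*pp == d) {
        *pp = d->next;
        d->next = NULL;
        d->on_list = false;
    }
    d->destroyed = true;  /* stands in for the real teardown work */
    pthread_mutex_unlock(&p->lock);
}

/* Reader side: under the lock, an empty list means nothing is pending. */
bool toy_pool_is_empty(struct toy_pool *p)
{
    pthread_mutex_lock(&p->lock);
    bool empty = (p->head == NULL);
    pthread_mutex_unlock(&p->lock);
    return empty;
}

/* Under the lock, exactly one of "on the list" or "destroyed" holds. */
bool toy_domain_state_consistent(struct toy_pool *p,
                                 const struct toy_domain *d)
{
    pthread_mutex_lock(&p->lock);
    bool ok = (d->on_list != d->destroyed);
    pthread_mutex_unlock(&p->lock);
    return ok;
}
```

The cost of this design is that every reader blocks for the whole
duration of a teardown, which is exactly the serialisation that RCU
readers avoid.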