
Re: [Xen-devel] [PATCH] move domain to cpupool0 before destroying it



On Tue, May 20, 2014 at 5:28 AM, Juergen Gross
<juergen.gross@xxxxxxxxxxxxxx> wrote:
> On 19.05.2014 18:19, George Dunlap wrote:
>>
>> On Mon, May 19, 2014 at 4:34 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>>>
>>>>>> On 19.05.14 at 16:57, <dunlapg@xxxxxxxxx> wrote:
>>>>
>>>> On Thu, May 15, 2014 at 5:59 AM, Juergen Gross
>>>> <juergen.gross@xxxxxxxxxxxxxx> wrote:
>>>>>
>>>>> Currently when a domain is destroyed it is removed from the domain_list
>>>>> before all of its resources, including the cpupool membership, are
>>>>> freed. This can lead to a situation where the domain is still a member
>>>>> of a cpupool without for_each_domain_in_cpupool() (or even
>>>>> for_each_domain()) being able to find it any more. This in turn can
>>>>> result in rejection of removing the last cpu from a cpupool, because
>>>>> there still seems to be a domain in the cpupool, even though it can't
>>>>> be found by scanning through all domains.
>>>>>
>>>>> This situation can be avoided by moving the domain to be destroyed to
>>>>> cpupool0 first and then removing it from this cpupool BEFORE deleting
>>>>> it from the domain_list. As cpupool0 is always active and a domain
>>>>> without any cpupool membership is implicitly regarded as belonging to
>>>>> cpupool0, this poses no problem.
>>>>
>>>>
>>>> I'm a bit unclear why we're doing *both* a sched_move_domain(), *and*
>>>> moving the "cpupool_rm_domain()".
>>>>
>>>> The sched_move_domain() only happens in domain_kill(), which is only
>>>> initiated (AFAICT) by hypercall: does that mean that if a VM dies for
>>>> some other reason (e.g., crashes), you may still have the same race?
>>>> If not, then just this change alone should be sufficient.  If it does,
>>>> then this change is redundant.
>>>
>>>
>>> No, a crashed domain is merely reported as crashed to the tool stack.
>>> It's then up to the tool stack to actually invoke the killing of it
>>> (or else e.g. "on_crash=preserve" would be rather hard to handle).
>>
>>
>> Right, I see.
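
Concretely, the ordering the patch argues for in the toolstack-driven
destroy path amounts to something like the sketch below. This is a
hypothetical illustration, not the actual patch: sched_move_domain(),
cpupool_rm_domain() and cpupool0 are the existing hypervisor symbols
discussed in this thread, while example_domain_teardown() and
unlink_from_domain_list() are made-up stand-ins for the real
domain_kill()/domain_destroy() code (error handling omitted).

static void example_domain_teardown(struct domain *d)
{
    /* 1. Park the dying domain in cpupool0, which always exists. */
    sched_move_domain(d, cpupool0);

    /* 2. Drop the cpupool membership while the domain is still on
     *    domain_list, so for_each_domain_in_cpupool() and the pool's
     *    domain accounting never disagree. */
    cpupool_rm_domain(d);

    /* 3. Only now unlink the domain from domain_list. */
    unlink_from_domain_list(d);
}
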
>>
>>>
>>>> Moving the cpupool_rm_domain() will change things so that there is now
>>>> a period of time where the VM is not being listed as being in
>>>> cpupool0's pool, but may still be in that pool's scheduler's list of
>>>> domains.  Is that OK?  If it is OK, it seems like that change alone
>>>> should be sufficient.
>>>
>>>
>>> Moving this earlier was a requirement to avoid the race that the
>>> earlier (much different) patch tried to address. Also I think the
>>> patch's description already addresses that question (see the last
>>> sentence of the quoted original mail contents above).
>>
>>
>> But we're avoiding that race by instead moving the dying domain to
>> cpupool0, which is never going to disappear.
>>
>> Or, moving the domain to cpupool0 *won't* sufficiently solve the race,
>> and this will -- in which case, why are we bothering to move it to
>> cpupool0 at all?  Why not just remove it from the cpupool when
>> removing it from the domain list?  Wouldn't that also solve the
>> original problem?
>>
>> Regarding the last bit, "...a domain without any cpupool membership is
>> implicitly regarded as belonging to cpupool0...":
>>
>> 1. At a quick glance through the code, I couldn't find any evidence
>> that this was the case; I couldn't find an instance where d->cpupool
>> == NULL => assumed cpupool0.
>
>
> xen/common/schedule.c:
>
> #define DOM2OP(_d)    (((_d)->cpupool == NULL) ? &ops : ((_d)->cpupool->sched))
>
> together with:
>
> struct scheduler *scheduler_get_default(void)
> {
>     return &ops;
> }
>
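
A minimal sketch of how that fallback is exercised, assuming the
DOM2OP()/SCHED_OP() macros and the static default scheduler "ops"
private to xen/common/schedule.c; example_wake() itself is a made-up
caller, not an existing function:

static void example_wake(struct vcpu *v)
{
    /* d->cpupool == NULL resolves to &ops, i.e. cpupool0's scheduler. */
    struct scheduler *s = DOM2OP(v->domain);

    SCHED_OP(s, wake, v);
}

So a dying domain whose cpupool pointer has already been cleared is
still scheduled exactly as a cpupool0 member would be.
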
>>
>> 2. If in reality d->cpupool is never (or almost never) actually NULL,
>> then the "implicitly belongs to cpupool0" assumption will bitrot.
>> Having that kind of assumption without some way of making sure it's
>> maintained is a bug waiting to happen.
>
>
> That's not going to happen: this assumption is tested every time the idle
> domain is referenced by the scheduler...

Great.  That's what I needed to know.

In that case:

Acked-by: George Dunlap <george.dunlap@xxxxxxxxxxxxx>

Thanks for putting up with my skepticism. :-)

 -George
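
For reference, the failure mode that started this thread can be pictured
with the following sketch. It is hypothetical, not the actual Xen code:
it assumes the Xen-internal struct cpupool with its n_dom member count
and the for_each_domain_in_cpupool() iterator, which walks the global
domain_list; example_can_remove_last_cpu() is a made-up name.

static int example_can_remove_last_cpu(struct cpupool *c)
{
    struct domain *d;
    unsigned int reachable = 0;

    /* Count the pool's members still reachable via domain_list. */
    for_each_domain_in_cpupool ( d, c )
        reachable++;

    /*
     * Without the patch, a dying domain could already be unlinked from
     * domain_list while still counted in c->n_dom, so n_dom != 0 with
     * reachable == 0: the last cpu can never be removed, and nothing is
     * left to clear the stale membership.
     */
    return ( c->n_dom == 0 ) ? 0 : -EBUSY;
}
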

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

