[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/5] xen: credit2: never consider CPUs outside of our cpupool.



On 23/01/17 15:40, George Dunlap wrote:
> On Thu, Jan 19, 2017 at 8:08 AM, Juergen Gross <jgross@xxxxxxxx> wrote:
>> On 17/01/17 18:26, Dario Faggioli wrote:
>>> In fact, relying on the mask of what pCPUs belong to
>>> which Credit2 runqueue is not enough. If we only do that,
>>> when Credit2 is the boot scheduler, we may ASSERT() or
>>> panic when moving a pCPU from Pool-0 to another cpupool.
>>>
>>> This is because pCPUs outside of any pool are considered
>>> part of cpupool0. This puts us at risk of crash when those
>>> same pCPUs are added to another pool and something
>>> different than the idle domain is found to be running
>>> on them.
>>>
>>> Note that, even if we prevent the above to happen (which
>>> is the purpose of this patch), this is still pretty bad,
>>> in fact, when we remove a pCPU from Pool-0:
>>> - in Credit1, as we do *not* update prv->ncpus and
>>>   prv->credit, which means we're considering the wrong
>>>   total credits when doing accounting;
>>> - in Credit2, the pCPU remains part of one runqueue,
>>>   and is hence at least considered during load balancing,
>>>   even if no vCPU should really run there.
>>>
>>> In Credit1, this "only" causes skewed accounting and
>>> no crashes because there is a lot of `cpumask_and`ing
>>> going on with the cpumask of the domains' cpupool
>>> (which, BTW, comes at a price).
>>>
>>> A quick and not to involved (and easily backportable)
>>> solution for Credit2, is to do exactly the same.
>>>
>>> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx
>>> ---
>>> Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
>>> Cc: Juergen Gross <jgross@xxxxxxxx>
>>> Cc: Jan Beulich <jbeulich@xxxxxxxx>
>>> ---
>>> This is a bugfix, and should be backported to 4.8.
>>> ---
>>> The proper solution would mean calling deinit_pdata() when removing a pCPU 
>>> from
>>> cpupool0, as well as a bit more of code reshuffling.
>>>
>>> And, although worth doing, it certainly will take more work, more time, and
>>> will probably be hard (well, surely harder than this) to backport.
>>>
>>> Therefore, I'd argue in favor of both taking and backporting this change, 
>>> which
>>> at least enables using Credit2 as default scheduler without risking a crash
>>> when creating a second cpupool.
>>>
>>> Afterwards, a proper solution would be proposed for Xen 4.9.
>>>
>>> Finally, given the wide number of issues similar to this that I've found and
>>> fixed in the last release cycle, I think it would be good to take a stab at
>>> whether the interface between cpupools and the schedulers could not be
>>> simplified. :-O
>>
>> The first patch version for support of cpupools I sent used an own
>> scheduler instance for the free cpus. Keir merged this instance with
>> the one for Pool-0.
> 
> You mean he just made the change unilaterally without posting it to
> the list?  I just went back and skimmed through the old cpupools
> threads and didn't see this discussed.

I'm not sure I remember everything correctly. Unfortunately I don't
have a copy of all emails back from then as this work was done for
another employer.

Looking into the archives it seems the switch was done already in the
final version I sent to the list. I believe Keir did some testing based
on my first series and suggested some modifications which I happily
accepted.

> Having a "cpupool-remove" operation that doesn't actually remove the
> cpu from the pool is a bit mad...

Logically it does remove the cpu from Pool-0. It is just that there is
no other scheduler entity involved for doing so.


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.