
Re: [Xen-devel] [PATCH] xen/sched: fix credit2 smt idle handling



On Fri, 2019-03-22 at 10:04 +0100, Juergen Gross wrote:
> Credit2's smt_idle_mask_set() and smt_idle_mask_clear() are used to
> identify idle cores where vcpus can be moved to. A core is thought to
> be idle when all siblings are known to have the idle vcpu running on
> them.
> 
> Unfortunately, the information about which vcpu is running on a cpu is
> kept per runqueue. So in case not all siblings are in the same runqueue,
> a core will never be regarded as idle, as the sibling not in the
> runqueue is never known to be running the idle vcpu.
> 
Good catch.

I apparently forgot to take care of this when I introduced the
possibility of having one runqueue per CPU (which, in an SMT enabled
system, means per-thread runqueues).
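
Just to spell the problem out, the check is basically this (a rough
sketch from memory, not the exact code; 'idlers' being the runqueue's
idle mask):

static inline
void smt_idle_mask_set(unsigned int cpu, const cpumask_t *idlers,
                       cpumask_t *mask)
{
    const cpumask_t *sibs = per_cpu(cpu_sibling_mask, cpu);

    /*
     * idlers only ever contains CPUs that belong to this runqueue.
     * A sibling sitting in another runqueue is therefore never in it,
     * so this subset test can never succeed for that core.
     */
    if ( cpumask_subset(sibs, idlers) )
        cpumask_or(mask, mask, sibs);
}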

> This problem can be solved by and-ing the core's sibling cpumask with
> the runqueue's active mask before doing the idle test.
> 
Right. There's one thing, though. Using one runq per CPU, in this
scheduler, is a really poor choice, and I would basically recommend it
only for testing or debugging (and this should probably be highlighted
a lot better in the docs).

Therefore, I'm a bit reluctant to add another cpumask bitwise
operation, in hot paths, just to take care of it.
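
To be concrete, the check would become something like this (again just
a sketch, same 'idlers' and 'mask' as above; 'active' being the
runqueue's active mask, and assuming the per-CPU scratch mask can be
used as a temporary, so that nothing extra needs allocating):

    cpumask_t *sibs = cpumask_scratch_cpu(cpu);

    /* Restrict the siblings to the ones actually in this runqueue... */
    cpumask_and(sibs, per_cpu(cpu_sibling_mask, cpu), active);

    /* ...and only then check whether they are all idle. */
    if ( cpumask_subset(sibs, idlers) )
        cpumask_or(mask, mask, sibs);

It's that extra cpumask_and(), on every idle transition, that I'd
rather not pay just for the credit2_runqueue=cpu case.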

Note that this also applies to cpupools, i.e., I also consider it a
very poor choice to put two sibling hyperthreads in different pools. If
you recall, I even sent a patch to forbid doing that (which is still
blocked on a series of yours for passing parameters from the tools to
the hypervisor).

The only case I care about is a CPU being off-lined.

So, one thing that we could do is to put credit2_runqueue=cpu inside
such #ifdef-s too (and I can prepare a patch to that effect myself, if
you want).

To properly deal with offline CPUs, I think we can change the logic a
little, i.e., we initialize the smt_idle mask to all-1 (all CPUs idle),
and we also make sure that we set the CPU bit (instead of clearing it)
in smt_idle when we remove the CPU from the scheduler.
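
IOW, something along these lines (only a sketch of what I mean, with
rqd->active and rqd->smt_idle standing for the runqueue's masks; the
details would still need working out):

    /* At runqueue initialisation: start from "every CPU is idle". */
    cpumask_setall(&rqd->smt_idle);

    /* When a CPU is removed from the scheduler (e.g. off-lined)... */
    cpumask_clear_cpu(cpu, &rqd->active);
    /* ...put its bit back in smt_idle, rather than clearing it, so the
     * remaining siblings can still form a fully idle core. */
    cpumask_set_cpu(cpu, &rqd->smt_idle);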

Does doing these things (negatively) affect the other patch series
you're working on?

> In order to avoid allocating another cpumask, the interfaces of
> smt_idle_mask_set() and smt_idle_mask_clear() are modified to not take
> a mask as input, but the runqueue data pointer, as those functions are
> always called with the same masks as parameters.
> 
The interface change, I'm happy with it.
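
IIUC, that means going from something like:

  void smt_idle_mask_set(unsigned int cpu, const cpumask_t *idlers,
                         cpumask_t *mask);
  void smt_idle_mask_clear(unsigned int cpu, cpumask_t *mask);

to something like:

  void smt_idle_mask_set(unsigned int cpu,
                         struct csched2_runqueue_data *rqd);
  void smt_idle_mask_clear(unsigned int cpu,
                           struct csched2_runqueue_data *rqd);

with the helpers picking up the idle, active and smt_idle masks from
rqd themselves (exact names and signatures from memory, so take them
with a grain of salt).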

Oh, and about the other patches you sent: as you know, I've been
travelling this week, so I couldn't look at them.

I will let you have my comments ASAP next week.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/



 

