
Re: [Xen-devel] [PATCH 3/6] xen: credit1: increase efficiency and scalability of load balancing.



On Thu, 2017-03-02 at 11:06 +0000, Andrew Cooper wrote:
> On 02/03/17 10:38, Dario Faggioli wrote:
> > 
> > To mitigate this, we introduce here the concept of
> > overloaded runqueues, and a cpumask in which we record
> > which pCPUs are in such a state.
> > 
> > An overloaded runqueue has at least 2 runnable vCPUs
> > (plus the idle one, which is always there). Typically,
> > this means 1 vCPU is running, and 1 is sitting in the
> > runqueue, and can hence be stolen.
> > 
> > Then, in csched_balance_load(), it is enough to go
> > over the overloaded pCPUs, rather than over all the
> > non-idle pCPUs, which is better.
> > 
> Malcolm’s solution to this problem is
> https://github.com/xenserver/xen-4.7.pg/commit/0f830b9f229fa6472accc9630ad16cfa42258966
> This has
> been in 2 releases of XenServer now, and has a very visible
> improvement for aggregate multi-queue multi-vm intrahost network
> performance (although I can't find the numbers right now).
> 
> The root of the performance problems is that pcpu_schedule_trylock()
> is expensive even for the local case, while cross-cpu locking is much
> worse.  Locking every single pcpu in turn is terribly expensive, in
> terms of hot cacheline pingpong, and the lock is frequently
> contended.
> 
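Just to make the quoted changelog a bit more concrete, the idea boils
down to something like the following. This is only a simplified
sketch, not the actual patch: names like nr_runnable and
prv->overloaded are illustrative (and note the counter being bumped
inside __runq_insert(), which is exactly the part I come back to
below):

  /* Called with the runqueue lock of svc's pCPU held. */
  static inline void
  __runq_insert(struct csched_vcpu *svc)
  {
      unsigned int cpu = svc->vcpu->processor;
      struct csched_pcpu *spc = CSCHED_PCPU(cpu);

      list_add_tail(&svc->runq_elem, RUNQ(cpu));

      /*
       * 2 runnable vCPUs (the idle one does not count) means one is
       * running and at least one is waiting in the runqueue, i.e.,
       * there is work that other pCPUs could steal. Flag this pCPU
       * in a scheduler-wide "overloaded" mask (prv being the global
       * struct csched_private, assumed reachable from here).
       */
      if ( ++spc->nr_runnable >= 2 )
          cpumask_set_cpu(cpu, prv->overloaded);
  }

  /* Then, in the load balancing path, only the overloaded pCPUs get
   * scanned, instead of all the non-idle ones: */
  for_each_cpu ( peer_cpu, prv->overloaded )
  {
      /* try-lock peer_cpu's runqueue and attempt to steal from it */
  }
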
BTW, both my patch in this series and the patch linked above are
_wrong_ in using __runq_insert() and __runq_remove() for counting the
runnable vCPUs.

In fact, in Credit1, during the main scheduling function
(csched_schedule()), we call the runqueue insert helper to temporarily
put the running vCPU back in the runqueue. This increments the
counter, making all the other pCPUs think that there is a vCPU
available for stealing in there, while in fact:
1) that may not be true, if we end up choosing to run the same vCPU again;
2) even if it is true, they'll always fail on the trylock, at least
until we're out of csched_schedule(), as it holds the runqueue lock
itself.
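
To show what I mean, the relevant part of csched_schedule() looks,
very much simplified, like this (a sketch, not the exact code):

  /*
   * Inside csched_schedule(); the runqueue lock is held for the
   * whole duration of the function.
   */
  if ( vcpu_runnable(current) )
      /*
       * Put the running vCPU back in the runqueue, so that it is
       * considered, together with the others, when picking who runs
       * next. If the "runnable" counter is bumped here, remote pCPUs
       * now believe there is something to steal...
       */
      __runq_insert(scurr);
  else
      BUG_ON( is_idle_vcpu(current) );

  /*
   * ...but what we pick may well be scurr itself again and, in any
   * case, a remote pcpu_schedule_trylock() on this runqueue can only
   * fail until we are out of here.
   */
  snext = __runq_elem(runq->next);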

So, yeah, it's not really a matter of correctness, but it means there
is more overhead that can be cut.

In v2 of this series, which I'm about to send, I've "fixed" this
(i.e., I'm only modifying the counter when it is really necessary).
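
Concretely, the shape of the v2 change is something like this
(sketched with hypothetical helper names, the point being that the
counting is decoupled from __runq_insert()/__runq_remove()):

  static inline void inc_nr_runnable(unsigned int cpu)
  {
      CSCHED_PCPU(cpu)->nr_runnable++;
  }

  static inline void dec_nr_runnable(unsigned int cpu)
  {
      ASSERT(CSCHED_PCPU(cpu)->nr_runnable >= 1);
      CSCHED_PCPU(cpu)->nr_runnable--;
  }

These get called from the paths where a vCPU really becomes runnable
on, or really leaves, a pCPU (wakeup, sleep, migration), and not from
the temporary re-insertion done inside csched_schedule(), so remote
pCPUs no longer see the running vCPU as stealable work.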

> As a first opinion of this patch, you are adding another cpumask
> which is going to play hot cacheline pingpong.
> 
Yeah, well, despite liking the cpumask-based approach, I agree it's
overkill in this case. In v2, I got rid of it, and I am doing
something even closer to Malcolm's patch above.
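
I.e., roughly (again, just a sketch of the shape of it, with 'workers'
standing for the set of candidate peer pCPUs):

  /* In the stealing loop: don't even touch a peer's runqueue lock
   * unless it has something that can actually be stolen. */
  for_each_cpu ( peer_cpu, &workers )
  {
      if ( CSCHED_PCPU(peer_cpu)->nr_runnable <= 1 )
          continue;              /* just the running vCPU there: skip */

      lock = pcpu_schedule_trylock(peer_cpu);
      if ( !lock )
          continue;              /* contended: don't spin, move on    */

      speer = csched_runq_steal(peer_cpu, cpu, snext->pri, balance_step);

      pcpu_schedule_unlock(lock, peer_cpu);

      if ( speer != NULL )
          break;
  }

This keeps the cheap check (a per-pCPU counter read) outside the lock,
so the expensive cross-CPU locking only happens when there is a
reasonable chance of actually finding a vCPU to steal.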

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



 

