Re: [Xen-devel] schedulers and topology exposing questions

On Fri, 2016-01-22 at 11:54 -0500, Elena Ufimtseva wrote:
> Hello all!

> Let me give some introduction to our findings. I may forget something or
> not be explicit enough; if so, please ask me.
> A customer filed a bug where some of the applications were running
> slow in their HVM DomU setups.
> These running times were compared against baremetal running same
> kernel version as HVM DomU.
> After some investigation by different parties, a test case scenario
> was found where the problem was easily seen. The test app is a UDP
> server/client pair where the client passes some message n times.
> The test case was executed on baremetal and Xen DomU with kernel
> version 2.6.39.
> Bare metal showed a 2x better result than DomU.
> Konrad came up with a workaround that was setting the flag for domain
> scheduler in Linux.
> As the guest is not aware of SMT-related topology, it has a flat
> topology initialized.
> Kernel has domain scheduler flags for scheduling domain CPU set to
> 4143 for 2.6.39.
> Konrad discovered that changing the flag for CPU sched domain to 4655
> works as a workaround and makes Linux think that the topology has SMT
> threads.
> This workaround makes the test complete in almost the same time as on
> baremetal (or insignificantly worse).
> This workaround is not suitable for kernels of higher versions as we
> discovered.
> The hackish way of making domU linux think that it has SMT threads
> (along with matching cpuid)
> made us think that the problem comes from the fact that the CPU topology
> is not exposed to the guest, so the Linux scheduler cannot make
> intelligent scheduling decisions.
So, me and Juergen (from SuSE) have been working on this for a while.

As far as my experiments go, there are at least two different issues,
both traceable to Linux's scheduler behavior. One has to do with what
you just said, i.e., topology.

Juergen has developed a set of patches, and I'm running benchmarks with
them applied to both Dom0 and DomU, to see how they work.

I'm not far from finishing running a set of 324 different test cases
(each one run both without and with Juergen's patches). I am running
different benchmarks, such as:
 - iperf,
 - a Xen build,
 - sysbench --oltp,
 - sysbench --cpu,
 - unixbench

and I'm also varying how loaded the host is, how big the VMs are, and
how loaded the VMs are.

324 is the result of various combinations of the above... It's quite an
extensive set! :-P

As soon as everything finishes running, I'll data mine the results, and
let you know what they look like.

The other issue that I've observed is that tweaking some _non_ topology
related scheduling domains' flags also impacts performance, sometimes
quite noticeably.

I already have the results of running the 324 test cases described above
with flags set to 4131 inside all the DomUs. That value was
chosen after quite a bit of preliminary benchmarking and investigation
as well.
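
To make the flag values in this thread (4143, 4655, 4131) easier to compare,
here is a minimal Python sketch that treats them as bitmasks and reports which
bits differ. The bit arithmetic is exact. The flag names are an assumption:
they are recalled from include/linux/sched.h in Linux 2.6.39, and other kernel
versions may assign these bits differently, so for 4131 (kernel version not
stated here) read the names as indicative only.

```python
# ASSUMPTION: names recalled from include/linux/sched.h in Linux 2.6.39.
# Other kernel versions may number these bits differently.
SD_FLAGS_2_6_39 = {
    0x0001: "SD_LOAD_BALANCE",
    0x0002: "SD_BALANCE_NEWIDLE",
    0x0004: "SD_BALANCE_EXEC",
    0x0008: "SD_BALANCE_FORK",
    0x0010: "SD_BALANCE_WAKE",
    0x0020: "SD_WAKE_AFFINE",
    0x0040: "SD_PREFER_LOCAL",
    0x0080: "SD_SHARE_CPUPOWER",
    0x0100: "SD_POWERSAVINGS_BALANCE",
    0x0200: "SD_SHARE_PKG_RESOURCES",
    0x0400: "SD_SERIALIZE",
    0x0800: "SD_ASYM_PACKING",
    0x1000: "SD_PREFER_SIBLING",
}


def set_bits(value):
    """Return the individual bit masks set in value, lowest first."""
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]


def diff(old, new):
    """Return (added, removed) bit masks when going from old to new."""
    return set_bits(new & ~old), set_bits(old & ~new)


def names(bits):
    """Map bit masks to the assumed names, falling back to hex."""
    return [SD_FLAGS_2_6_39.get(b, hex(b)) for b in bits]


if __name__ == "__main__":
    for old, new in [(4143, 4655), (4143, 4131)]:
        added, removed = diff(old, new)
        print(f"{old} -> {new}: added {names(added)}, removed {names(removed)}")
```

Under that naming assumption, 4655 differs from 4143 only by one extra bit
(0x200), while 4131 differs from 4143 by clearing two bits (0x4 and 0x8).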

I'll share the results of that data set as well as soon as I manage to
extract them from the raw output.

> Joao Martins from Oracle developed a set of patches that fixed the
> smt/core/cache topology numbering and provided matching pinning of
> vcpus and enabling options, which allows exposing the correct topology
> to the guest.
> I guess Joao will be posting it at some point.
That is one way of approaching the topology issue. The other, which is
what me and Juergen are pursuing, is the opposite one, i.e., make the
DomU (and Dom0, actually) think that the topology is always completely
flat.

I think, ideally, we want both: flat topology as the default, if no
pinning is specified. Matching topology if it is.

> With these patches we decided to test the performance impact on
> different kernel versions and Xen versions.
That is really interesting, and thanks a lot for sharing it with us.

I'm in the middle of something here, so I just wanted to quickly let
you know that we're also working on something related... I'll have a
look at the rest of the email and at the graphs ASAP.

Thanks again and Regards,
<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

