Re: [Xen-devel] [PATCH RFC] xen: if on Xen, "flatten" the scheduling domain hierarchy
On 09/23/2015 05:36 AM, Juergen Gross wrote:
> On 09/22/2015 06:22 PM, George Dunlap wrote:
>> On 09/22/2015 05:42 AM, Juergen Gross wrote:
>>> One other thing I just discovered: there are other consumers of the
>>> topology sibling masks (e.g. topology_sibling_cpumask()) as well.
>>>
>>> I think we would want to avoid any optimizations based on those in
>>> drivers as well, not only in the scheduler.
>>
>> I'm beginning to lose the thread of the discussion here a bit.
>>
>> Juergen / Dario, could one of you summarize your two approaches, and the
>> (alleged) advantages and disadvantages of each one?
>
> Okay, I'll have a try:
>
> The problem we want to solve:
> -----------------------------
>
> The Linux kernel is gathering cpu topology data during boot via the
> CPUID instruction on each processor coming online. This data is
> primarily used in the scheduler to decide to which cpu a thread should
> be migrated when this seems to be necessary. There are other users of
> the topology information in the kernel (e.g. some drivers try to do
> optimizations like core-specific queues/lists).
>
> When started in a virtualized environment the obtained data is next to
> useless or even wrong, as it is reflecting only the status of the time
> of booting the system. Scheduling of the (v)cpus done by the hypervisor
> is changing the topology beneath the feet of the Linux kernel without
> reflecting this in the gathered topology information. So any decisions
> taken based on that data will be clueless and possibly just wrong.
>
> The minimal solution is to change the topology data in the kernel in a
> way that all cpus are regarded as equal regarding their relation to each
> other (e.g. when migrating a thread to another cpu no cpu is preferred
> as a target).
>
> The topology information of the CPUID instruction is, however, even
> accessible from user mode and might be used for licensing purposes of
> any user program (e.g. by limiting the software to run on a specific
> number of cores or sockets). So just mangling the data returned by
> CPUID in the hypervisor seems not to be a general solution, while we
> might want to do it at least optionally in the future.
>
> In the future we might want to support either dynamic topology updates
> or be able to tell the kernel to use some of the topology data, e.g.
> when pinning vcpus.
>
>
> Solution 1 (Dario):
> -------------------
>
> Don't use the CPUID derived topology information in the Linux scheduler,
> but let it use a simple "flat" topology by setting own scheduler domain
> data under Xen.
>
> Advantages:
> + very clean solution regarding the scheduler interface
> + scheduler decisions are based on a minimal data set
> + small patch
>
> Disadvantages:
> - covers the scheduler only, drivers still use the "wrong" data
> - a little bit hacky regarding some NUMA architectures (needs either a
>   hook in the code dealing with that architecture or multiple scheduler
>   domain data overwrites)
> - future enhancements will make the solution less clean (either need
>   duplicating scheduler domain data or some new hooks in scheduler
>   domain interface)
>
>
> Solution 2 (Juergen):
> ---------------------
>
> When booted as a Xen guest modify the topology data built during boot
> resulting in the same simple "flat" topology as in Dario's solution.
>
> Advantages:
> + the simple topology is seen by all consumers of topology data as the
>   data itself is modified accordingly
> + small patch
> + future enhancements rather easy by selecting which data to modify
>
> Disadvantages:
> - interface to scheduler not as clean as in Dario's approach
> - scheduler decisions are based on multiple layers of topology data
>   where one layer would be enough to describe the topology
>
>
> Dario, are you okay with this summary?

Thanks -- that's very helpful.
 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
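
The "flat" topology that both solutions in the summary aim for can be illustrated with a toy model. This is a minimal sketch in Python, not kernel code: the four-vcpu layout, the level names (smt, package, remote), and every function name are assumptions made up for illustration, and neither patch is claimed to be implemented this way. It only shows the stated goal that, after flattening, no cpu is preferred over another as a migration target.

```python
from itertools import combinations

# Hypothetical CPUID-derived topology for 4 vcpus: cpu -> (package_id, core_id).
# The values are invented for illustration only.
BOOT_TOPOLOGY = {0: (0, 0), 1: (0, 0), 2: (0, 1), 3: (1, 0)}


def build_masks(topology):
    """Build per-cpu sibling masks from (package_id, core_id) pairs."""
    masks = {}
    for cpu, (pkg, core) in topology.items():
        # cpus sharing package and core are modelled as SMT siblings
        smt = {c for c, ids in topology.items() if ids == (pkg, core)}
        # cpus sharing only the package
        package = {c for c, ids in topology.items() if ids[0] == pkg}
        masks[cpu] = {"smt": smt, "package": package}
    return masks


def flatten(topology):
    """Flat variant: every cpu is alone at the SMT level and all cpus
    share a single package-level group."""
    all_cpus = set(topology)
    return {cpu: {"smt": {cpu}, "package": set(all_cpus)} for cpu in topology}


def relation(masks, a, b):
    """Return the closest level at which cpus a and b are siblings."""
    if b in masks[a]["smt"]:
        return "smt"
    if b in masks[a]["package"]:
        return "package"
    return "remote"


def pair_relations(masks):
    """Set of distinct relations over all pairs of different cpus."""
    return {relation(masks, a, b) for a, b in combinations(sorted(masks), 2)}
```

With the boot-time masks, different cpu pairs stand in different relations to each other, so a consumer could prefer one target over another. With the flattened masks, `pair_relations(flatten(BOOT_TOPOLOGY))` contains a single relation, which is the "all cpus are regarded as equal" property described in the summary.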