[Xen-devel] [Hackathon 2016] Scheduling session minutes
Hi everyone,

I still owe this list the minutes of the session we had at the hackathon about scheduling. It was a round table, with status updates being given on ongoing activities as well as a few ideas for future work/improvements being tossed around, so nothing too structured, but here we go.

I did not write down who was there, so I tried to Cc everyone that I mentioned, or that I think would be interested; sorry if I missed anyone.

Credit2
=======

We have been saying for some time that we want it to no longer be experimental and, eventually, to become the default. We are now close, but there are still gaps to be closed, such as:
 - soft-affinity is still missing [patches on list from a student, Dario working on refreshing them]
 - the caps feature is still missing [Dario working on it]

---> ACTION for Dario and George to finish the code. XenServer people will help with testing and benchmarking when the time comes.

vNUMA
=====

And its scheduling implications, like enabling in-guest NUMA scheduling optimizations.

Status is, most of it is there, but for it to work on PV guests (which includes dom0) we need what will be the follow-up of Andrew's work on CPUID (he had a session about that work at the hackathon as well). As soon as this lands, it will unblock what's called "backend locality". That basically means, on large NUMA and I/O NUMA boxes, a more intelligent placement of backend components (e.g., kernel threads and processes in dom0), improving how efficiently we exploit memory and I/O bandwidth.

This was discussed at a previous hackathon (in Dublin), and still hasn't happened. George still thinks it would be a good idea. Dario agrees. It is unclear whether there will be interest from the community and from companies in having it (which means helping make it happen, testing it, using it if it works well, etc.). Citrix XenServer people said they think a colleague of theirs did some preliminary measurements, and it did not look too promising, but it is hard to tell, since the pieces are not there yet. SUSE people said they may be interested, at least in giving it a try, as they have hardware where this could be useful.

---> ACTION Dario to rehash the discussion when the pieces are there.

XEN & GUEST SCHEDULING INTERACTIONS
===================================

Some analysis has been done recently to see whether we have issues due to Xen's and Linux's schedulers interacting badly (Dario and Oracle people have, independently, seen something like that). For example: what if, as far as Linux knows, moving a task from (the vcpu running on) CPU A to (the vcpu running on) CPU B is cheap at a certain point in time, but then, when the move actually happens, the Xen scheduler has moved the vcpu that was running on CPU A and the one running on CPU B onto two other, far away, CPUs?

Dario has done some benchmarks, playing with the flags that control the load balancer in Linux. The result is that this inter-scheduler interaction plays a role, but it is very difficult to generalize and figure out in what direction to move (if at all), as the results are very much workload dependent. Matt said he'd be interested in seeing the results, and maybe in helping to figure things out.

Juergen has a patch that makes the guest topology completely flat, so the Linux scheduler would not waste time doing its fancy SMT/core/whatever load balancing logic... and we say it "wastes time" because all this logic is based on wrong and unreliable topology information. Dario has run some benchmarks on these patches. The results show some improvements, but not as big as expected. One possible reason is that the guests were too small, and they were always fully loaded. What we want is to see the numbers in cases where the guests are not fully booked, so that the guests' scheduler has to chime in and make decisions, which we expect to be bad ones. Matt expressed again his interest in this activity, and said that he agrees with approaches that try to make the guest scheduler 'dumber'.

---> ACTION Dario to repeat the benchmarks in the new configuration
---> ACTION Dario to post all the numbers, so others can see and think about them
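To make that mismatch a bit more concrete, here is a toy sketch (plain C, not Xen or Linux code; the topology, the vcpu-to-pcpu placement and all the costs are made up for illustration) of how a migration that looks cheap from inside the guest can turn out to be expensive once the hypervisor's actual vcpu placement is taken into account:

    /* Toy model of the guest/hypervisor topology mismatch discussed above.
     * Not Xen or Linux code; all names and numbers are illustrative. */
    #include <stdio.h>

    #define NR_VCPUS 4

    /* Cost the guest *believes* a task migration has, based on the virtual
     * topology it was given: vcpus 0/1 and 2/3 look like SMT siblings, so
     * moving a task between them looks cheap. */
    static int guest_migration_cost(int vcpu_a, int vcpu_b)
    {
        return (vcpu_a / 2 == vcpu_b / 2) ? 1 : 10;
    }

    /* Where the hypervisor is actually running each vcpu right now. The
     * guest cannot see this, and it can change at any time. */
    static int vcpu_to_pcpu[NR_VCPUS] = { 0, 12, 3, 18 };

    /* Cost of the same migration in terms of physical CPUs: pretend pcpus
     * 0-7, 8-15, ... sit on different NUMA nodes. */
    static int actual_migration_cost(int vcpu_a, int vcpu_b)
    {
        int pa = vcpu_to_pcpu[vcpu_a], pb = vcpu_to_pcpu[vcpu_b];
        return (pa / 8 == pb / 8) ? 1 : 50;
    }

    int main(void)
    {
        /* The guest thinks moving a task from vcpu 0 to vcpu 1 is cheap
         * (virtual siblings), but Xen has placed them on distant pcpus. */
        printf("guest estimate: %d, actual: %d\n",
               guest_migration_cost(0, 1), actual_migration_cost(0, 1));
        return 0;
    }

One way of reading why the flat-topology patch is expected to help is that it takes away the guest scheduler's ability to claim that any pair of vcpus is "close", so it stops acting on the first of those two cost estimates.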
POWER AWARE SCHEDULING
======================

There was a question (Luis, IIRC) about whether we do any power aware scheduling in Xen. Dario noted that, in Credit1, we do, but only very lightly (there's a flag for packing versus spreading the workload). George pointed out that Credit2 has been designed with that (among other things) in mind, especially the load balancer. There is nothing about it in there currently, but the Credit2 load balancer is all built upon a function that computes the merit of doing a certain balancing operation, and the logic for coming up with a result is pluggable by design, and can hence accommodate power-aware considerations.

ARM people expressed interest in having big.LITTLE support in the Xen scheduler at some point, and asked about a sensible way of using cpupools for that. Dario said this should be doable with a low-to-moderate amount of work, and that it makes sense. Juergen and George noted that it may be better (and easier) to exploit vcpu affinity for that.

---> ACTION xxx

GANG-SCHEDULING
===============

At the previous XenProject developer meeting, co-scheduling (aka gang scheduling) was mentioned and deemed interesting to investigate. Dario reported having done some searching and thinking, and it indeed looks like a nice feature to offer, but it's hard to figure out whether it would really be useful and adopted, as it also introduces limitations.

Juergen noted that it would be interesting to try a very special form of gang-scheduling: when a vcpu of VM A is scheduled on core x, and core y is the SMT sibling of x, prefer scheduling on y another vcpu of VM A, rather than a vcpu of any other VM (a minimal sketch of this idea is included at the end of these minutes). This would improve the precision of accounting, and reduce the scope for side-channel attacks exploiting the siblings' shared caches. Dario found this idea really, really interesting.

---> ACTION Dario to look into this, but not immediately

SCHEDULING TESTING / BENCHMARKING
=================================

George talked about how difficult it is to benchmark a scheduler. In fact, you need a high degree of CPU competition, and a combination of workloads running inside the VMs whose load varies over time but, at the same time, is reproducible (because if you find a 'bug', you want to be able to reproduce it!). Dario mentioned the difficulty of coming up with workloads and benchmarks that are representative of real users' and customers' use cases and scenarios. We would need feedback from both users and companies' customers for that. Everyone agreed that such feedback is very valuable, but also very, very hard to obtain.

Citrix has a huge test farm for XenServer, and they will see about running more scheduling-related testing there. Dario has a plan to add at least some "basic" form of performance regression testing to OSSTest.

---> ACTION Dario to add performance regression testing to OSSTest
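As promised above, here is a minimal sketch of the sibling-preference idea from the gang-scheduling section (plain C, not actual Xen scheduler code; the types, the toy runqueue and the even/odd sibling layout are all invented for illustration):

    /* Toy sketch of "prefer a vcpu of the same domain on the SMT sibling".
     * Not Xen code; everything here is invented for illustration only. */
    #include <stdio.h>
    #include <stddef.h>

    struct toy_vcpu {
        int domain_id;
        int vcpu_id;
    };

    /* Assume an even/odd SMT pairing: cpus 0/1 are siblings, 2/3, etc. */
    static int smt_sibling_of(int cpu)
    {
        return cpu ^ 1;
    }

    /* Pick what to run next on an idle SMT thread: prefer a vcpu that
     * belongs to the same domain as the vcpu currently running on the
     * sibling thread, and only fall back to the runqueue head otherwise. */
    static struct toy_vcpu *pick_next(struct toy_vcpu **runq, int nr,
                                      const struct toy_vcpu *on_sibling)
    {
        for (int i = 0; i < nr; i++)
            if (on_sibling && runq[i]->domain_id == on_sibling->domain_id)
                return runq[i];
        return nr > 0 ? runq[0] : NULL;
    }

    int main(void)
    {
        struct toy_vcpu d1v0 = { 1, 0 }, d2v0 = { 2, 0 }, d1v1 = { 1, 1 };
        struct toy_vcpu *runq[] = { &d2v0, &d1v1 };

        /* d1v0 runs on cpu 2; cpu 3 (its sibling) needs something to run:
         * d1v1 is preferred over d2v0, even though d2v0 is queued first. */
        struct toy_vcpu *next = pick_next(runq, 2, &d1v0);
        printf("cpu %d picks d%dv%d\n", smt_sibling_of(2),
               next->domain_id, next->vcpu_id);
        return 0;
    }

The interesting part, at least as described in the session, is that the preference can be expressed as a local check when picking what to run on a core, rather than requiring full co-scheduling of all of a VM's vcpus.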
Thanks to everyone that attended. If I got something wrong or inaccurate, or if I missed anything, feel free to point it out :-)

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)