Re: [Xen-devel] [PATCH 1 of 3] xen: sched_credit, improve tickling of idle CPUs
On 03/12/12 16:34, Dario Faggioli wrote:
> Right now, when a VCPU wakes up, we check whether it should preempt
> what is running on the PCPU, and whether or not the waking VCPU can
> be migrated (by tickling some idlers). However, this can result in
> suboptimal or even wrong behaviour, as explained here:
>
> http://lists.xen.org/archives/html/xen-devel/2012-10/msg01732.html
>
> This change, instead, when deciding which PCPUs to tickle upon VCPU
> wake-up, considers both what is likely to happen on the PCPU where
> the wakeup occurs and whether there are idle PCPUs where the waking
> VCPU could run. In fact, if there are idlers where the new VCPU can
> run, we can avoid interrupting the running VCPU. OTOH, in case there
> aren't any such PCPUs, preemption and migration are the way to go.
>
> This has been tested by running the following benchmarks inside 2, 6
> and 10 VMs concurrently, on a shared host, each VM with 2 VCPUs and
> 960 MB of memory (the host has 16 ways and 12 GB RAM).
>
> 1) All VMs had 'cpus="all"' in their config file.
>
> $ sysbench --test=cpu ... (time, lower is better)
> | VMs | w/o this change          | w/ this change           |
> |   2 | 50.078467 +/- 1.6676162  | 49.704933 +/- 0.0277184  |
> |   6 | 63.259472 +/- 0.1137586  | 62.227367 +/- 0.3880619  |
> |  10 | 91.246797 +/- 0.1154008  | 91.174820 +/- 0.0928781  |
>
> $ sysbench --test=memory ... (throughput, higher is better)
> | VMs | w/o this change          | w/ this change           |
> |   2 | 485.56333 +/- 6.0527356  | 525.57833 +/- 25.085826  |
> |   6 | 401.36278 +/- 1.9745916  | 421.96111 +/- 9.0364048  |
> |  10 | 294.43933 +/- 0.8064945  | 302.49033 +/- 0.2343978  |
>
> $ specjbb2005 ... (throughput, higher is better)
> | VMs | w/o this change          | w/ this change           |
> |   2 | 43150.63 +/- 1359.5616   | 42720.632 +/- 1937.4488  |
> |   6 | 29274.29 +/- 1024.4042   | 29518.171 +/- 1014.5239  |
> |  10 | 19061.28 +/- 512.88561   | 19050.141 +/- 458.77327  |
>
> 2) All VMs had their VCPUs statically pinned to the host's PCPUs.
>
> $ sysbench --test=cpu ... (time, lower is better)
> | VMs | w/o this change          | w/ this change           |
> |   2 | 47.8211 +/- 0.0215504    | 47.826900 +/- 0.0077872  |
> |   6 | 62.689122 +/- 0.0877173  | 62.764539 +/- 0.3882493  |
> |  10 | 90.321097 +/- 1.4803867  | 89.974570 +/- 1.1437566  |
>
> $ sysbench --test=memory ... (throughput, higher is better)
> | VMs | w/o this change          | w/ this change           |
> |   2 | 550.97667 +/- 2.3512355  | 550.87000 +/- 0.8140792  |
> |   6 | 443.15000 +/- 5.7471797  | 454.01056 +/- 8.4373466  |
> |  10 | 313.89233 +/- 1.3237493  | 321.81167 +/- 0.3528418  |
>
> $ specjbb2005 ... (throughput, higher is better)
> | VMs | w/o this change          | w/ this change           |
> |   2 | 49591.057 +/- 952.93384  | 49610.98 +/- 1242.1675   |
> |   6 | 33538.247 +/- 1089.2115  | 33682.222 +/- 1216.1078  |
> |  10 | 21927.870 +/- 831.88742  | 21801.138 +/- 561.97068  |
>
> The numbers show that the change either has no or very limited impact
> (the specjbb2005 case) or, where it does have an effect, it is an
> actual improvement in performance, especially in the sysbench-memory
> case.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>

So I think the principle is good, but with the resulting set of "if"
statements it is hard to figure out what's going on. What do you think
about re-arranging things, something like the attached?

In this particular version I got rid of the stats, because they require
if() statements that break up the flow. If we really think they're
useful, maybe we could have a separate block somewhere for them?

We could actually do without idlers_empty entirely: if we just remove
the condition from the "else" block, the "right thing" will happen;
however, that would mean several unnecessary cpumask operations on a
busy system.

Thoughts?

 -George
Attachment: xen_sched_credit_improve_tickling_of_idle_cpus

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel