
[Xen-devel] [Patch 0 of 2]: PV-domain SMP performance

  • To: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxxxxxxx>
  • Date: Wed, 17 Dec 2008 13:21:58 +0100
  • Delivery-date: Wed, 17 Dec 2008 04:22:27 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>


I've experimented a bit with the Xen scheduler to improve the performance of
paravirtualized SMP domains, including Dom0.
Under heavy system load a vcpu may be descheduled while it is inside a critical
section. This in turn drives system load even higher, because other vcpus of
the same domain spin waiting for the descheduled vcpu to leave the critical
section.

I've created patches for Xen and for the Linux kernel to show that cooperative
scheduling can avoid this problem, or at least make it less severe.

A vcpu may set a flag "no_desched" in its vcpu_info structure (I've used
an unused hole), which tells the Xen scheduler to keep the vcpu running for
some more time. If the vcpu would otherwise have been descheduled, the guest
is informed via another flag in vcpu_info so it can voluntarily give up
control after leaving the critical section. If the guest is not cooperative,
it is descheduled after 1 msec anyway.

I've run some tests in Dom0 with a small benchmark that produces high system load:

time -p (dd if=/dev/urandom count=1M | cat >/dev/null)

The system is a 4-processor x86_64 machine running latest xen-unstable. The
tests are performed in Dom0. For each test the sums of the time outputs of
all parallel runs are printed (2 parallel runs lasting 60 seconds each will
print 120 seconds). Repeated tests returned very similar results (deviation
of about 1-2%).

First configuration: 4 vcpus, no pinning:
1 run:  real:   79.92 user:    1.04 sys:   78.83
2 runs: real:  271.28 user:    1.91 sys:  269.35
4 runs: real:  882.32 user:    5.70 sys:  875.50

Second configuration: 4 vcpus, all pinned to cpu 0:
1 run:  real:  400.55 user:    0.10 sys:  380.28
2 runs: real: 1270.68 user:    2.58 sys:  653.28
4 runs: real: 1558.27 user:   20.99 sys:  368.10

The same tests with my patches:

First configuration: 4 vcpus, no pinning:
1 run:  real:   81.85 user:    1.00 sys:   81.29
2 runs: real:  229.62 user:    2.07 sys:  191.31
4 runs: real:  878.63 user:    3.61 sys:  873.76

Second configuration: 4 vcpus, all pinned to cpu 0:
1 run:  real:  274.06 user:    0.74 sys:   58.88
2 runs: real:  999.77 user:    1.27 sys:   98.61
4 runs: real: 1251.00 user:   16.58 sys:  291.66

This result was achieved by avoiding descheduling of a vcpu only while irqs
are disabled. Even better results might be possible with some fine tuning
(e.g. instrumenting bh_enable/bh_disable).
I think the system time has dropped remarkably!

Patch 1 adds the hypervisor support.
Patch 2 adds the Linux guest support in irq_enable and irq_disable.


Juergen Gross                             Principal Developer
IP SW OS6                      Telephone: +49 (0) 89 636 47950
Fujitsu Siemens Computers         e-mail: juergen.gross@xxxxxxxxxxxxxxxxxxx
Otto-Hahn-Ring 6                Internet: www.fujitsu-siemens.com
D-81739 Muenchen         Company details: www.fujitsu-siemens.com/imprint.html
