[Xen-devel] Runaway real/sys time in newer paravirt domUs?


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: Jed Smith <jed@xxxxxxxxxx>
  • Date: Tue, 6 Jul 2010 12:32:31 -0400
  • Delivery-date: Tue, 06 Jul 2010 09:34:24 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Good morning,

We've had a few reports from domU customers[1], which I have confirmed, that CPU
time accounting is very inaccurate in certain circumstances.  The issue seems
to be limited to x86_64 domUs, starting around the 2.6.32 kernel family (but I
can't be sure of that).

The symptoms include top reporting hours or days of CPU time consumed by a task
that has been running for mere seconds of wall time, and the time(1) utility
reporting hundreds of years in some cases.  By contrast, the /proc/stat
counters on all four VCPUs increment at roughly the expected rate.  Needless to
say, this is puzzling.

Ævar Arnfjörð Bjarmason brought a test case to our attention that highlights
the failure: a simple Perl script[2] that forks and executes numerous dig(1)
processes.  At the end of his script, time(1) reports 268659840m0.951s of user
and 38524003m13.072s of system time consumed.  I am able to confirm this
demonstration using:

 - Xen 3.4.1 on dom0 2.6.18.8-931-2
 - Debian Lenny on domU 2.6.32.12-x86_64-linode12 [3]
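
For readers without the Perl script at hand, the general shape of such a test
can be sketched in Python.  This is not Ævar's script: it is a minimal stand-in
that assumes spawning many short-lived children is the relevant trigger, and it
runs a trivial child interpreter instead of dig(1).  The names run_children and
measure are hypothetical.

```python
import os
import subprocess
import sys
import time


def run_children(count):
    """Spawn `count` short-lived child processes and wait for each one."""
    for _ in range(count):
        # Stand-in for dig(1): a child interpreter that exits immediately.
        subprocess.run([sys.executable, "-c", "pass"], check=True)


def measure(count=50):
    """Return (wall, child_user, child_sys) in seconds for `count` children."""
    before = os.times()
    start = time.monotonic()
    run_children(count)
    wall = time.monotonic() - start
    after = os.times()
    return (wall,
            after.children_user - before.children_user,
            after.children_system - before.children_system)


if __name__ == "__main__":
    wall, user, system = measure()
    cpus = os.cpu_count() or 1
    print("wall %.3fs  child user %.3fs  child sys %.3fs" % (wall, user, system))
    # Sequential children cannot consume more CPU time than wall time * CPUs.
    if user + system > wall * cpus:
        print("implausible: CPU time exceeds wall time x %d CPUs" % cpus)
```

On a healthy guest the child user and sys figures should stay well below the
wall time; the bound of wall time multiplied by the CPU count is only a
coarse plausibility check.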

Running Ævar's test case looks like this, in that domU:

> real 0m30.741s
> user 307399002m50.773s
> sys 46724m44.192s
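
To put those figures in scale, the time(1) fields can be converted to seconds
and then to years.  The sketch below is illustrative only; parse_time_field is
a hypothetical helper, and the year length is a 365.25-day approximation.  The
comparison against 2**64 nanoseconds is included purely as an arithmetic
reference point and is not a diagnosis from this report.

```python
import re

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, for scale only


def parse_time_field(text):
    """Convert a time(1)-style 'NmS.SSSs' field to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError("unrecognized time field: %r" % text)
    return int(match.group(1)) * 60 + float(match.group(2))


# Values copied from the run shown above.
user = parse_time_field("307399002m50.773s")
system = parse_time_field("46724m44.192s")
wrap = 2 ** 64 / 1e9  # seconds represented by 2**64 nanoseconds

print("user: %.1f years" % (user / SECONDS_PER_YEAR))
print("sys: %.1f days" % (system / 86400))
print("2**64 ns: %.1f years" % (wrap / SECONDS_PER_YEAR))
```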

However, a quick busyloop in Python seems to report the correct time:

> li21-66:~# cat doit.py 
> for i in xrange(10000000):
>  a = i ** 5
>
> li21-66:~# time python doit.py
>
> real  0m16.600s
> user  0m16.593s
> sys   0m0.006s

After I rebooted the domU, the problem no longer occurred.  It seems to be
transient in nature and difficult to isolate.  /proc/stat seems to increment
normally:

> li21-66:/proc# cat stat | grep "cpu " && sleep 1 && cat stat | grep "cpu "
> cpu  3742 0 1560 700180 1326 0 27 1282 0
> cpu  3742 0 1562 700983 1326 0 27 1282 0
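
The difference between the two samples is easier to read per field.  The sketch
below assumes the nine-column layout shown above (user, nice, system, idle,
iowait, irq, softirq, steal, guest) and counts in clock ticks; the helper names
are hypothetical.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest")


def parse_cpu_line(line):
    """Map an aggregate 'cpu ...' line from /proc/stat to named tick counts."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("not an aggregate cpu line: %r" % line)
    return dict(zip(FIELDS, (int(value) for value in parts[1:])))


def tick_delta(first, second):
    """Per-field increase in ticks between two samples."""
    a, b = parse_cpu_line(first), parse_cpu_line(second)
    return {name: b[name] - a[name] for name in a}


# The two samples quoted above, taken about one second apart.
delta = tick_delta("cpu  3742 0 1560 700180 1326 0 27 1282 0",
                   "cpu  3742 0 1562 700983 1326 0 27 1282 0")
changed = {name: ticks for name, ticks in delta.items() if ticks}
print(changed)
```

Only the system and idle columns move between the two samples; user time does
not change at all.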

I'm not sure where to begin with this one - any thoughts?

 [1]: http://www.linode.com/forums/viewtopic.php?p=30715
 [2]: git://gist.github.com/449825.git
 [3]: Source: http://www.linode.com/src/2.6.32.12-x86_64-linode12.tar.bz2
      Config: http://jedsmith.org/tmp/2.6.32.12-x86_64-linode12.txt

Thanks for the assistance,

Jed Smith
Systems Administrator
Linode, LLC
+1 (609) 593-7103 x1209
jed@xxxxxxxxxx




 

