
Re: [Xen-users] Time went backwards / Stability issues



Myeah, Xen on 2.6.22 seems to be a bit flawed in that area :)
I've had the same trouble with the Ubuntu 2.6.22-14-xen kernel.
You can try setting your clocksource to jiffies instead of xen, which seems
to fix the problem.
I 'fixed' the problem here by just compiling my own (2.6.18) kernel.
The people at XenSource aren't too bothered about keeping their source up to
date with the latest kernels, so after 5 releases it gets harder and harder
for external people to port it, and more bugs start creeping in.
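
For the clocksource switch mentioned above, here is a minimal sketch of how
it could be done at runtime. It assumes the kernel exposes the standard Linux
sysfs clocksource files and that "jiffies" shows up in the list of available
sources; I have not checked this on every Xen kernel. The helper names are
made up for illustration, and writing the file needs root.

```python
from pathlib import Path

# Standard Linux sysfs location for clocksource selection (assumed to be
# present on the Xen kernel in question).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def available_clocksources(base=CLOCKSOURCE_DIR):
    """Return the clocksource names the running kernel offers."""
    return (Path(base) / "available_clocksource").read_text().split()


def current_clocksource(base=CLOCKSOURCE_DIR):
    """Return the name of the clocksource currently in use."""
    return (Path(base) / "current_clocksource").read_text().strip()


def set_clocksource(name, base=CLOCKSOURCE_DIR):
    """Switch to the named clocksource and return what is now active.

    Raises ValueError if the kernel does not list that clocksource.
    """
    offered = available_clocksources(base)
    if name not in offered:
        raise ValueError(f"{name!r} not offered by kernel (have: {offered})")
    (Path(base) / "current_clocksource").write_text(name + "\n")
    return current_clocksource(base)
```

Calling set_clocksource("jiffies") as root should make the change until the
next reboot. To make it stick, the usual route is the clocksource=jiffies
kernel boot parameter, in dom0 and/or the domU as needed.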

Kind regards,

Bart

----- Original Message -----
From: "Ian Marlier" <ian.marlier@xxxxxxxxxxxxxxxxxxx>
To: xen-users@xxxxxxxxxxxxxxxxxxx
Sent: Friday, October 26, 2007 6:13:52 PM (GMT+0100) Europe/Berlin
Subject: [Xen-users] Time went backwards / Stability issues

Hi, all --

I'm in the process of configuring a new machine for use as a Xen server,
and am having some rather significant stability issues.

The hardware/distro/kernel info is:
opensuse 10.3
Linux offxen2 2.6.22.5-31-xen #1 SMP 2007/09/21 22:29:00 UTC x86_64
x86_64 x86_64 GNU/Linux
2 x Dual-Core AMD Opteron(tm) Processor 2212 HE
8 x 2GB PC2-5300 RAM
Xen 3.1.0_15042-51

I currently have three guest domains created:
- 1 opensuse 10.3, paravirtualized
- 1 Windows XP, HVM
- 1 Windows 2003, HVM

I can start all three domains, etc.

The trouble that I'm having appears as soon as I begin to stress-test
the configuration.  Running standard burn-in testing tools (BurnInTest,
for example), I quickly manage to lock the machine up.

Testing this morning, for example, I ran BurnInTest in all four domains
-- dom0, and all three domU's.  It was configured to run with only CPU,
Memory, and Disk tests.  I'll grant that I'm putting far more than
standard load on the hardware, but, well...

Within 3 minutes of the tests beginning, the host machine became
entirely unresponsive at the console, or to the network.

After reboot, the following messages appear in the messages file on
dom0: 
Oct 26 11:16:24 offxen2 kernel: clocksource/0: Time went backwards:
delta=-87834561 shadow=1219075978646569 offset=40156133
Oct 26 11:16:24 offxen2 kernel: clocksource/3: Time went backwards:
delta=-60020543 shadow=1219076192448184 offset=116224924
Oct 26 11:16:24 offxen2 kernel: clocksource/3: Time went backwards:
delta=-30041758 shadow=1219076192448184 offset=242467112
Oct 26 11:16:28 offxen2 kernel: clocksource/2: Time went backwards:
delta=-29692236 shadow=1219079978661601 offset=564764018
Oct 26 11:16:34 offxen2 kernel: clocksource/0: Time went backwards:
delta=-21729405 shadow=1219085978683108 offset=317335629
Oct 26 11:16:37 offxen2 kernel: clocksource/3: Time went backwards:
delta=-15880975 shadow=1219087978689164 offset=670825178
Oct 26 11:16:37 offxen2 kernel: clocksource/1: Time went backwards:
delta=-121188552 shadow=1219089233041322 offset=181068207
Oct 26 11:16:39 offxen2 kernel: clocksource/2: Time went backwards:
delta=-130702951 shadow=1219090051851342 offset=632907468
Oct 26 11:16:50 offxen2 kernel: clocksource/0: Time went backwards:
delta=-150041357 shadow=1219101192548457 offset=648236947
Oct 26 11:16:52 offxen2 kernel: clocksource/1: Time went backwards:
delta=-43577936 shadow=1219103192555550 offset=917191565
Oct 26 11:17:00 offxen2 kernel: clocksource/2: Time went backwards:
delta=-30133282 shadow=1219112051923316 offset=182801944

Similarly, these messages appear in the messages file in the opensuse
domU:
Oct 26 11:14:42 mx1 kernel: clocksource/0: Time went backwards:
delta=-30524524 shadow=1218974192040103 offset=299209428
Oct 26 11:14:42 mx1 kernel: klogd 1.4.1, ---------- state change
---------- 
Oct 26 11:14:50 mx1 kernel: clocksource/0: Time went backwards:
delta=-35339594 shadow=1218980978262608 offset=653280945
Oct 26 11:14:50 mx1 kernel: clocksource/1: Time went backwards:
delta=-32626156 shadow=1218980978262608 offset=685231574
Oct 26 11:15:06 mx1 kernel: clocksource/1: Time went backwards:
delta=-33094544 shadow=1218997051525512 offset=883292081
Oct 26 11:15:07 mx1 kernel: clocksource/1: Time went backwards:
delta=-28049569 shadow=1218998978340581 offset=27113639
Oct 26 11:15:07 mx1 kernel: clocksource/0: Time went backwards:
delta=-29452075 shadow=1218999192142335 offset=173497295
Oct 26 11:15:10 mx1 kernel: clocksource/0: Time went backwards:
delta=-24180432 shadow=1219001232726262 offset=495644020
Oct 26 11:15:12 mx1 kernel: clocksource/1: Time went backwards:
delta=-29407880 shadow=1219004232742124 offset=267503786
Oct 26 11:15:14 mx1 kernel: clocksource/1: Time went backwards:
delta=-72293790 shadow=1219005978366429 offset=618198368
Oct 26 11:15:19 mx1 kernel: clocksource/1: Time went backwards:
delta=-30010141 shadow=1219010051576699 offset=685063187
Oct 26 11:15:25 mx1 kernel: clocksource/0: Time went backwards:
delta=-34145209 shadow=1219016192222797 offset=464415813
Oct 26 11:15:26 mx1 kernel: clocksource/1: Time went backwards:
delta=-11964123 shadow=1219017192226704 offset=564340684
Oct 26 11:15:26 mx1 kernel: clocksource/0: Time went backwards:
delta=-11068356 shadow=1219017051577257 offset=721244578
Oct 26 11:15:30 mx1 kernel: clocksource/1: Time went backwards:
delta=-11507778 shadow=1219022192247457 offset=191448477
Oct 26 11:15:31 mx1 kernel: clocksource/0: Time went backwards:
delta=-15586265 shadow=1219022978434436 offset=66316116
Oct 26 11:15:35 mx1 kernel: clocksource/0: Time went backwards:
delta=-55640150 shadow=1219027051616803 offset=76664416
Oct 26 11:15:40 mx1 kernel: clocksource/1: Time went backwards:
delta=-27264629 shadow=1219031192284307 offset=827170075
Oct 26 11:15:48 mx1 kernel: clocksource/0: Time went backwards:
delta=-13974148 shadow=1219039232839657 offset=399204868
Oct 26 11:15:57 mx1 kernel: clocksource/0: Time went backwards:
delta=-32130780 shadow=1219047978530514 offset=790642402
Oct 26 11:15:57 mx1 kernel: clocksource/0: Time went backwards:
delta=-30011595 shadow=1219048051699939 offset=989293882
Oct 26 11:16:00 mx1 kernel: clocksource/0: Time went backwards:
delta=-16655720 shadow=1219051192343976 offset=787632598
Oct 26 11:16:03 mx1 kernel: clocksource/0: Time went backwards:
delta=-11904065 shadow=1219054192356289 offset=971736210
Oct 26 11:16:06 mx1 kernel: clocksource/1: Time went backwards:
delta=-23181648 shadow=1219057192369484 offset=459549949
Oct 26 11:16:07 mx1 kernel: clocksource/1: Time went backwards:
delta=-30345878 shadow=1219058978577439 offset=435960608
Oct 26 11:16:10 mx1 kernel: clocksource/0: Time went backwards:
delta=-31035353 shadow=1219061051756571 offset=795467413
Oct 26 11:16:11 mx1 kernel: clocksource/0: Time went backwards:
delta=-20089266 shadow=1219062978593081 offset=101685066
Oct 26 11:16:16 mx1 kernel: printk: 50269 messages suppressed.
Oct 26 11:16:16 mx1 kernel: clocksource/1: Time went backwards:
delta=-24188173 shadow=1219067232963516 offset=765471054
Oct 26 11:16:21 mx1 kernel: printk: 28000 messages suppressed.
Oct 26 11:16:21 mx1 kernel: clocksource/1: Time went backwards:
delta=-30051890 shadow=1219073051795398 offset=556484104
Oct 26 11:16:26 mx1 kernel: clocksource/0: Time went backwards:
delta=-60024893 shadow=1219077192453422 offset=815345740
Oct 26 11:16:31 mx1 kernel: printk: 1 messages suppressed.
Oct 26 11:16:31 mx1 kernel: clocksource/0: Time went backwards:
delta=-805691001 shadow=1219082978673021 offset=152356711
Oct 26 11:16:37 mx1 kernel: printk: 2 messages suppressed.
Oct 26 11:16:37 mx1 kernel: clocksource/1: Time went backwards:
delta=-10843062 shadow=1219088233054364 offset=442409918
Oct 26 11:16:43 mx1 kernel: clocksource/1: Time went backwards:
delta=-40515520 shadow=1219094233037687 offset=442490158
Oct 26 11:16:55 mx1 kernel: printk: 10106 messages suppressed.
Oct 26 11:16:55 mx1 kernel: clocksource/1: Time went backwards:
delta=-50129861 shadow=1219106192568569 offset=847989665
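
To compare runs (for example before and after a clocksource change), the
messages above can be tallied per clocksource index. This is only a rough
sketch: the function name is hypothetical, the regex is written against the
exact message format shown in these logs, and it tolerates the line wrap
between "backwards:" and "delta=". The unit of delta is assumed to be
nanoseconds; the logs themselves do not say.

```python
import re
from collections import defaultdict

# Matches e.g. "clocksource/0: Time went backwards:\ndelta=-87834561
# shadow=1219075978646569 offset=40156133" (wrapped or on one line).
PATTERN = re.compile(
    r"clocksource/(\d+): Time went backwards:\s+"
    r"delta=(-?\d+)\s+shadow=(\d+)\s+offset=(\d+)"
)


def summarize_backwards_events(log_text):
    """Count events per clocksource index and track the most negative delta.

    delta is assumed (not confirmed by the logs) to be in nanoseconds.
    """
    stats = defaultdict(lambda: {"count": 0, "worst_delta": 0})
    for match in PATTERN.finditer(log_text):
        index = int(match.group(1))
        delta = int(match.group(2))
        entry = stats[index]
        entry["count"] += 1
        entry["worst_delta"] = min(entry["worst_delta"], delta)
    return dict(stats)
```

Feeding it the contents of the messages file returns a dict keyed by the
number after "clocksource/". Note that the "printk: N messages suppressed"
lines mean the real event count is higher than what the log shows.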


Googling around, I've found references to this issue appearing at
various times in the past, but haven't seen anything that appears to be
a standard and/or confirmed workaround for it.  I have seen suggestions
like using NTP, etc, and have tried those, but without success.

I'm wondering if there are any known workarounds, or if I'm SOL on this
particular issue.

For what it's worth, we saw this same issue several weeks ago, when we
first configured the machine, with only light load on the box.  (A
single opensuse 10.3 guest domain, being tested as a mail relay,
receiving and doing spam analysis on about 10-20 messages per minute.)
So, while the load I was putting on is quite heavy, I do know that the
same problem can occur in a non-stress environment.  It just seems to
happen much more quickly when being stressed.

Any suggestions/comments/etc welcome!

Thanks,

Ian

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users



