
Re: [Xen-users] xen-3.2.0 problems with time (and space)


  • To: xen-users@xxxxxxxxxxxxxxxxxxx
  • From: Massimo Mongardini <massimo.mongardini@xxxxxxxxx>
  • Date: Tue, 08 Jul 2008 11:05:10 +0100
  • Cc: massimo.mongardini@xxxxxxxxx
  • Delivery-date: Tue, 08 Jul 2008 03:05:41 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

ein@xxxxxxxxxxxxxx wrote:
[...]
Just for the sake of completeness, can you tell us a bit more about your
platform?
Are you running a ‘stock’ xen kernel for dom0 and domU, or are these
vendor supplied (rh) kernels?
If it’s vendor supplied, is it possible to test again using the xen
supplied 3.2 2.6.18 kernel?
I’m at a loss, but with a bit more info maybe we can nail it down.




_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
Hi all,
some notes on this.
I tried various setups with 3.2.1:
- independent_wallclock set to 0 or 1
- ntp enabled or disabled
None of them helped: the domU clock doesn't settle.

The puzzling thing is that if I install xen-3.1.4 using the rhel4.5 rpms from xen.org, with exactly the same config and clock settings (independent_wallclock set to 1, plus ntp), it works just fine. For instance, server-0038 is a rhel4 domU on xen-3.1.4 while server-0040 is the same image on xen-3.2.1. The bare metal underneath is the same (HP BL654c), and each dom0 hosts one domU with all 4 CPUs allocated.

[root@server-0000 ~]# ssh -2 server-0040 "date" ; date
Tue Jul 8 08:43:30 GMT 2008
Tue Jul 8 08:41:36 GMT 2008
[root@server-0000 ~]# ssh -2 server-0038 "date" ; date
Tue Jul 8 08:41:57 GMT 2008
Tue Jul 8 08:41:57 GMT 2008

[root@server-0000 ~]# ssh -2 server-0038 "ntpq -p"
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+server-0000     72.249.76.84     3 u  763 1024  377    0.411   34.468   5.551
*server-0001     64.202.112.65    3 u  787 1024  377    2.131   30.113   5.384
 LOCAL(0)        LOCAL(0)        10 l   36   64  377    0.000    0.000   0.004
[root@server-0000 ~]# ssh -2 server-0040 "ntpq -p"
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 server-0000     72.249.76.84     3 u  278 1024  377    0.182  -115765 3064.15
 server-0001     64.202.112.65    3 u  292 1024  377    2.191  -115736 3061.46
*LOCAL(0)        LOCAL(0)        10 l   12   64  377    0.000    0.000   0.001
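
As a rough cross-check of the peer offsets above, the ntpq billboard can be parsed and compared against ntpd's default step threshold of 128 ms. This is only a sketch: the function names are made up for illustration, the parser assumes the usual ten-column `ntpq -p` layout, and it only recognises the `*`, `+`, `-` and `#` tally characters.

```python
# Illustrative helpers (hypothetical names, not part of ntp or Xen).
STEP_THRESHOLD_MS = 128.0  # ntpd's default step threshold, in milliseconds

# `ntpq -p` output from server-0040, copied from the message above.
SERVER_0040 = """\
remote refid st t when poll reach delay offset jitter
==============================================================================
server-0000 72.249.76.84 3 u 278 1024 377 0.182 -115765 3064.15
server-0001 64.202.112.65 3 u 292 1024 377 2.191 -115736 3061.46
*LOCAL(0) LOCAL(0) 10 l 12 64 377 0.000 0.000 0.001
"""


def parse_ntpq_peers(text):
    """Return a list of (tally, remote, offset_ms) tuples from `ntpq -p` output."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the ruler line and anything that is not a peer row.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        remote, tally = fields[0], ""
        if remote[0] in "*+-#":
            tally, remote = remote[0], remote[1:]
        peers.append((tally, remote, float(fields[8])))  # column 9 is the offset in ms
    return peers


def far_off_peers(peers, limit_ms=STEP_THRESHOLD_MS):
    """Return (remote, offset_ms) for peers whose offset exceeds the limit."""
    return [(remote, off) for _, remote, off in peers if abs(off) > limit_ms]


if __name__ == "__main__":
    for remote, offset_ms in far_off_peers(parse_ntpq_peers(SERVER_0040)):
        print(f"{remote}: {offset_ms / 1000.0:.3f} s")
```

On the server-0040 sample both real peers are reported at roughly -115.7 s, which is in line with the gap of about two minutes between the two `date` outputs earlier in this message.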

ntpd logs for server-0040 say:
Jul 7 22:56:42 server-0040 ntpd[5514]: kernel time sync status 0040
Jul 7 22:56:42 server-0040 ntpd[5514]: frequency initialized 354.704 PPM from /var/lib/ntp/drift
Jul 7 22:59:57 server-0040 ntpd[5514]: synchronized to LOCAL(0), stratum 10
Jul 7 22:59:57 server-0040 ntpd[5514]: kernel time sync disabled 0041
Jul 7 23:01:02 server-0040 ntpd[5514]: synchronized to 10.5.1.2, stratum 3
Jul 7 23:06:19 server-0040 ntpd[5514]: synchronized to 10.5.1.3, stratum 3
Jul 7 23:08:32 server-0040 ntpd[5514]: synchronized to 10.5.1.2, stratum 3
Jul 7 23:15:52 server-0040 ntpd[5514]: synchronized to 10.5.1.3, stratum 3
Jul 7 23:15:52 server-0040 ntpd[5514]: time reset -2.931861 s
Jul 7 23:15:52 server-0040 ntpd[5514]: kernel time sync enabled 0001
Jul 7 23:20:11 server-0040 ntpd[5514]: synchronized to LOCAL(0), stratum 10
Jul 7 23:25:31 server-0040 ntpd[5514]: synchronized to 10.5.1.2, stratum 3
Jul 7 23:32:58 server-0040 ntpd[5514]: synchronized to LOCAL(0), stratum 10
Jul 8 04:32:27 server-0040 ntpd[5514]: synchronized to 10.5.1.3, stratum 3
Jul 8 04:41:52 server-0040 ntpd[5514]: synchronized to LOCAL(0), stratum 10
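
To make the source flapping in this log easier to see, the "synchronized to" and "time reset" lines can be tallied. A minimal sketch, assuming the syslog format shown above; the `summarize` helper and the regular expressions are illustrative only, and the sample contains just the relevant lines from the log.

```python
import re
from collections import Counter

# Lines copied from the server-0040 ntpd log above (only sync and reset events).
NTPD_LOG = """\
Jul 7 22:59:57 server-0040 ntpd[5514]: synchronized to LOCAL(0), stratum 10
Jul 7 23:01:02 server-0040 ntpd[5514]: synchronized to 10.5.1.2, stratum 3
Jul 7 23:06:19 server-0040 ntpd[5514]: synchronized to 10.5.1.3, stratum 3
Jul 7 23:08:32 server-0040 ntpd[5514]: synchronized to 10.5.1.2, stratum 3
Jul 7 23:15:52 server-0040 ntpd[5514]: synchronized to 10.5.1.3, stratum 3
Jul 7 23:15:52 server-0040 ntpd[5514]: time reset -2.931861 s
Jul 7 23:20:11 server-0040 ntpd[5514]: synchronized to LOCAL(0), stratum 10
Jul 7 23:25:31 server-0040 ntpd[5514]: synchronized to 10.5.1.2, stratum 3
Jul 7 23:32:58 server-0040 ntpd[5514]: synchronized to LOCAL(0), stratum 10
Jul 8 04:32:27 server-0040 ntpd[5514]: synchronized to 10.5.1.3, stratum 3
Jul 8 04:41:52 server-0040 ntpd[5514]: synchronized to LOCAL(0), stratum 10
"""

SYNC_RE = re.compile(r"synchronized to (\S+), stratum (\d+)")
RESET_RE = re.compile(r"time reset ([+-]?\d+\.\d+) s")


def summarize(log_text):
    """Count how often each source was selected and collect clock steps (seconds)."""
    selections = Counter(source for source, _stratum in SYNC_RE.findall(log_text))
    resets = [float(value) for value in RESET_RE.findall(log_text)]
    return selections, resets


if __name__ == "__main__":
    selections, resets = summarize(NTPD_LOG)
    for source, count in selections.most_common():
        print(f"{source}: selected {count} time(s)")
    print(f"clock steps: {resets}")
```

For this excerpt the tally shows ntpd falling back to LOCAL(0) four times in ten selections, with a single step of about -2.9 s.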

hwclocks:
[root@server-0000 ~]# ssh server-0017 "xm list ; hwclock --show ; date" ; ssh server-0040 "hwclock --show ; date"
Name          ID   Mem VCPUs  State     Time(s)
Domain-0       0  1906     4  r-----    15942.0
server-0040    1  6144     4  r-----  4778105.8
Tue 08 Jul 2008 09:24:07 AM GMT -0.329532 seconds
Tue Jul 8 09:24:04 GMT 2008
Tue 08 Jul 2008 09:26:05 AM GMT -0.022265 seconds
Tue Jul 8 09:26:06 GMT 2008

[root@server-0000 ~]# ssh server-0015 "xm list ; hwclock --show ; date" ; ssh server-0038 "hwclock --show ; date"
Name          ID   Mem VCPUs  State     Time(s)
Domain-0       0  1910     4  r-----      825.6
server-0038    3  6144     4  r-----   263230.6
Tue 08 Jul 2008 09:25:06 AM GMT -0.947071 seconds
Tue Jul 8 09:25:04 GMT 2008
Tue 08 Jul 2008 09:25:07 AM GMT -0.903970 seconds
Tue Jul 8 09:25:06 GMT 2008

The hwclock difference could be explained by the independent_wallclock setting, but the ntp SYNC_HWCLOCK=yes setting seems to be ignored.

Another oddity I noticed is that there are dropped packets on the domUs' eth0:
[root@server-0000 ~]# ssh server-0040 ifconfig eth0 | grep dropped
RX packets:30191545 errors:0 dropped:304902 overruns:0 frame:0
TX packets:8966342 errors:0 dropped:0 overruns:0 carrier:0
[root@server-0000 ~]# ssh server-0038 ifconfig eth0 | grep dropped
RX packets:66115 errors:0 dropped:95 overruns:0 frame:0
TX packets:33971 errors:0 dropped:0 overruns:0 carrier:0

It looks like there are far more dropped packets on the 3.2.1 domU than on the 3.1.4 one. Making a quick and rough guess from the uptimes:
[root@server-0000 ~]# ssh server-0038 uptime
09:31:19 up 22:14, 0 users, load average: 3.99, 3.93, 3.74
[root@server-0000 ~]# ssh server-0040 uptime
09:33:26 up 24 days, 53 min, 0 users, load average: 0.18, 1.62, 2.05

24 days = 576 hours
304902*22/576 = 11645 dropped packets expected in 22 hours at the 3.2.1 rate (compared to the 95 actually seen on server-0038)
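
The two domUs also see very different traffic volumes, so it may be worth normalising the drops per received packet as well as per hour of uptime. A minimal sketch using only the ifconfig and uptime figures quoted above; the helper names are illustrative, and the uptimes were sampled a few minutes after the counters, so the per-hour figures are approximate.

```python
# Figures copied from the ifconfig and uptime output above.
DOMUS = {
    "server-0040 (xen 3.2.1)": {
        "rx_packets": 30191545,
        "rx_dropped": 304902,
        "uptime_hours": 24 * 24 + 53 / 60,  # up 24 days, 53 min
    },
    "server-0038 (xen 3.1.4)": {
        "rx_packets": 66115,
        "rx_dropped": 95,
        "uptime_hours": 22 + 14 / 60,  # up 22:14
    },
}


def drops_per_hour(stats):
    """Dropped RX packets per hour of uptime."""
    return stats["rx_dropped"] / stats["uptime_hours"]


def drop_percentage(stats):
    """Dropped RX packets as a percentage of received packets."""
    return 100.0 * stats["rx_dropped"] / stats["rx_packets"]


if __name__ == "__main__":
    for name, stats in DOMUS.items():
        print(
            f"{name}: {drops_per_hour(stats):.1f} drops/hour, "
            f"{drop_percentage(stats):.2f}% of RX packets dropped"
        )
```

Per received packet the gap is smaller than the per-hour estimate suggests (about 1.01% against 0.14%), but the 3.2.1 domU still drops noticeably more.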

The only difference between the two is that the dom0 hosting server-0040 has 4 (old-fashioned) eth0:x IP aliases on eth0, whilst on the "good" one I had to remove the eth0:x interfaces, and I would have to use the "ip" command to add them back. This is because xen 3.1.x would not handle eth0 correctly with eth0:x interfaces up.

I plan to keep testing this for another day, then I'll revert everything back to 3.1.4. Please let me know if I can be of any help in debugging this (if I am not the only one affected!) or if you have any questions about my configuration.

Obviously any hint is more than appreciated!

Regards,
Massimo

--
Massimo Mongardini
~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~
echo 'Jg!J!hjwf!zpv!bo!bqqmf!boe!zpv!hjwf!nf!bo!bqqmf-!uifo!xf!xjmm!ibwf!bo!bqqmf!fbdi/!Cvu!jg!J!hjwf!zpv!bo!jefb!boe!zpv!hjwf!nf!bo!jefb-!xf!xjmm!ibwf!uxp!jefbt!fbdi!' | perl -pe 's/(.)/chr(ord($1)-1)/ge'
~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~
http://massimo.mongardini.it
http://www.getthefacts.it
http://www.mongardini.it/pizza-howto
~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~-.-~
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

