
[Xen-users] Re: DomU clock out of sync (and Dom0 too)



Dmitry Nedospasov wrote:
> 
> I was watching some logs on a domU today and i suddenly noticed that the
> timestamps were off by something on the order of 47 seconds. I was
> surprised because *I don't* run independent wall clocks. I checked
> some other domUs and the "drift" was also very close to that of the
> first domU.
> 
> I also checked another dom0, Here the domUs were "only" out of sync by
> ~11 seconds.
> 
> The dom0s are all debian squeeze with Xen 4.0.1-2. The domUs are also
> debian squeeze and utilizing PV with the ParaVirtOPs in the normal
> debian linux-image-2.6.32 kernel.
> 

I've been fighting this problem (clock running +47 seconds) for several
months.  My OS setup is like yours: dom0 is Debian Squeeze x64 running Xen
4.0.1-2, and the domUs are Debian Squeeze x64 or Lenny x86:

       dom0: Debian Squeeze x64, running ntpd
             Xen version 4.0.1 (Debian 4.0.1-2)

  Risk domU: Debian Squeeze x64, running ntpd
  Coop domU: Debian Squeeze x64, running ntpd
    T5 domU: Debian Lenny x86, not running ntpd

Last night I wrote a Perl script to remotely monitor the dom0 and domU
clocks via 'rsh <host> date +%s' from a non-Xen server.  The script runs
every minute and records any host whose offset from the monitoring server's
clock ("localtime" below) changed by more than 2 seconds since the previous
minute.  Here is the result:

----------------------------------------
Fri Jul  1 23:00:05 PDT 2011
           dom0 = localtime + 1s
      Risk domU = localtime + 1s
      Coop domU = localtime + 1s
        T5 domU = localtime + 93s
----------------------------------------
Fri Jul  1 23:13:04 PDT 2011
        T5 domU = localtime + 1s ..... (ran ntpdate manually)
----------------------------------------
Sat Jul  2 05:26:04 PDT 2011
           dom0 = localtime + 47s
      Risk domU = localtime + 47s
      Coop domU = localtime + 48s
        T5 domU = localtime + 47s
----------------------------------------
Sat Jul  2 05:59:04 PDT 2011
      Risk domU = localtime + 0s
----------------------------------------
Sat Jul  2 07:50:04 PDT 2011
      Coop domU = localtime + 0s
----------------------------------------
Sat Jul  2 08:11:04 PDT 2011
           dom0 = localtime + 0s
----------------------------------------
Sat Jul  2 09:13:05 PDT 2011
        T5 domU = localtime - 1s ..... (ran ntpdate manually)

At 5:26 am, there was a "time quake" on the Xen server, which caused dom0
and all domU clocks to move ahead by 47 seconds.  Risk domU, running NTP,
corrected its clock at 5:59 am by abruptly jerking it back to normal time. 
Coop domU and dom0 also did the same thing a while later.  T5 domU, not
running NTP, never corrected itself.  I manually executed ntpdate on it.
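
For anyone who wants to reproduce the measurement, the monitoring approach
described above (poll each host with 'rsh <host> date +%s' once a minute and
log only offset changes larger than 2 seconds) could be sketched roughly as
follows.  This is not the original Perl script, only an illustration in
Python; the function names, the 20-second rsh timeout and the output layout
are assumptions, and the host names passed to monitor() would be your own.

```python
import subprocess
import time

THRESHOLD_S = 2  # report only changes larger than this, as described in the post


def remote_epoch(host):
    """Read a host's clock as epoch seconds via `rsh <host> date +%s`."""
    out = subprocess.check_output(["rsh", host, "date", "+%s"], timeout=20)
    return int(out.strip())


def changed_offsets(previous, current, threshold=THRESHOLD_S):
    """Return {host: offset} for hosts that are new or whose offset moved
    by more than `threshold` seconds since the previous poll."""
    return {
        host: offset
        for host, offset in current.items()
        if host not in previous or abs(offset - previous[host]) > threshold
    }


def poll(hosts):
    """Measure each host's offset (remote clock minus local clock) in seconds."""
    offsets = {}
    for host in hosts:
        try:
            offsets[host] = remote_epoch(host) - int(time.time())
        except (subprocess.SubprocessError, OSError, ValueError):
            continue  # host unreachable or bad output: skip it this round
    return offsets


def monitor(hosts, interval=60):
    """Poll forever, printing a log entry whenever an offset changes."""
    previous = {}
    while True:
        current = poll(hosts)
        changes = changed_offsets(previous, current)
        if changes:
            print("-" * 40)
            print(time.strftime("%a %b %d %H:%M:%S %Z %Y"))
            for host, offset in changes.items():
                print(f"{host:>15} = localtime {offset:+d}s")
        previous.update(current)
        time.sleep(interval)
```

Calling monitor(["dom0", "risk", "coop", "t5"]) (hypothetical host names)
from a machine outside the Xen server keeps the reference clock independent
of the clocks being measured.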

Several things are odd about this problem.  First, the "time quake" is exact
and reproducible: +47 seconds, the same as Dmitry saw.  My server is a dual
Xeon 5345 on a SuperMicro X7DBR-E motherboard.  The platform timer is
"3.579MHz ACPI PM Timer" (from xm dmesg).

Secondly, I thought NTP is supposed to adjust the clock gradually (-5ms each
second) instead of skipping many seconds at once.  (Or it might be running
the clock VERY SLOWLY for a few seconds to offset +47 secs.)  Thirdly, after
the initial "time quake", the domUs and dom0 had to correct their clocks
individually, at different times.
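
A possible explanation for the abrupt corrections, offered as an assumption
rather than something verified on this server: the ntpd documentation
describes a maximum slew rate of 500 ppm (0.5 ms per second) and a default
step threshold of 128 ms, above which ntpd steps the clock instead of
slewing it.  The arithmetic below, using those documented defaults, shows
why a 47-second offset would not be slewed away in practice.

```python
MAX_SLEW_PPM = 500        # ntpd's documented maximum slew rate (0.5 ms/s)
STEP_THRESHOLD_S = 0.128  # ntpd's documented default step threshold


def slew_seconds(offset_s, slew_ppm=MAX_SLEW_PPM):
    """Seconds needed to remove `offset_s` by slewing at `slew_ppm`."""
    return abs(offset_s) / (slew_ppm * 1e-6)


def correction_mode(offset_s, step_threshold_s=STEP_THRESHOLD_S):
    """Which correction the defaults would select for a given offset."""
    return "step" if abs(offset_s) > step_threshold_s else "slew"


if __name__ == "__main__":
    offset = 47.0  # the observed "time quake"
    hours = slew_seconds(offset) / 3600
    print(f"{offset:+.0f}s offset: mode={correction_mode(offset)}, "
          f"slewing alone would take about {hours:.1f} hours")
```

At 0.5 ms per second, removing 47 seconds by slewing alone would take about
94,000 seconds (roughly 26 hours), which would be consistent with each ntpd
instance stepping its clock on its own schedule.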

Although it is a long shot, I will try "clocksource=pit" on the Xen command
line this weekend...

P.S. The "+47 secs" jump often causes my Perl POE scripts to hang, which is
why this is a critical problem for me.
