
Re: [Xen-users] [Xen-devel] Xen domU Timekeeping

2012/2/16 Qrux <qrux.qed@xxxxxxxxx>:
> Thanks, Ian and Konrad, for your responses.  More questions inline...

This is really interesting. Unfortunately I still don't fully
understand how things are supposed to work in a pvops domU.
Currently we have an old PV kernel setup that relies on not having an
independent wallclock. Having all domUs running the same time on a
sub-millisecond level is, err, was, one of the killer features of Xen.
On the other hand half of all linux distro kernels had it broken :/

> Obviously, a tick-counting kernel will generate far more interrupts.  OTOH, 
> one-shot tickless timekeeping has higher latencies, AFAIK.  So, if there *is* 
> a preferred configuration for good timekeeping...

We're being far too vague there. That's why there was stuff like
max_clock_jitter in Xen in the first place.

Could one of you explain how the timekeeping overhead in a pvops domU
differs from the old method? (Let me add that I've already fired up
something like 200 alpinelinux domUs that run pvops and still saw
typical Xen-like low overhead.)
And I think Qrux' question regarding NO_HZ is also important. There
are cases where clock jitter is really intolerable and NTP alone is
NOT suitable to fight clocks suddenly going faster. NTP copes well
with a certain, constant drift, but not with a clock that goes
nuts like one would see when running old CentOS on VMware or

> If one doesn't use NO_HZ, are pvops kernels counting ticks to keep time?  If 
> so, which clocksource does that use?  PIT?  And, what are the proper options 
> for that?

Specifically: Which is the right clocksource for pvops?
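
A guest can at least report which clocksource its kernel picked. Here
is a minimal Python sketch, assuming the standard Linux sysfs
clocksource directory; the helper name read_clocksources is made up
for illustration. It only shows what is in use and what is on offer,
not which choice is right for pvops:

```python
from pathlib import Path

# Standard Linux sysfs location for clocksource information.
SYSFS_CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=SYSFS_CLOCKSOURCE):
    """Return (current, available) clocksource names.

    Returns (None, []) if the sysfs files cannot be read.
    """
    base = Path(base)
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        return None, []
    return current, available


if __name__ == "__main__":
    current, available = read_clocksources()
    print("current:  ", current)
    print("available:", " ".join(available))
```

Running this in dom0 and in a domU and comparing the two outputs is a
cheap way to see whether the guest is really using a different source.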

> I'm thoroughly baffled.  Konrad basically said that domUs uses the same time 
> as the hypervisor.  That sort of implies dependent_wallclock (at least in 
> operation, if not explicitly as a kernel feature).  But, then, Ian said that 
> we need to run NTP in a pvops guest.  Which sort of implies 
> independent_wallclock.

I think they mean the same thing:
Konrad is saying that all domUs start out with the RTC time. Which,
btw, is not transparent: the special casing is still there, it just
shows up now as special hardware instead of a special case in the
kernel. For example, hwclock fails, which gives false alerts at
shutdown time.

And I think Ian is saying you need to run NTP because Linux doesn't
care about the RTC (at least, not about the TIME it holds) once it has
booted, and something needs to keep an eye on the kernel clock to
avoid drift.

> In addition, I believe NTP uses adjtimex (on Linux) to discipline the kernel 
> clock--but  not RTC (see usage).  Which would imply that it's safe to run on 
> domU (though, how accurate it can be...is unclear).

Yes, it tries. And it is not good at fighting unstable, erratic drift.
A recent batch of Fujitsu workstations had a BIOS with funny timer
issues. The BIOS time was super stable, but the kernel wallclock was
doing 2-3s per sec. NTP is totally unable to correct something like
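
To put rough numbers on that: ntpd limits its frequency correction
(slew) to 500 parts per million. The sketch below, with made-up helper
names drift_ppm and ntp_can_slew, compares ordinary drift with a clock
doing 2s per sec. The "4 seconds per day" figure is just an
illustrative assumption, not a measurement:

```python
# ntpd's maximum frequency correction (slew) is 500 parts per million.
NTP_MAX_SLEW_PPM = 500.0


def drift_ppm(clock_seconds, real_seconds):
    """Frequency error of a clock, in ppm, over a measured interval."""
    return (clock_seconds - real_seconds) / real_seconds * 1e6


def ntp_can_slew(clock_seconds, real_seconds, limit_ppm=NTP_MAX_SLEW_PPM):
    """True if the frequency error is within the slew limit."""
    return abs(drift_ppm(clock_seconds, real_seconds)) <= limit_ppm


if __name__ == "__main__":
    # Illustrative ordinary drift: 4 s gained over one day (86400 s).
    print(drift_ppm(86404, 86400), ntp_can_slew(86404, 86400))
    # A clock doing 2 seconds per real second.
    print(drift_ppm(2, 1), ntp_can_slew(2, 1))
```

The first case is a few tens of ppm and well within reach. The second
is a million ppm, about 2000 times the slew limit, so all NTP can do
is step the clock over and over.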

> If a pvops domU cannot see the RTC, it has no fracking idea what time it is 
> (calendar/wall time).  It only knows what *relative* time it is, w.r.t. when 
> it was created.  Now, on most bare-metal setups, NTP comes into the process 
> way late.  Like, rc3.d-late.  Which means if we're doing disk-mounts in 
> rcS.d, fsck will have no idea what time it is.

I think that's not true, it should have a reasonably correct idea of "time".
Maybe 5 seconds off in a troublesome world, but not much more.

> Are there bad interactions if fsck doesn't have any clue what time it is? 
>  Does this also mean that NTP should get started earlier?  And, if so, does 
> that imply networking would need to be moved way up in rcS.d?  I assume there 
> *must* be guidelines about what *needs* to go into pvops domU 
> /etc/{rc,init}.d/, and suggested orderings.

You can try to use the _netdev flag with all but / and /usr, that way
they'll be mounted after networking is up. (The HP-UX admin in me is
going all ROFLMAO at this)
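
If you go that way, it helps to check which mounts actually carry the
flag. A small Python sketch; the fstab content, the device names and
the helper name netdev_mounts are all hypothetical, and in real use
you would read /etc/fstab instead of the sample string:

```python
def netdev_mounts(fstab_text):
    """Return the mount points whose options include _netdev."""
    result = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Need at least: device, mount point, fs type, options.
        if len(fields) < 4:
            continue
        mount_point, options = fields[1], fields[3].split(",")
        if "_netdev" in options:
            result.append(mount_point)
    return result


# Hypothetical fstab: everything except / and /usr is flagged.
SAMPLE_FSTAB = """\
# device      mount  type  options           dump pass
/dev/xvda1    /      ext4  defaults          0    1
/dev/xvda2    /usr   ext4  defaults          0    2
/dev/xvda3    /var   ext4  defaults,_netdev  0    2
/dev/xvda4    /home  ext4  defaults,_netdev  0    2
"""

if __name__ == "__main__":
    print(netdev_mounts(SAMPLE_FSTAB))
```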

> And, finally, for my specific situation, does this help explain why my ext4 
> domU (which uses an LVM volume formatted with ext4 on dom0) won't reboot, but 
> a ext3 one will?  Is ext4 pickier about times during the loading of its 
> driver?

Maybe yes. ext4 is less rubbish than ext3 and can finally detect some
internal errors.

