[PATCH] warn if time calibration goes wacko (was RE: [Xen-devel] Xen 3.2.2 - Timer ISR/0: Time went backwards)
See below. Bad things will happen if these situations happen,
so at least we can diagnose it easier.

> -----Original Message-----
> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> Sent: Wednesday, August 06, 2008 9:47 AM
> To: 'Jan Beulich'; 'Christopher S. Aker'; 'Keir Fraser'
> Cc: 'xen-devel@xxxxxxxxxxxxxxxxxxx'
> Subject: RE: [Xen-devel] Xen 3.2.2 - Timer ISR/0: Time went backwards
>
> > 20-50+% timer interrupts. The moment this rate exceeds about 50%,
> > platform time calibration breaks (as it sets the timer to
> > half the overflow period).
>
> I've looked at that code in local_time_calibration() a few times
> and even added debug code once to see if it occurs. It
> didn't on my machine, but I can see how it would cause problems
> if it did happen.
>
> Keir, would you accept a patch (or just add the two lines yourself)
> to printk a warning if that "goto out" ever occurs and/or maybe
> if the "scale factor is clamped"?
>
> (Chris, this might not be your problem so apologies for the topic
> drift, but if the printk had been there awhile ago, we'd at least
> know if it is or is not the problem.)
>
> Dan
>
> P.S. This is also what led to the separate thread about measuring
> interrupt latency. If this problem is due to huge periods with
> interrupts off, it would be nice to know.
>
> > -----Original Message-----
> > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> > [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx]On Behalf Of Jan Beulich
> > Sent: Tuesday, August 05, 2008 1:04 AM
> > To: Christopher S. Aker
> > Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: Re: [Xen-devel] Xen 3.2.2 - Timer ISR/0: Time went backwards
> >
> > This looks very similar to bug report we've got from IBM I'm currently
> > trying to research (difficult, as I can't touch the hardware). What I
> > know so far is that we're losing, starting a few seconds after dom0
> > boot began, 20-50+% timer interrupts. The moment this rate exceeds
> > about 50%, platform time calibration breaks (as it sets the timer to
> > half the overflow period). Since jiffies aren't used much elsewhere,
> > this loss of timer ticks doesn't seem to matter much elsewhere.
> >
> > I've got no real clue so far *why* there's such a high rate of lost
> > interrupts, though. The only (albeit small, since appearing very
> > unlikely) possibility would be frequent and extensive SMM entries
> > after ACPI mode got enabled on the system.
> >
> > Btw., does -unstable exhibit the same behavior?
> >
> > Jan
> >
> > >>> "Christopher S. Aker" <caker@xxxxxxxxxxxx> 04.08.08 20:51 >>>
> > Hardware:
> > Xen: 3.2.1-rc2 64bit
> > dom0: 2.6.18.8 at changeset 622, PAE
> >
> > # xm dmesg | grep -e sync -e timer
> > (XEN) checking TSC synchronization across 8 CPUs: passed.
> > (XEN) Platform timer overflows in 234 jiffies.
> > (XEN) Platform timer is 3.579MHz ACPI PM Timer
> > (XEN) Machine check exception polling timer started.
> >
> > Spools one of these to console every few seconds:
> >
> > Timer ISR/0: Time went backwards: delta=-4270576170971
> > delta_cpu=254829029 shadow=2037844042151244163 off=261710497
> > processed=2037848312989081849 cpu_processed=2037844042158081849
> > 0: 2037844042158081849
> > 1: 2037828468354081849
> > 2: 2037848312989081849
> > 3: 2037837726866081849
> > 4: 2037842059197081849
> > 5: 2037840075526081849
> > 6: 2037845844663081849
> > 7: 2037841593777081849
> >
> > A few t's into Xen's console:
> >
> > (XEN) *** Serial input -> Xen (type 'CTRL-a' three times to switch input to DOM0)
> > (XEN) Min = 2037829427350793281 ; Max = 2037848310626701146 ; Diff = 18883275907865 (18883275907 microseconds)
> > (XEN) Min = 2037829428349256182 ; Max = 2037848311625163843 ; Diff = 18883275907661 (18883275907 microseconds)
> > (XEN) Min = 2037829428565188930 ; Max = 2037848311841096807 ; Diff = 18883275907877 (18883275907 microseconds)
> >
> > This particular box does this with 3.2.0 - 3.2.2-rc2. I have another
> > box doing the same thing, except the delta is more sane (0 - 2
> > microseconds), however eventually dom0 freezes.
> >
> > -Chris

Attachment: timewarn.patch

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
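
Jan's point that calibration breaks once more than about half of the timer ticks are lost can be illustrated with a small model. This is only a sketch and not Xen's local_time_calibration() code: it assumes a 24-bit ACPI PM timer counter at the nominal 3.579545 MHz, a software layer that extends the counter by sampling it at a tick-counted interval of half the overflow period, and lost ticks that stretch that interval by 1/(1 - loss_rate). All function and constant names below are hypothetical.

```python
# Illustrative model only; this is not Xen's calibration code.
COUNTER_BITS = 24            # assumption: 24-bit ACPI PM timer counter
FREQ_HZ = 3_579_545          # nominal ACPI PM timer frequency
MASK = (1 << COUNTER_BITS) - 1
OVERFLOW_S = (1 << COUNTER_BITS) / FREQ_HZ   # seconds for one full counter wrap


def effective_interval(loss_rate):
    """Real time between counter reads when the read is scheduled, in timer
    ticks, for half the overflow period and a fraction of ticks is lost."""
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    return (OVERFLOW_S / 2) / (1.0 - loss_rate)


def accumulated_seconds(interval_s, reads):
    """Extend the wrapping counter in software: on every read, add the
    masked difference to a running total. At most one wrap per read can
    be accounted for this way."""
    last = 0
    total_ticks = 0
    for i in range(1, reads + 1):
        true_ticks = round(i * interval_s * FREQ_HZ)
        now = true_ticks & MASK              # what the hardware counter shows
        total_ticks += (now - last) & MASK   # wrap-safe only below one period
        last = now
    return total_ticks / FREQ_HZ


def drift_after(loss_rate, reads=10):
    """Seconds by which the software clock lags real time after `reads` reads."""
    interval = effective_interval(loss_rate)
    return reads * interval - accumulated_seconds(interval, reads)


if __name__ == "__main__":
    for loss in (0.0, 0.4, 0.6):
        print(f"loss={loss:.0%} drift={drift_after(loss):.3f}s")
```

In this model a loss rate below 50% keeps each read inside one overflow period, so the accumulated time tracks real time. Above 50% every read misses a full wrap, and the software clock falls behind by one overflow period per read.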