[Xen-devel] Workaround for buggy PIT



Hi all,

I have been using Xen for some time now and I am very happy with it;
thanks for all the good work.

The only problem I have had with Xen on our server is that sometimes
(usually a few times per day) the time in dom0 went berserk and started
running about three times faster. The only fix was to reboot the
machine.

I remember seeing similar problems reported on this list a long time
ago, although I can't locate them at the moment.

In my case, I traced the problem down to a buggy chipset. The VIA686a
PIT randomly loses its programming and needs to be reset. The Linux
kernel has a workaround for this, but it is not used under Xen, because
the hypervisor takes over control of the PIT.

I have implemented a similar workaround in the Xen hypervisor. I have
been running it for about three weeks now and the server is perfectly
stable.
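
To make the check easier to discuss, here is a minimal standalone sketch
of the detection logic only (no port I/O). It is not the patch itself:
the function names and the SKETCH_* constants are hypothetical, and the
values 1193180 and 100 are assumed stand-ins for CLOCK_TICK_RATE and HZ.
It mirrors the comparison in the patch, where both u16 operands are
promoted to int before the subtraction.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; the patch uses the hypervisor's
 * own CLOCK_TICK_RATE and HZ. */
#define SKETCH_CLOCK_TICK_RATE 1193180u
#define SKETCH_HZ              100u

/* Same rounding as the LATCH macro: PIT ticks per timer period. */
static uint32_t pit_latch(uint32_t clock_tick_rate, uint32_t hz)
{
    assert(hz > 0);
    return (clock_tick_rate + hz / 2) / hz;
}

/* Ticks elapsed between two reads of a down-counting counter.
 * Both operands are widened to int, as C does implicitly for u16, so
 * the result is negative when the counter has wrapped past zero. */
static int pit_delta(uint16_t stamp, uint16_t count)
{
    return (int)stamp - (int)count;
}

/* Nonzero when the delta exceeds twice the expected period, which is
 * the condition the workaround treats as lost programming. */
static int pit_looks_broken(uint16_t stamp, uint16_t count, uint32_t latch)
{
    return pit_delta(stamp, count) > (int)(2 * latch);
}
```

With these assumed values, pit_latch() gives 11932, so only a delta
above 23864 ticks is flagged. A read taken after the counter has wrapped
yields a negative delta in this form and is not flagged.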

I am interested in your comments, and I would be happy if you could
apply this patch to the Xen sources.

Thanks

Tomas Kopal



diff -r f4cef1aa2521 xen/arch/x86/time.c
--- a/xen/arch/x86/time.c       Tue Mar 14 16:09:34 2006 +0100
+++ b/xen/arch/x86/time.c       Tue Mar 14 19:14:35 2006 +0100
@@ -286,6 +286,7 @@ static u64 pit_counter64;
 static u64 pit_counter64;
 static u16 pit_stamp;
 
+#define LATCH (((CLOCK_TICK_RATE)+(HZ/2))/HZ)
 static u16 pit_read_counter(void)
 {
     u16 count;
@@ -293,6 +294,32 @@ static u16 pit_read_counter(void)
     outb(0x80, PIT_MODE);
     count  = inb(PIT_CH2);
     count |= inb(PIT_CH2) << 8;
+
+    /* VIA686 timer bug workaround.
+       The timer sometimes loses its programming, returning huge
+       counts. The delta counted should not be more than LATCH in
+       an ideal world, but because interrupts are sometimes disabled,
+       we can get longer intervals. Account for that by accepting
+       double the LATCH value.
+       To correct the error, we need to reset the counter. The
+       information about time elapsed is lost, so assume the interval
+       was LATCH long. This can make a slight difference in timer speed
+       if we were called from timer calibration code. It should not
+       be too much and it should be corrected on the next calibration round.
+    */
+    if (pit_stamp - count > 2 * LATCH)
+    {
+        printk(KERN_WARNING "PIT Timer HW error: %u\n", pit_stamp - count);
+        /* reset the timer */
+        outb(0xb0, PIT_MODE);           /* binary, mode 0, LSB/MSB, Ch 2 */
+        outb(LATCH & 0xff, PIT_CH2);    /* LSB of count */
+        outb(LATCH >> 8, PIT_CH2);      /* MSB of count */
+        /* reset the stamp */
+        pit_stamp = LATCH;
+        /* correct the count returned to make LATCH difference */
+        count = 0;
+    }
+
     return count;
 }
 
@@ -315,6 +342,13 @@ static void init_pit(void)
 static void init_pit(void)
 {
     read_platform_count = read_pit_count;
+
+    /* setup the timer */
+    spin_lock_irq(&platform_timer_lock);
+    outb(0xb0, PIT_MODE);           /* binary, mode 0, LSB/MSB, Ch 2 */
+    outb(0, PIT_CH2);               /* LSB of count */
+    outb(0, PIT_CH2);               /* MSB of count */
+    spin_unlock_irq(&platform_timer_lock);
 
     pit_overflow();
     platform_timer_stamp = pit_counter64;