[Xen-devel] Re: Large system boot problems
Keir Fraser wrote:
> On 8/2/08 15:22, "Bill Burns" <bburns@xxxxxxxxxx> wrote:
>
>>> But ultimately the calibration code should be robust to long delays before
>>> it is executed. It shouldn't go haywire. So something is bad there. Do you
>>> have a dump of the decision made by the calibration code on cpu0 the very
>>> first time it actually gets invoked? We probably need to trace the hell out
>>> of that first invocation to work out why it gets things so badly wrong.
>> I don't have more than in the earlier email where it shows the
>> large delta in tsc time, which seems to cause the bogus result.
>
> Okay, well looking at the inputs on that first invocation -- master_stime
> and local_stime -- they are totally out of sync. One says that 9.3s has
> elapsed since init_xen_time() was invoked, the other says that 4.6s has
> elapsed (curiously exactly half the time). The former is correct if the CPU
> really is a 3.4GHz part and is running at full speed for the duration. But
> you ought to be able to work out which is the correct ballpark by timing
> with a stopwatch the time between init_xen_time() and that first invocation
> on cpu0 of local_time_calibration() (you'll have to printk() when
> init_xen_time() is executed).
>
>  -- Keir
>
>

Well, I have a proposed fix that addresses the major symptom: dom0
reporting time going backwards and failing to initialize properly.
I must note that dom0 still reports the wrong speed for CPU0 when only
one iteration of local_time_calibration occurs before dom0 gets going.
I believe that second issue is probably due to the large delta between
the master and local stime.

The first call to local_time_calibration automatically fixes local
stime being behind. But when a significant amount of time has elapsed
before the initial call to local_time_calibration, the code that deals
with the local stime and tsc deltas is broken.
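To put a rough number on "a significant amount of time": the sketch
below is an illustration only, not Xen code. It assumes stime deltas are
counted in nanoseconds and reuses the condition of the first narrowing
loop in local_time_calibration (quoted later in this message).
narrowing_shifts is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a Xen function): count how many right shifts
 * the narrowing loop performs before a 64-bit elapsed system time fits
 * in 31 bits. Assumes the delta is non-negative and in nanoseconds.
 */
static unsigned int narrowing_shifts(uint64_t stime_elapsed64)
{
    unsigned int shifts = 0;

    /*
     * Same condition as the loop quoted in the message: keep halving
     * while the value does not fit in 32 bits or would be negative
     * when viewed as a signed 32-bit number (i.e. while it is >= 2^31).
     */
    while ( ((uint32_t)stime_elapsed64 != stime_elapsed64) ||
            ((int32_t)(uint32_t)stime_elapsed64 < 0) )
    {
        stime_elapsed64 >>= 1;
        shifts++;
    }

    return shifts;
}
```

With the figures from this thread, narrowing_shifts(9300000000ULL)
gives 3 and narrowing_shifts(4600000000ULL) gives 2. Any delta below
2^31 ns (about 2.147 s) gives 0, so under these assumptions the loop
only does any work when the first calibration runs more than roughly
two seconds after the previous reference point.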
When the 64-bit delta for local stime is shifted down to a 32-bit
value, the tsc delta is adjusted along with it, but the tsc_shift value
is not maintained. There are two loops. The first shifts both the stime
and tsc values in sync but fails to record the tsc shift (the line
marked ++ is the one the attached patch adds):

    while ( ((u32)stime_elapsed64 != stime_elapsed64) ||
            ((s32)stime_elapsed64 < 0) )
    {
        stime_elapsed64 >>= 1;
        tsc_elapsed64 >>= 1;
++      tsc_shift--;
    }

The second does the tsc shift alone, which is fine, but note that it
does record the tsc shift.

    /* tsc_elapsed <= 2*stime_elapsed */
    while ( tsc_elapsed64 > (stime_elapsed32 * 2) )
    {
        tsc_elapsed64 >>= 1;
        tsc_shift--;
    }

Making this one-line change, as in the attached patch, yields a
properly working dom0. Tested on both a small-memory and a large-memory
system.

 Bill

--- arch/x86/time.c.orig 2008-02-12 07:16:48.000000000 -0500
+++ arch/x86/time.c 2008-02-12 11:19:47.000000000 -0500
@@ -857,6 +857,7 @@ static void local_time_calibration(void
     {
         stime_elapsed64 >>= 1;
         tsc_elapsed64 >>= 1;
+        tsc_shift--;
     }

     /* stime_master_diff now fits in a 32-bit word. */

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel