[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] Re: [PATCH] clocksource=tsc
Ah, I changed my mind. I now think it's due to overflow of a 32-bit usecs variable in Linux's do_gettimeofday(). Cause is still the same though -- the guest assuming that the timestamp record from Xen will never be 'very' stale (in this case, older than 1h12m). -- Keir On 18/7/08 12:01, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote: > Okay, I'm pretty sure the reason your patch works is that system time as > calculated in the guest is still being continually calibrated by > local_time_calibration() in Xen. You do not turn off that calibration > function and hence it will continually update the cpu_time structure with > new timestamps and scale factors. Is this really intended? > > The reason mine doesn't work is that the time record exposed to the guest > *never* changes. The calculation in the guest is thus effectively: > sys_time = TSC * scale_factor > > The problem is that this is sensitive to small errors in the scale factor! > And unfortunately we compute a new scale factor, to convert to us, as > scale_factor_ns/1000. The resulting scale factor is obviously less accurate, > and leads to an accumulating absolute error as the TSC value gets large. > > So, when all is said and done, what are we actually trying to achieve here > and how should it really work? As far as I can see we cannot avoid sending > an updated time record to guests periodically. But running > local_time_calibration() only for guests and not for Xen system time seems > pretty weird to me. > > Perhaps we should push this all off until after 3.3? > > -- Keir > > On 18/7/08 08:24, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote: > >> You shouldn't have to modify get_s_time(). Guests will still be running the >> old algorithm (sys_stamp + (now_tsc - stamp_tsc) * scale_tsc), so existing >> get_s_time() ought to work. In fact it must work. >> >> -- Keir >> >> On 18/7/08 00:05, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote: >> >>> After trying lots of things, I fell back to my known >>> working version of time.c and adapted it forward to >>> match tip. The attached patch (on top of 18063) boots >>> clocksource=tsc and doesn't display the weirdness for >>> 2-1/2 hours of testing (always showed up around 1 hour before). >>> >>> It may not be elegant but it works and tip doesn't (with >>> clocksource=tsc). >>> >>> Dan >>> >>>> -----Original Message----- >>>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx] >>>> Sent: Wednesday, July 16, 2008 6:50 AM >>>> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail) >>>> Cc: Dave Winchell >>>> Subject: Re: [PATCH] clocksource=tsc >>>> >>>> >>>> That's a weird set of symptoms. Perhaps dom0's sense of system time is >>>> diverging from Xen's? I don't see that CPUs can diverge, if >>>> their TSCs are >>>> in sync, since we shouldn't be dynamically modifying the >>>> per-CPU timestamps >>>> or scale factors. >>>> >>>> -- Keir >>>> >>>> On 16/7/08 13:43, "Dan Magenheimer" >>>> <dan.magenheimer@xxxxxxxxxx> wrote: >>>> >>>>> Well now I have to take that back. It DOESN'T work yet. >>>>> I think I am experiencing "Weirdness can happen..." >>>>> when booting with clocksource=tsc... I was away from >>>>> the machine overnight but the symptoms I've seen before >>>>> are that the system becomes less snappy and eventually >>>>> grinds to a near-halt. >>>>> >>>>> Oddly, I can login most of the way on the console >>>>> and launch new xterm's in my VNC display, but I never >>>>> get a prompt, and I can't interrupt a process I left >>>>> running overnight in another xterm. The time display >>>>> in gnome seems to have frozen about an hour after >>>>> I booted. Pinging the machine works but ssh'ing to >>>>> it doesn't ("Connection closed") >>>>> >>>>>> -----Original Message----- >>>>>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx] >>>>>> Sent: Tuesday, July 15, 2008 10:12 PM >>>>>> To: 'dan.magenheimer@xxxxxxxxxx'; 'Keir Fraser'; >>>> 'Xen-Devel (E-mail)' >>>>>> Cc: 'Dave Winchell' >>>>>> Subject: RE: [PATCH] clocksource=tsc >>>>>> >>>>>> >>>>>> OK, ignore that. It looks like you have it all fixed >>>>>> in 18060. I tried it and it now boots. Thanks! >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx] >>>>>>> Sent: Tuesday, July 15, 2008 7:16 PM >>>>>>> To: 'dan.magenheimer@xxxxxxxxxx'; 'Keir Fraser'; 'Xen-Devel >>>>>> (E-mail)' >>>>>>> Cc: 'Dave Winchell' >>>>>>> Subject: RE: [PATCH] clocksource=tsc >>>>>>> >>>>>>> >>>>>>>>>> Returning to 32-bit read_counter(), and having NULL >>>>>>>>> read_counter when >>>>>>>>>> clocksource=tsc would be another possibility... >>>>>>> >>>>>>> Well I hacked on 18055 for awhile and just couldn't get it >>>>>>> to boot. I think local_time_calibration() (and thus >>>>>>> init_percpu_time()) is necessary for boot, though I'm not really >>>>>>> sure why. Possibly the "Weirdness can happen..." comment in >>>>>>> that routine? >>>>>>> >>>>>>> Anyway, this patch (on top of 18055) DOES work, returns to the >>>>>>> 32-bit read_counter, and re-enables local_time_calibration(). >>>>>>> I'd suggest putting off more major surgery for another day. >>>>>>> >>>>>>> Thanks, >>>>>>> Dan >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx] >>>>>>>> Sent: Tuesday, July 15, 2008 10:04 AM >>>>>>>> To: dan.magenheimer@xxxxxxxxxx; Keir Fraser; Xen-Devel (E-mail) >>>>>>>> Cc: Dave Winchell >>>>>>>> Subject: RE: [PATCH] clocksource=tsc >>>>>>>> >>>>>>>> >>>>>>>> Hmmm... 18055 also fails to boot on my machine. >>>>>>>> >>>>>>>> Could we perhaps fall back to my original patch and do >>>>>>>> cleanup later/separately? I also want to try implementing >>>>>>>> an hpet64-based get_s_time() so will be working more >>>>>>>> in this code later... but want to get clocksource=tsc >>>>>>>> working now with minimal code impact given the freeze. >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx] >>>>>>>>> Sent: Tuesday, July 15, 2008 9:46 AM >>>>>>>>> To: 'Keir Fraser'; 'Xen-Devel (E-mail)' >>>>>>>>> Cc: 'Dave Winchell' >>>>>>>>> Subject: RE: [PATCH] clocksource=tsc >>>>>>>>> >>>>>>>>>> Actually in this mode of operation we hardly need a platform >>>>>>>>>> timer *at all*. >>>>>>>>>> The idea is that we let the TSCs free-run, because we know >>>>>>>>>> they will behave. >>>>>>>>>> Returning to 32-bit read_counter(), and having NULL >>>>>>>>> read_counter when >>>>>>>>>> clocksource=tsc would be another possibility... >>>>>>>>> >>>>>>>>> That's essentially what the original tscstable.patch did, though >>>>>>>>> I was perhaps much uglier in the miscellaneous parts. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Dan >>>>>>>>> >>>>>>>> >>>>>>>> >>>>> >>>> >>>> >>>> >> >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@xxxxxxxxxxxxxxxxxxx >> http://lists.xensource.com/xen-devel > > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |