[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Re: [PATCH] clocksource=tsc



Ah, I changed my mind. I now think it's due to overflow of a 32-bit usecs
variable in Linux's do_gettimeofday(). Cause is still the same though -- the
guest assuming that the timestamp record from Xen will never be 'very' stale
(in this case, older than 1h12m).

 -- Keir

On 18/7/08 12:01, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote:

> Okay, I'm pretty sure the reason your patch works is that system time as
> calculated in the guest is still being continually calibrated by
> local_time_calibration() in Xen. You do not turn off that calibration
> function and hence it will continually update the cpu_time structure with
> new timestamps and scale factors. Is this really intended?
> 
> The reason mine doesn't work is that the time record exposed to the guest
> *never* changes. The calculation in the guest is thus effectively:
>  sys_time = TSC * scale_factor
> 
> The problem is that this is sensitive to small errors in the scale factor!
> And unfortunately we compute a new scale factor, to convert to us, as
> scale_factor_ns/1000. The resulting scale factor is obviously less accurate,
> and leads to an accumulating absolute error as the TSC value gets large.
> 
> So, when all is said and done, what are we actually trying to achieve here
> and how should it really work? As far as I can see we cannot avoid sending
> an updated time record to guests periodically. But running
> local_time_calibration() only for guests and not for Xen system time seems
> pretty weird to me.
> 
> Perhaps we should push this all off until after 3.3?
> 
>  -- Keir
> 
> On 18/7/08 08:24, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote:
> 
>> You shouldn't have to modify get_s_time(). Guests will still be running the
>> old algorithm (sys_stamp + (now_tsc - stamp_tsc) * scale_tsc), so existing
>> get_s_time() ought to work. In fact it must work.
>> 
>>  -- Keir
>> 
>> On 18/7/08 00:05, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
>> 
>>> After trying lots of things, I fell back to my known
>>> working version of time.c and adapted it forward to
>>> match tip.  The attached patch (on top of 18063) boots
>>> clocksource=tsc and doesn't display the weirdness for
>>> 2-1/2 hours of testing (always showed up around 1 hour before).
>>> 
>>> It may not be elegant but it works and tip doesn't (with
>>> clocksource=tsc).
>>> 
>>> Dan
>>> 
>>>> -----Original Message-----
>>>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>>>> Sent: Wednesday, July 16, 2008 6:50 AM
>>>> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
>>>> Cc: Dave Winchell
>>>> Subject: Re: [PATCH] clocksource=tsc
>>>> 
>>>> 
>>>> That's a weird set of symptoms. Perhaps dom0's sense of system time is
>>>> diverging from Xen's? I don't see that CPUs can diverge, if
>>>> their TSCs are
>>>> in sync, since we shouldn't be dynamically modifying the
>>>> per-CPU timestamps
>>>> or scale factors.
>>>> 
>>>>  -- Keir
>>>> 
>>>> On 16/7/08 13:43, "Dan Magenheimer"
>>>> <dan.magenheimer@xxxxxxxxxx> wrote:
>>>> 
>>>>> Well now I have to take that back.  It DOESN'T work yet.
>>>>> I think I am experiencing "Weirdness can happen..."
>>>>> when booting with clocksource=tsc... I was away from
>>>>> the machine overnight but the symptoms I've seen before
>>>>> are that the system becomes less snappy and eventually
>>>>> grinds to a near-halt.
>>>>> 
>>>>> Oddly, I can login most of the way on the console
>>>>> and launch new xterm's in my VNC display, but I never
>>>>> get a prompt, and I can't interrupt a process I left
>>>>> running overnight in another xterm.  The time display
>>>>> in gnome seems to have frozen about an hour after
>>>>> I booted.  Pinging the machine works but ssh'ing to
>>>>> it doesn't ("Connection closed")
>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
>>>>>> Sent: Tuesday, July 15, 2008 10:12 PM
>>>>>> To: 'dan.magenheimer@xxxxxxxxxx'; 'Keir Fraser';
>>>> 'Xen-Devel (E-mail)'
>>>>>> Cc: 'Dave Winchell'
>>>>>> Subject: RE: [PATCH] clocksource=tsc
>>>>>> 
>>>>>> 
>>>>>> OK, ignore that.  It looks like you have it all fixed
>>>>>> in 18060.  I tried it and it now boots.  Thanks!
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
>>>>>>> Sent: Tuesday, July 15, 2008 7:16 PM
>>>>>>> To: 'dan.magenheimer@xxxxxxxxxx'; 'Keir Fraser'; 'Xen-Devel
>>>>>> (E-mail)'
>>>>>>> Cc: 'Dave Winchell'
>>>>>>> Subject: RE: [PATCH] clocksource=tsc
>>>>>>> 
>>>>>>> 
>>>>>>>>>> Returning to 32-bit read_counter(), and having NULL
>>>>>>>>> read_counter when
>>>>>>>>>> clocksource=tsc would be another possibility...
>>>>>>> 
>>>>>>> Well I hacked on 18055 for awhile and just couldn't get it
>>>>>>> to boot.  I think local_time_calibration() (and thus
>>>>>>> init_percpu_time()) is necessary for boot, though I'm not really
>>>>>>> sure why.  Possibly the "Weirdness can happen..." comment in
>>>>>>> that routine?
>>>>>>> 
>>>>>>> Anyway, this patch (on top of 18055) DOES work, returns to the
>>>>>>> 32-bit read_counter, and re-enables local_time_calibration().
>>>>>>> I'd suggest putting off more major surgery for another day.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Dan
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
>>>>>>>> Sent: Tuesday, July 15, 2008 10:04 AM
>>>>>>>> To: dan.magenheimer@xxxxxxxxxx; Keir Fraser; Xen-Devel (E-mail)
>>>>>>>> Cc: Dave Winchell
>>>>>>>> Subject: RE: [PATCH] clocksource=tsc
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Hmmm... 18055 also fails to boot on my machine.
>>>>>>>> 
>>>>>>>> Could we perhaps fall back to my original patch and do
>>>>>>>> cleanup later/separately?  I also want to try implementing
>>>>>>>> an hpet64-based get_s_time() so will be working more
>>>>>>>> in this code later... but want to get clocksource=tsc
>>>>>>>> working now with minimal code impact given the freeze.
>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
>>>>>>>>> Sent: Tuesday, July 15, 2008 9:46 AM
>>>>>>>>> To: 'Keir Fraser'; 'Xen-Devel (E-mail)'
>>>>>>>>> Cc: 'Dave Winchell'
>>>>>>>>> Subject: RE: [PATCH] clocksource=tsc
>>>>>>>>> 
>>>>>>>>>> Actually in this mode of operation we hardly need a platform
>>>>>>>>>> timer *at all*.
>>>>>>>>>> The idea is that we let the TSCs free-run, because we know
>>>>>>>>>> they will behave.
>>>>>>>>>> Returning to 32-bit read_counter(), and having NULL
>>>>>>>>> read_counter when
>>>>>>>>>> clocksource=tsc would be another possibility...
>>>>>>>>> 
>>>>>>>>> That's essentially what the original tscstable.patch did, though
>>>>>>>>> I was perhaps much uglier in the miscellaneous parts.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Dan
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 
>> 
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>> http://lists.xensource.com/xen-devel
> 
> 



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.