[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 3/6] x86/time: streamline platform time init on plt_update()



On 08/30/2016 01:31 PM, Jan Beulich wrote:
>>>> On 30.08.16 at 14:10, <joao.m.martins@xxxxxxxxxx> wrote:
>> What would be a preferred period - probably capping it to a 
>> day/hour/minutes? Or
>> keeping the current calculated value? Maximum right now in current platform 
>> timers
>> are few minutes depending on the HPET frequency.
> 
> Ideally I would see you just use the calculated value, unless that
> causes problems elsewhere. Depending on such possible problems
> would be what cap to alternatively use.

While sending v4 out, I spotted some issues with platform timer overflow
handling when we get close to u64 boundary. Specific to clocksource=tsc, and
setting up the plt_overflow, one would see at boot:

"Platform timer appears to have unexpectedly wrapped 10 or more times"

The counter doesn't wrap or stop at all.
But current overflowing checking goes like:

        count = plt_src.read_counter();
        plt_stamp64 += (count - plt_stamp) & plt_mask;

        now = NOW();
        plt_wrap = __read_platform_stime(plt_stamp64);
        plt_stamp = count;

        for (i=0;...)
        {

        plt_now = plt_wrap;
        plt_wrap = __read_platform_stime(plt_stamp64 + plt_mask + 1);

@@      if ( ABS(plt_wrap - now) > ABS(plt_now - now) )
            break; // didn't wrap around
        // did wrap
        plt_stamp64 += plt_mask + 1;

        }

        if (i!=0)
                "Platform timer appears to have unexpectedly wrapped "

Effectively the calculation in @@ doesn't handle the fact that for
clocksource=tsc, "ABS(plt_wrap - now)" would be == to "ABS(plt_now -
now)". Not that is right to be so, but TSC is a 64-bit counter and "plt_mask +
1" overflows the u64, turning the above condition into a comparison of same
value. For <= 32-bit platform timers the current math works fine, but reaching
the 64-bit boundary not so much. And then handling the wraparound further below
with "plt_stamp64 += plt_mask + 1" is effectively adding zeroes, which is
assuming that plt_stamp64/plt_stamp is alone enough to not represent any
discontinuity.

I am not sure we shouldn't handle this way at least for clocksource=tsc: we only
see issues when the delta of the two stamps overflows a u64 (computed with:
plt_stamp64 + (read_counter() - plt_stamp)). Keeping those two stamps updated
more often and we won't see issues. When the wraparound happens (in lots lots of
years away) we could not only update plt_stamp64 and plt_stamp, but also
increment stime_platform_stamp and platform_timer_stamp. This lets us compensate
when the stamps wraparound since for stime_platform_stamp (in ns) that would
only happen after STIME_MAX. Or as a simpler alternative, keeping this patch and
update plt_stamp64 (zeroing this one out) plus plt_stamp on
platform_time_calibration() as additional change.

Would that sound reasonable - am I overlooking something? To some extent this
might also applicable to the general case, although platform timer is now only
used for initial seeding so probably a non-visible issue.

Joao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.