[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 3/3] x86/time: don't move TSC backwards in time_calibration_tsc_rendezvous()
On Mon, Feb 08, 2021 at 02:59:55PM +0100, Jan Beulich wrote: > On 08.02.2021 14:19, Roger Pau Monné wrote: > > On Mon, Feb 08, 2021 at 12:22:25PM +0100, Jan Beulich wrote: > >> On 08.02.2021 10:38, Roger Pau Monné wrote: > >>> On Mon, Feb 01, 2021 at 01:43:28PM +0100, Jan Beulich wrote: > >>>> --- > >>>> Since CPU0 reads its TSC last on the first iteration, if TSCs were > >>>> perfectly sync-ed there shouldn't ever be a need to update. However, > >>>> even on the TSC-reliable system I first tested this on (using > >>>> "tsc=skewed" to get this rendezvous function into use in the first > >>>> place) updates by up to several thousand clocks did happen. I wonder > >>>> whether this points at some problem with the approach that I'm not (yet) > >>>> seeing. > >>> > >>> I'm confused by this, so on a system that had reliable TSCs, which > >>> you forced to remove the reliable flag, and then you saw big > >>> differences when doing the rendezvous? > >>> > >>> That would seem to indicate that such system doesn't really have > >>> reliable TSCs? > >> > >> I don't think so, no. This can easily be a timing effect from the > >> heavy cache line bouncing involved here. > >> > >> What I'm worried here seeing these updates is that I might still > >> be moving TSCs backwards in ways observable to the rest of the > >> system (i.e. beyond the inherent property of the approach), and > >> this then getting corrected by a subsequent rendezvous. But as > >> said - I can't see what this could result from, and hence I'm > >> inclined to assume these are merely effects I've not found a > >> good explanation for so far. > > > > I'm slightly worried by this, maybe because I'm misunderstanding part > > of the TSC sync stuff. > > > > So you forced a system that Xen would otherwise consider to have a > > reliable TSC (one that doesn't need a calibration rendezvous) into > > doing the calibration rendezvous, and then the skew seen is quite big. > > I would expect such skew to be minimal? As we would otherwise consider > > the system to not need calibration at all. > > > > This makes me wonder if the system does indeed need such calibration > > (which I don't think so), or if the calibration that we actually try > > to do is quite bogus? > > I wouldn't call it bogus, but it's not very precise. Hence me > saying that if we wanted to make the problematic system seen as > TSC_RELIABLE (and hence be able to switch from "tsc" to "std" > rendezvous), we'd first need to improve accuracy here quite a bit. > (I suspect sufficient accuracy can only be achieved by making use > of TSC_ADJUST, but that's not available on the reporter's hardware, > so of no immediate interest here.) Right, TSC_ADJUST does indeed seem to be a better way to adjust TSC, and to notice if firmware has skewed them. > > >>>> @@ -1719,9 +1737,12 @@ static void time_calibration_tsc_rendezv > >>>> while ( atomic_read(&r->semaphore) > total_cpus ) > >>>> cpu_relax(); > >>>> } > >>>> + > >>>> + /* Just in case a read above ended up reading zero. */ > >>>> + tsc += !tsc; > >>> > >>> Won't that be worthy of an ASSERT_UNREACHABLE? I'm not sure I see how > >>> tsc could be 0 on a healthy system after the loop above. > >> > >> It's not forbidden for the firmware to set the TSCs to some > >> huge negative value. Considering the effect TSC_ADJUST has on > >> the actual value read by RDTSC, I think I did actually observe > >> a system coming up this way, because of (not very helpful) > >> TSC_ADJUST setting by firmware. So no, no ASSERT_UNREACHABLE() > >> here. > > > > But then the code here will loop 5 times, and it's not possible for > > those 5 loops to read a TSC value of 0? I could see it reading 0 on > > one of the iterations but not in all of them. > > Sure, we can read zero at most once here. Yet the "if ( tsc == 0 )" > conditionals get executed on every iteration, while they must yield > "true" only on the first (from the variable's initializer); we > absolutely need to go the "else if()" path on CPU0 on the 2nd > iteration, and we also need to skip that part on later iterations > on the other CPUs (for CPU0 to then take the 2nd "else if()" path > from no later than the 3rd iteration onwards; the body of this of > course will only be executed on the last iteration). Oh, I see. Then I think I have no further comments. If you agree to adjust the cmpxchg please add by R-b. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |