[Xen-devel] RE: [PATCH] rendezvous-based local time calibration WOW!
After two hours of constant samples with c/s 18229, max skew
is at 251ns! That's 70-150x better than I was measuring just
a couple of weeks ago. YMMV of course.

If you are looking for another marketing-speak bullet
for the 4.0 release announcement, you can call this:

* Greatly improved precision for time-sensitive SMP VMs

or as I am subject to American hyperbole:

* Dramatically improved precision for time-sensitive SMP VMs

Thanks again!
Dan

> -----Original Message-----
> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> Sent: Monday, August 04, 2008 11:37 AM
> To: 'Keir Fraser'; 'Xen-Devel (E-mail)'
> Cc: 'Ian Pratt'; 'Dave Winchell'
> Subject: RE: [PATCH] rendezvous-based local time calibration WOW!
>
>
> Looks good to me (and much cleaner). I've booted it and
> will leave it running for a few hours.
>
> Thanks!
> Dan
>
> > -----Original Message-----
> > From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> > Sent: Monday, August 04, 2008 11:10 AM
> > To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
> > Cc: Ian Pratt; Dave Winchell
> > Subject: Re: [PATCH] rendezvous-based local time calibration WOW!
> >
> >
> > Applied as c/s 18229. I rewrote it quite a bit, although the
> > principle remains the same.
> >
> > -- Keir
> >
> > On 4/8/08 16:24, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
> >
> > > OK, how about this version. The rendezvous only collects
> > > the key per-cpu time data then sets up a per-cpu 1ms timer
> > > to later update the timestamp record and vcpu system time,
> > > so neither should have racing issues.
> > >
> > > I've only run it for about an hour but still haven't seen
> > > any skew over 600nsec so apparently it is the collection of
> > > the key time data that must be closely synchronized (probably
> > > to ensure the slope is correct) while exact synchronization
> > > of setting the timestamp records is less important.
> > >
> > > Note that I'm not positive I got the clocksource=tsc part
> > > correct... but am interested in your opinion on whether
> > > clocksource=tsc can now be eliminated anyway (as the
> > > main reason I pushed for it was because of unacceptable
> > > skew which with this patch appears to be fixed).
> > >
> > > Signed-off-by: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
> > >
> > >> -----Original Message-----
> > >> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> > >> Sent: Sunday, August 03, 2008 11:25 AM
> > >> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
> > >> Cc: Ian Pratt; Dave Winchell
> > >> Subject: Re: [PATCH] rendezvous-based local time calibration WOW!
> > >>
> > >>
> > >> It's not safe to poke a new timestamp record from an
> > >> interrupt handler (which is what the smp_call_function()
> > >> callback functions are). Users of the timestamp records
> > >> (e.g., get_s_time) need local_irq_save/restore() or an
> > >> equivalent of the Linux seqlock. The latter is likely faster.
> > >> I'm dubious about update_vcpu_system_time() from an
> > >> interrupt handler too. It needs thought about how it might
> > >> race with a context switch (change of 'current') or if it
> > >> interrupts an existing invocation of
> > >> update_vcpu_system_time().
> > >>
> > >> -- Keir
> > >>
> > >> On 3/8/08 17:50, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
> > >>
> > >>> The synchronization of local_time_calibration (l_t_c) via
> > >>> round-to-nearest-epoch provided some improvement, but I was
> > >>> still seeing skew up to 16usec and higher. I measured the
> > >>> temporal distance between the rounded-epoch vs when ltc
> > >>> was actually running to ensure there wasn't some kind of
> > >>> bug and found that l_t_c was running up to 150us after the
> > >>> round-epoch and sometimes up to 50us before. I guess this
> > >>> is the granularity of setting a Xen timer. While it seemed
> > >>> that +/- 100us shouldn't cause that much skew, I finally
> > >>> decided to try synchronization-via-rendezvous, as suggested
> > >>> by Ian here:
> > >>>
> > >>> http://lists.xensource.com/archives/html/xen-devel/2008-07/msg01074.html
> > >>> http://lists.xensource.com/archives/html/xen-devel/2008-07/msg01080.html
> > >>>
> > >>> The result is phenomenal... using this approach (in attached
> > >>> patch), I have yet to see a skew exceed 1usec!!! So this is
> > >>> about a 10-fold increase in accuracy vs the rounded-epoch
> > >>> method and about 20-fold over the one-epoch-from-NOW() method.
> > >>>
> > >>> The platform time is now read once for all processors rather
> > >>> than once per processor. (Actually, it is read once again
> > >>> in platform_time_calibration()... by "inlining" that routine
> > >>> into master_local_time_calibration() that extra read can
> > >>> be -- and probably should be -- avoided too.)
> > >>>
> > >>> It may be too late to get this into 3.3.0 but, if so, please
> > >>> consider it asap for 3.3.1 rather than just xen-unstable/3.4.
> > >>>
> > >>> Dan
> > >>>
> > >>> ===================================
> > >>> Thanks... for the memory
> > >>> I really could use more / My throughput's on the floor
> > >>> The balloon is flat / My swap disk's fat / I've OOM's in store
> > >>> Overcommitted so much
> > >>> (with apologies to the late great Bob Hope)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
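
Editorial note: Keir's message of August 3 says that users of the
timestamp records need local_irq_save/restore() or an equivalent of
the Linux seqlock. The following is a minimal sketch of the seqlock
idea in portable C11. It is not the code applied as c/s 18229 and it
does not use Xen's real data structures: the names stamp_record,
stamp_snapshot, stamp_write and stamp_read are hypothetical, and the
sketch assumes exactly one writer per record.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-CPU timestamp record.
 * Field names are illustrative, not taken from the Xen source. */
struct stamp_record {
    atomic_uint seq;                  /* even: stable, odd: update in progress */
    _Atomic uint64_t local_tsc;       /* TSC value sampled at calibration */
    _Atomic uint64_t system_time_ns;  /* system time matching that TSC value */
};

/* Plain copy handed back to readers. */
struct stamp_snapshot {
    uint64_t local_tsc;
    uint64_t system_time_ns;
};

/* Writer side. Assumes a single writer per record. */
static void stamp_write(struct stamp_record *r, uint64_t tsc, uint64_t ns)
{
    unsigned s = atomic_load_explicit(&r->seq, memory_order_relaxed);

    assert((s & 1u) == 0);            /* no nested or concurrent writer */

    /* Mark the record as "being updated" before touching the data. */
    atomic_store_explicit(&r->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&r->local_tsc, tsc, memory_order_relaxed);
    atomic_store_explicit(&r->system_time_ns, ns, memory_order_relaxed);

    /* Publish: the sequence number is even again, data is consistent. */
    atomic_store_explicit(&r->seq, s + 2, memory_order_release);
}

/* Reader side. Takes no lock and does not disable interrupts; it
 * retries if a writer was active or completed during the read. */
static struct stamp_snapshot stamp_read(struct stamp_record *r)
{
    struct stamp_snapshot out;
    unsigned before, after;

    do {
        before = atomic_load_explicit(&r->seq, memory_order_acquire);
        out.local_tsc =
            atomic_load_explicit(&r->local_tsc, memory_order_relaxed);
        out.system_time_ns =
            atomic_load_explicit(&r->system_time_ns, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&r->seq, memory_order_relaxed);
    } while (before != after || (before & 1u));

    return out;
}
```

A reader that is interrupted by an update on its own CPU sees the
sequence number change and retries, so it never returns a mix of old
and new fields. This is the property that makes the seqlock an
alternative to disabling interrupts around every read, at the cost of
an occasional retry.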