[Xen-devel] Re: [PATCH] rendezvous-based local time calibration WOW!
Thanks, Dan! Of course, there are new features since 3.2 that I did not
include in my version-number-change announcement email. I'll make a
suitably updated list for the actual 4.0 release announcement.

 -- Keir

On 4/8/08 20:40, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:

> After two hours of constant samples with c/s 18229, max
> skew is at 251ns! That's 70-150x better than I was
> measuring just a couple of weeks ago. YMMV of course.
>
> If you are looking for another marketing-speak bullet for
> the 4.0 release announcement, you can call this:
>
> * Greatly improved precision for time-sensitive SMP VMs
>
> or as I am subject to American hyperbole:
>
> * Dramatically improved precision for time-sensitive SMP VMs
>
> Thanks again!
> Dan
>
>> -----Original Message-----
>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
>> Sent: Monday, August 04, 2008 11:37 AM
>> To: 'Keir Fraser'; 'Xen-Devel (E-mail)'
>> Cc: 'Ian Pratt'; 'Dave Winchell'
>> Subject: RE: [PATCH] rendezvous-based local time calibration WOW!
>>
>>
>> Looks good to me (and much cleaner). I've booted it and
>> will leave it running for a few hours.
>>
>> Thanks!
>> Dan
>>
>>> -----Original Message-----
>>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>>> Sent: Monday, August 04, 2008 11:10 AM
>>> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
>>> Cc: Ian Pratt; Dave Winchell
>>> Subject: Re: [PATCH] rendezvous-based local time calibration WOW!
>>>
>>>
>>> Applied as c/s 18229. I rewrote it quite a bit, although the
>>> principle remains the same.
>>>
>>>  -- Keir
>>>
>>> On 4/8/08 16:24, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
>>>
>>>> OK, how about this version. The rendezvous only collects
>>>> the key per-cpu time data then sets up a per-cpu 1ms timer
>>>> to later update the timestamp record and vcpu system time,
>>>> so neither should have racing issues.
>>>>
>>>> I've only run it for about an hour but still haven't seen
>>>> any skew over 600nsec so apparently it is the collection of
>>>> the key time data that must be closely synchronized (probably
>>>> to ensure the slope is correct) while exact synchronization
>>>> of setting the timestamp records is less important.
>>>>
>>>> Note that I'm not positive I got the clocksource=tsc part
>>>> correct... but am interested in your opinion on whether
>>>> clocksource=tsc can now be eliminated anyway (as the
>>>> main reason I pushed for it was because of unacceptable
>>>> skew which with this patch appears to be fixed).
>>>>
>>>> Signed-off-by: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
>>>>
>>>>> -----Original Message-----
>>>>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>>>>> Sent: Sunday, August 03, 2008 11:25 AM
>>>>> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
>>>>> Cc: Ian Pratt; Dave Winchell
>>>>> Subject: Re: [PATCH] rendezvous-based local time calibration WOW!
>>>>>
>>>>>
>>>>> It's not safe to poke a new timestamp record from an
>>>>> interrupt handler (which is what the smp_call_function()
>>>>> callback functions are). Users of the timestamp records
>>>>> (e.g., get_s_time) need local_irq_save/restore() or an
>>>>> equivalent of the Linux seqlock. The latter is likely faster.
>>>>> I'm dubious about update_vcpu_system_time() from an interrupt
>>>>> handler too. It needs thought about how it might race with a
>>>>> context switch (change of 'current') or if it interrupts an
>>>>> existing invocation of update_vcpu_system_time().
>>>>>
>>>>>  -- Keir
>>>>>
>>>>> On 3/8/08 17:50, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
>>>>>
>>>>>> The synchronization of local_time_calibration (l_t_c) via
>>>>>> round-to-nearest-epoch provided some improvement, but I was
>>>>>> still seeing skew up to 16usec and higher. I measured the
>>>>>> temporal distance between the rounded-epoch vs when ltc
>>>>>> was actually running to ensure there wasn't some kind of
>>>>>> bug and found that l_t_c was running up to 150us after the
>>>>>> round-epoch and sometimes up to 50us before. I guess this
>>>>>> is the granularity of setting a Xen timer. While it seemed
>>>>>> that +/- 100us shouldn't cause that much skew, I finally
>>>>>> decided to try synchronization-via-rendezvous, as suggested
>>>>>> by Ian here:
>>>>>>
>>>>>> http://lists.xensource.com/archives/html/xen-devel/2008-07/msg01074.html
>>>>>> http://lists.xensource.com/archives/html/xen-devel/2008-07/msg01080.html
>>>>>>
>>>>>> The result is phenomenal... using this approach (in attached
>>>>>> patch), I have yet to see a skew exceed 1usec!!! So this is
>>>>>> about a 10-fold increase in accuracy vs the rounded-epoch
>>>>>> method and about 20-fold over the one-epoch-from-NOW() method.
>>>>>>
>>>>>> The platform time is now read once for all processors rather
>>>>>> than once per processor. (Actually, it is read once again
>>>>>> in platform_time_calibration()... by "inlining" that routine
>>>>>> into master_local_time_calibration() that extra read can
>>>>>> be -- and probably should be -- avoided too.)
>>>>>>
>>>>>> It may be too late to get this into 3.3.0 but, if so, please
>>>>>> consider it asap for 3.3.1 rather than just xen-unstable/3.4.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>> ===================================
>>>>>> Thanks... for the memory
>>>>>> I really could use more / My throughput's on the floor
>>>>>> The balloon is flat / My swap disk's fat / I've OOM's in store
>>>>>> Overcommitted so much
>>>>>> (with apologies to the late great Bob Hope)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
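
Illustrative note on the seqlock suggestion in the thread: Keir's message of
August 3 says readers of the timestamp records need local_irq_save/restore()
or an equivalent of the Linux seqlock. The sketch below shows what a
seqlock-style reader and writer could look like. It is not the code from the
patch or from c/s 18229. It uses portable C11 atomics rather than any Xen or
Linux API, and the names (struct time_stamp, stamp_write, stamp_read) and the
two fields are hypothetical, chosen only for illustration. It also assumes a
single writer per record.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-CPU timestamp record; field names are illustrative only. */
struct time_stamp {
    atomic_uint seq;               /* even = stable, odd = update in progress */
    _Atomic uint64_t tsc_stamp;    /* local TSC value at calibration time */
    _Atomic uint64_t stime_stamp;  /* system time (ns) at calibration time */
};

/* Writer: assumed to be the only writer of this record. */
static void stamp_write(struct time_stamp *t, uint64_t tsc, uint64_t stime)
{
    unsigned s = atomic_load_explicit(&t->seq, memory_order_relaxed);

    assert((s & 1) == 0);          /* no other update may be in flight */

    /* Mark the record as "being updated" before touching the data. */
    atomic_store_explicit(&t->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&t->tsc_stamp, tsc, memory_order_relaxed);
    atomic_store_explicit(&t->stime_stamp, stime, memory_order_relaxed);

    /* Publish: the sequence number becomes even again. */
    atomic_store_explicit(&t->seq, s + 2, memory_order_release);
}

/* Reader: retries until it sees the same even sequence before and after. */
static void stamp_read(struct time_stamp *t, uint64_t *tsc, uint64_t *stime)
{
    unsigned s1, s2;

    do {
        s1 = atomic_load_explicit(&t->seq, memory_order_acquire);
        *tsc = atomic_load_explicit(&t->tsc_stamp, memory_order_relaxed);
        *stime = atomic_load_explicit(&t->stime_stamp, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s2 = atomic_load_explicit(&t->seq, memory_order_relaxed);
    } while ((s1 & 1) || s1 != s2);
}
```

With this pattern a reader that is interrupted by the writer sees the
sequence number change and retries, so it never uses a half-updated record
and does not need to disable interrupts. The reverse case is not safe: a
reader that runs in a context which has interrupted the writer on the same
CPU would spin forever, so the sketch only fits designs where that cannot
happen.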