RE: [Xen-devel] [PATCH] Add a timer mode that disables pending missed ticks
OK, Deepak repeated the test without ntpd and using ntpdate -b before the
test. The attached graph shows his results: el5u1-64 (best=~0.07%),
el4u5-64 (middle=~0.2%), and el4u5-32 (worst=~0.3%). We will continue to
look at LTP to try to isolate.

Thanks,
Dan

P.S. elXuY is essentially RHEL XuY with some patches.

> -----Original Message-----
> From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
> Sent: Wednesday, January 30, 2008 2:45 PM
> To: Deepak Patel
> Cc: dan.magenheimer@xxxxxxxxxx; Keir Fraser;
> xen-devel@xxxxxxxxxxxxxxxxxxx; akira.ijuin@xxxxxxxxxx; Dave Winchell
> Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables pending
> missed ticks
>
> Dan, Deepak,
>
> It may be that the underlying clock error is too great for ntp
> to handle. It would be useful if you did not run ntpd and, instead,
> did ntpdate -b <timeserver> at the start of the test for each guest.
> Then capture the data as you have been doing.
> If the drift is greater than .05%, then we need to address that.
>
> Another option, when running ntpd, is to enable loop statistics by
> adding this to /etc/ntp.conf:
>
> statistics loopstats
> statsdir /var/lib/ntp/
>
> Then you will see loop data in that directory.
> Correlating the data in the loopstats files with the
> peaks in skew would be interesting. You will see entries of the form
>
> 54495 76787.701 -0.045153303 -132.569229 0.020806776 239.735511 10
>
> where the second-to-last column is the Allan deviation. When that
> gets over 1000, ntpd is working pretty hard. However, I have not
> seen ntpd completely lose it like you have.
>
> I'm on vacation until Monday, and won't be reading email.
>
> Thanks for all your work on this!
>
> -Dave
>
> Deepak Patel wrote:
>
>>> Is the graph for RHEL5u1-64? (I've never tested this one.)
>>
>> I do not know which graph was attached with this. But I saw this
>> behavior in EL4u5-32, EL4u5-64, and EL5u1-64 hvm guests when I
>> was running ltp tests continuously.
>>
>>> What was the behaviour of the other guests running?
>>
>> All pvm guests are fine. But the behavior of most of the hvm guests
>> was as described.
>>
>>> If they had spikes, were they at the same wall time?
>>
>> No. They are not at the same wall time.
>>
>>> Were the other guests running ltp as well?
>>
>> Yes, all 6 guests (4 hvm and 2 pvm) were running ltp continuously.
>>
>>> How are you measuring skew?
>>
>> I was collecting the output of "ntpdate -q <timeserver>" every
>> 300 seconds (5 minutes) and have created the graph based on that.
>>
>>> Are you running ntpd?
>>
>> Yes. ntp was running on all the guests.
>>
>> I am investigating what causes these spikes and will let everyone
>> know what my findings are.
>>
>> Thanks,
>> Deepak
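The measurement method described here (sync once with ntpdate -b at the
start, then poll "ntpdate -q <timeserver>" every 300 seconds) is simple to
script. Below is a minimal sketch in Python; the server name is a
placeholder, and the offset parsing assumes ntpdate's usual
"offset <seconds>" output, so treat it as illustrative:

    import re
    import subprocess
    import time

    SERVER = "timeserver.example.com"   # placeholder, use a local timeserver
    INTERVAL = 300                      # seconds between samples, as in Deepak's runs

    def query_offset(server):
        # "ntpdate -q" prints a line like:
        #   server 10.0.0.1, stratum 2, offset -0.004505, delay 0.02573
        out = subprocess.run(["ntpdate", "-q", server],
                             capture_output=True, text=True).stdout
        m = re.search(r"offset\s+(-?[\d.]+)", out)
        return float(m.group(1)) if m else None

    start = time.time()                 # assumes the clock was just set (ntpdate -b)
    while True:
        time.sleep(INTERVAL)
        offset = query_offset(SERVER)
        if offset is not None:
            elapsed = time.time() - start
            # average drift since the initial sync, comparable to the
            # .05% threshold discussed in this thread
            print("%8.0fs  offset=%+.6fs  drift=%+.4f%%"
                  % (elapsed, offset, 100.0 * offset / elapsed))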
>>> Anything that you can discover that would be in sync with
>>> the spikes would be very helpful!
>>>
>>> The code that I test with is our product code, which is based
>>> on 3.1. So it is possible that something in 3.2 other than vpt.c
>>> is the cause. I can test with 3.2, if necessary.
>>>
>>> thanks,
>>> Dave
>>>
>>> Dan Magenheimer wrote:
>>>
>>>> Hi Dave (Keir, see suggestion below) --
>>>>
>>>> Thanks!
>>>>
>>>> Turning off vhpet certainly helps a lot (though see below).
>>>>
>>>> I wonder if timekeeping with vhpet is so bad that it should be
>>>> turned off by default (in 3.1, 3.2, and unstable) until it is
>>>> fixed? (I have a patch that defaults it off; I can post it if
>>>> there is agreement on the above point.) The whole point of an
>>>> HPET is to provide more precise timekeeping, and if vhpet is
>>>> worse than vpit, it can only confuse users. Comments?
>>>>
>>>> In your testing, are you just measuring % skew over a long
>>>> period of time? We are graphing the skew continuously and
>>>> seeing periodic behavior that is unsettling, even with pit.
>>>> See attached. Though your algorithm recovers, the "cliffs"
>>>> could still cause real user problems. I wonder if there is
>>>> anything that can be done to make the "recovery" more responsive?
>>>>
>>>> We are looking into what part(s) of LTP are causing the cliffs.
>>>>
>>>> Thanks,
>>>> Dan
>>>>
>>>>> -----Original Message-----
>>>>> From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
>>>>> Sent: Monday, January 28, 2008 8:21 AM
>>>>> To: dan.magenheimer@xxxxxxxxxx
>>>>> Cc: Keir Fraser; xen-devel@xxxxxxxxxxxxxxxxxxx;
>>>>> deepak.patel@xxxxxxxxxx; akira.ijuin@xxxxxxxxxx; Dave Winchell
>>>>> Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables
>>>>> pending missed ticks
>>>>>
>>>>> Dan,
>>>>>
>>>>> I guess I'm a bit out of date calling for clock= usage.
>>>>> Looking at linux 2.6.20.4 sources, I think you should specify
>>>>> "clocksource=pit nohpet" on the linux guest bootline.
>>>>>
>>>>> You can leave the xen and dom0 bootlines as they are.
>>>>> The xen and guest clocksources do not need to be the same.
>>>>> In my tests, xen is using the hpet for its timekeeping and
>>>>> that appears to be the default.
>>>>>
>>>>> When you boot the guests you should see
>>>>> time.c: Using PIT/TSC based timekeeping.
>>>>> on the rh4u5-64 guest, and something similar on the others.
>>>>>
>>>>>> (xm dmesg shows 8x Xeon 3.2GHz stepping 04, Platform timer
>>>>>> 14.318MHz HPET.)
>>>>>
>>>>> This appears to be the xen state, which is fine.
>>>>> I was wrongly assuming that this was the guest state.
>>>>> You might want to look in your guest logs and see what they
>>>>> were picking for a clock source.
>>>>>
>>>>> Regards,
>>>>> Dave
>>>>>
>>>>> Dan Magenheimer wrote:
>>>>>
>>>>>> Thanks, I hadn't realized that! No wonder we didn't see the same
>>>>>> improvement you saw!
>>>>>>
>>>>>>> Try specifying clock=pit on the linux boot line...
>>>>>>
>>>>>> I'm confused... do you mean "clocksource=pit" on the Xen
>>>>>> command line, or "nohpet" / "clock=pit" / "clocksource=pit" on
>>>>>> the guest (or dom0?) command line? Or both places? Since the
>>>>>> tests take awhile, it would be nice to get this right the
>>>>>> first time. Do the Xen and guest clocksources need to be the
>>>>>> same?
>>>>>>
>>>>>> Thanks,
>>>>>> Dan
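Dave's note above gives the check: after booting with "clocksource=pit
nohpet", the guest's boot log should show the PIT being used. A small
sketch for verifying what a guest actually picked; the sysfs node only
exists on newer kernels, and the dmesg pattern is the 2.6.20-era line
quoted above, so both are assumptions:

    import os
    import re
    import subprocess

    # Newer kernels export the active clocksource via sysfs.
    SYSFS = "/sys/devices/system/clocksource/clocksource0/current_clocksource"

    def guest_timekeeping():
        if os.path.exists(SYSFS):
            return open(SYSFS).read().strip()
        # Older kernels (the 2.6.20-era guests in this thread) log the
        # choice at boot, e.g. "time.c: Using PIT/TSC based timekeeping."
        dmesg = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
        m = re.search(r"time\.c: Using (.+) based timekeeping", dmesg)
        return m.group(1) if m else "unknown"

    print("guest timekeeping:", guest_timekeeping())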
>>>>>>
>>>>>> -----Original Message-----
>>>>>> *From:* Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
>>>>>> *Sent:* Sunday, January 27, 2008 2:22 PM
>>>>>> *To:* dan.magenheimer@xxxxxxxxxx; Keir Fraser
>>>>>> *Cc:* xen-devel@xxxxxxxxxxxxxxxxxxx; deepak.patel@xxxxxxxxxx;
>>>>>> akira.ijuin@xxxxxxxxxx; Dave Winchell
>>>>>> *Subject:* RE: [Xen-devel] [PATCH] Add a timer mode that disables
>>>>>> pending missed ticks
>>>>>>
>>>>>> Hi Dan,
>>>>>>
>>>>>> The hpet timer does have a fairly large error, as I was trying
>>>>>> this one recently. I don't remember what I got for error, but 1%
>>>>>> sounds about right. The problem is that hpet is not built on top
>>>>>> of vpt.c, the module Keir and I did all the recent work in for
>>>>>> its periodic timer needs. Try specifying clock=pit on the linux
>>>>>> boot line. If it still picks the hpet, which it might, let me
>>>>>> know and I'll tell you how to get around this.
>>>>>>
>>>>>> Regards,
>>>>>> Dave
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>> *From:* Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
>>>>>> *Sent:* Fri 1/25/2008 6:50 PM
>>>>>> *To:* Dave Winchell; Keir Fraser
>>>>>> *Cc:* xen-devel@xxxxxxxxxxxxxxxxxxx; deepak.patel@xxxxxxxxxx;
>>>>>> akira.ijuin@xxxxxxxxxx
>>>>>> *Subject:* RE: [Xen-devel] [PATCH] Add a timer mode that disables
>>>>>> pending missed ticks
>>>>>>
>>>>>> Sorry for the very late followup on this, but we finally were
>>>>>> able to get our testing set up again on stable 3.1 bits and have
>>>>>> seen some very bad results on 3.1.3-rc1, on the order of 1%.
>>>>>>
>>>>>> The test environment was a 4-socket dual core machine with 24GB
>>>>>> of memory running six two-vcpu 2GB domains, four hvm plus two pv.
>>>>>> All six guests were running LTP simultaneously. The four hvm
>>>>>> guests were: RHEL5u1-64, RHEL4u5-32, RHEL5-64, and RHEL4u5-64.
>>>>>> Timer_mode was set to 2 for 64-bit guests and 0 for 32-bit
>>>>>> guests. All four hvm guests experienced skew around -1%, even
>>>>>> the 32-bit guest. Less intensive testing didn't exhibit much
>>>>>> skew at all.
>>>>>>
>>>>>> A representative graph is attached.
>>>>>>
>>>>>> Dave, I wonder if some portion of your patches didn't end up in
>>>>>> the xen trees?
>>>>>>
>>>>>> (xm dmesg shows 8x Xeon 3.2GHz stepping 04, Platform timer
>>>>>> 14.318MHz HPET.)
>>>>>>
>>>>>> Thanks,
>>>>>> Dan
>>>>>>
>>>>>> P.S. Many thanks to Deepak and Akira for running tests.
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>>>>>> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
>>>>>>> Dave Winchell
>>>>>>> Sent: Wednesday, January 09, 2008 9:53 AM
>>>>>>> To: Keir Fraser
>>>>>>> Cc: dan.magenheimer@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx;
>>>>>>> Dave Winchell
>>>>>>> Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables
>>>>>>> pending missed ticks
>>>>>>>
>>>>>>> Hi Keir,
>>>>>>>
>>>>>>> The latest change, c/s 16690, looks fine.
>>>>>>> I agree that the code in c/s 16690 is equivalent to
>>>>>>> the code I submitted.
>>>>>>> Also, your version is more concise.
>>>>>>>
>>>>>>> The error tests confirm the equivalence. With overnight cpu
>>>>>>> loads, the checked-in version was accurate to +.048% for sles
>>>>>>> and +.038% for red hat. My version was +.046% and +.032% in a
>>>>>>> 2 hour test. I don't think the difference is significant.
>>>>>>>
>>>>>>> i/o loads produced errors of +.01%.
>>>>>>>
>>>>>>> Thanks for all your efforts on this issue.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Dave
>>>>>>>
>>>>>>> Keir Fraser wrote:
>>>>>>>
>>>>>>>> Applied as c/s 16690, although the checked-in patch is smaller.
>>>>>>>> I think the only important fix is to pt_intr_post() and the
>>>>>>>> only bit of the patch I totally omitted was the change to
>>>>>>>> pt_process_missed_ticks(). I don't think that change can be
>>>>>>>> important, but let's see what happens to the error
>>>>>>>> percentage...
>>>>>>>>
>>>>>>>> -- Keir
>>>>>>>>
>>>>>>>> On 4/1/08 23:24, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>>> Hi Dan and Keir,
>>>>>>>>>
>>>>>>>>> Attached is a patch that fixes some issues with the SYNC
>>>>>>>>> policy (no_missed_ticks_pending). I have not tried to make
>>>>>>>>> the change the minimal one but, rather, just ported into the
>>>>>>>>> new code what I know to work well. The error for
>>>>>>>>> no_missed_ticks_pending goes from over 3% to .03% with this
>>>>>>>>> change, according to my testing.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Dave
>>>>>>>>>
>>>>>>>>> Dan Magenheimer wrote:
>>>>>>>>>
>>>>>>>>>> Hi Dave --
>>>>>>>>>>
>>>>>>>>>> Did you get your correction ported? If so, it would be nice
>>>>>>>>>> to see this get into 3.1.3.
>>>>>>>>>>
>>>>>>>>>> Note that I just did some very limited testing with
>>>>>>>>>> timer_mode=2 (=SYNC=no missed ticks pending) on tip of
>>>>>>>>>> xen-3.1-testing (64-bit Linux hv guest), and the worst error
>>>>>>>>>> I've seen so far is 0.012%. But I haven't tried any exotic
>>>>>>>>>> loads, just LTP.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
>>>>>>>>>>> Sent: Wednesday, December 19, 2007 12:33 PM
>>>>>>>>>>> To: dan.magenheimer@xxxxxxxxxx
>>>>>>>>>>> Cc: Keir Fraser; Shan, Haitao; xen-devel@xxxxxxxxxxxxxxxxxxx;
>>>>>>>>>>> Dong, Eddie; Jiang, Yunhong; Dave Winchell
>>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH] Add a timer mode that
>>>>>>>>>>> disables pending missed ticks
>>>>>>>>>>>
>>>>>>>>>>> Dan,
>>>>>>>>>>>
>>>>>>>>>>> I did some testing with the constant tsc offset SYNC method
>>>>>>>>>>> (now called no_missed_ticks_pending) and found the error to
>>>>>>>>>>> be very high, much larger than 1%, as I recall. I have not
>>>>>>>>>>> had a chance to submit a correction. I will try to do it
>>>>>>>>>>> later this week or the first week in January.
>>>>>>>>>>> My version of the constant tsc offset SYNC method produces
>>>>>>>>>>> .02% error, so I just need to port that into the current
>>>>>>>>>>> code.
>>>>>>>>>>>
>>>>>>>>>>> The error you got for both of those kernels is what I would
>>>>>>>>>>> expect for the default mode, delay_for_missed_ticks.
>>>>>>>>>>>
>>>>>>>>>>> I'll let Keir answer on how to set the time mode.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Dave
>>>>>>>>>>>
>>>>>>>>>>> Dan Magenheimer wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Anyone make measurements on the final patch?
>>>>>>>>>>>>
>>>>>>>>>>>> I just ran a 64-bit RHEL5.1 pvm kernel and saw a loss of
>>>>>>>>>>>> about 0.2% with no load. This was xen-unstable tip today
>>>>>>>>>>>> with no options specified. 32-bit was about 0.01%.
>>>>>>>>>>>>
>>>>>>>>>>>> I think I missed something... how do I run the various
>>>>>>>>>>>> accounting choices, and which ones are known to be
>>>>>>>>>>>> appropriate for which kernels?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Dan
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>>>>>>>>>>>> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf
>>>>>>>>>>>>> Of Keir Fraser
>>>>>>>>>>>>> Sent: Thursday, December 06, 2007 4:57 AM
>>>>>>>>>>>>> To: Dave Winchell
>>>>>>>>>>>>> Cc: Shan, Haitao; xen-devel@xxxxxxxxxxxxxxxxxxx; Dong,
>>>>>>>>>>>>> Eddie; Jiang, Yunhong
>>>>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH] Add a timer mode that
>>>>>>>>>>>>> disables pending missed ticks
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please take a look at xen-unstable changeset 16545.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- Keir
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 26/11/07 20:57, "Dave Winchell"
>>>>>>>>>>>>> <dwinchell@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Keir,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The accuracy data I've collected for i/o loads for the
>>>>>>>>>>>>>> various time protocols follows. In addition, the data
>>>>>>>>>>>>>> for cpu loads is shown.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The loads labeled cpu and i/o-8 are on an 8 processor
>>>>>>>>>>>>>> AMD box. Two guests, red hat and sles 64 bit, 8 vcpu
>>>>>>>>>>>>>> each. The cpu load is usex -e36 on each guest.
>>>>>>>>>>>>>> (usex is available at http://people.redhat.com/anderson/usex.)
>>>>>>>>>>>>>> The i/o load is 8 instances of dd if=/dev/hda6 of=/dev/null.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The loads labeled i/o-32 are 32 instances of dd.
>>>>>>>>>>>>>> Also, these are run on a 4 cpu AMD box.
>>>>>>>>>>>>>> In addition, there is an idle rh-32bit guest.
>>>>>>>>>>>>>> All three guests are 8vcpu.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The loads labeled i/o-4/32 are the same as i/o-32
>>>>>>>>>>>>>> except that the redhat-64 guest has 4 instances of dd.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Date   Duration        Protocol  sles, rhat error      sles, rhat error  load
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 11/07  23 hrs 40 min   ASYNC  -4.96 sec, +4.42 sec  -.006%, +.005%  cpu
>>>>>>>>>>>>>> 11/09  3 hrs 19 min    ASYNC  -.13 sec,  +1.44 sec  -.001%, +.012%  cpu
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 11/08  2 hrs 21 min    SYNC   -.80 sec,  -.34 sec   -.009%, -.004%  cpu
>>>>>>>>>>>>>> 11/08  1 hr 25 min     SYNC   -.24 sec,  -.26 sec   -.005%, -.005%  cpu
>>>>>>>>>>>>>> 11/12  65 hrs 40 min   SYNC   -18 sec,   -8 sec     -.008%, -.003%  cpu
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 11/08  28 min          MIXED  -.75 sec,  -.67 sec   -.045%, -.040%  cpu
>>>>>>>>>>>>>> 11/08  15 hrs 39 min   MIXED  -19. sec,  -17.4 sec  -.034%, -.031%  cpu
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 11/14  17 hrs 17 min   ASYNC  -6.1 sec,  -55.7 sec  -.01%,  -.09%   i/o-8
>>>>>>>>>>>>>> 11/15  2 hrs 44 min    ASYNC  -1.47 sec, -14.0 sec  -.015%, -.14%   i/o-8
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 11/13  15 hrs 38 min   SYNC   -9.7 sec,  -12.3 sec  -.017%, -.022%  i/o-8
>>>>>>>>>>>>>> 11/14  48 min          SYNC   -.46 sec,  -.48 sec   -.017%, -.018%  i/o-8
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 11/14  4 hrs 2 min     MIXED  -2.9 sec,  -4.15 sec  -.020%, -.029%  i/o-8
>>>>>>>>>>>>>> 11/20  16 hrs 2 min    MIXED  -13.4 sec, -18.1 sec  -.023%, -.031%  i/o-8
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 11/21  28 min          MIXED  -2.01 sec, -.67 sec   -.12%,  -.04%   i/o-32
>>>>>>>>>>>>>> 11/21  2 hrs 25 min    SYNC   -.96 sec,  -.43 sec   -.011%, -.005%  i/o-32
>>>>>>>>>>>>>> 11/21  40 min          ASYNC  -2.43 sec, -2.77 sec  -.10%,  -.11%   i/o-32
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 11/26  113 hrs 46 min  MIXED  -297. sec, 13. sec    -.07%,  .003%   i/o-4/32
>>>>>>>>>>>>>> 11/26  4 hrs 50 min    SYNC   -3.21 sec, 1.44 sec   -.017%, .01%    i/o-4/32
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Overhead measurements:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Progress in terms of number of passes through a fixed
>>>>>>>>>>>>>> system workload on an 8 vcpu red hat with an 8 vcpu sles
>>>>>>>>>>>>>> idle. The workload was usex -b48.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ASYNC  167 min   145 passes  .868 passes/min
>>>>>>>>>>>>>> SYNC   167 min   144 passes  .862 passes/min
>>>>>>>>>>>>>> SYNC   1065 min  919 passes  .863 passes/min
>>>>>>>>>>>>>> MIXED  221 min   196 passes  .887 passes/min
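For anyone re-deriving the table above, the error percentages are just the
seconds of skew divided by the test duration. A quick check of two of the
sles entries:

    # error% = skew in seconds / duration in seconds * 100
    def err_pct(skew_sec, hours, minutes):
        return 100.0 * skew_sec / (hours * 3600 + minutes * 60)

    print(err_pct(-18.0, 65, 40))  # 11/12 SYNC cpu row:   -0.0076, i.e. -.008%
    print(err_pct(-9.7, 15, 38))   # 11/13 SYNC i/o-8 row: -0.0172, i.e. -.017%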
>>>>>>>>>>>>>> Conclusions:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The only protocol which meets the .05% accuracy
>>>>>>>>>>>>>> requirement for ntp tracking under the loads above is
>>>>>>>>>>>>>> the SYNC protocol. The worst case accuracies for SYNC,
>>>>>>>>>>>>>> MIXED, and ASYNC are .022%, .12%, and .14%,
>>>>>>>>>>>>>> respectively.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We could reduce the cost of the SYNC method by only
>>>>>>>>>>>>>> scheduling the extra wakeups if a certain number of
>>>>>>>>>>>>>> ticks are missed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Dave
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Keir Fraser wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 9/11/07 19:22, "Dave Winchell"
>>>>>>>>>>>>>>> <dwinchell@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Since I had a high error (~.03%) for the ASYNC method
>>>>>>>>>>>>>>>> a couple of days ago, I ran another ASYNC test. I
>>>>>>>>>>>>>>>> think there may have been something wrong with the
>>>>>>>>>>>>>>>> code I used a couple of days ago for ASYNC. It may
>>>>>>>>>>>>>>>> have been missing the immediate delivery of the
>>>>>>>>>>>>>>>> interrupt after context switch in.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My results indicate that either SYNC or ASYNC give
>>>>>>>>>>>>>>>> acceptable accuracy, each running consistently around
>>>>>>>>>>>>>>>> or under .01%. MIXED has a fairly high error of
>>>>>>>>>>>>>>>> greater than .03%. Probably too close to the .05% ntp
>>>>>>>>>>>>>>>> threshold for comfort.
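A note on why .05% keeps coming up (this gloss is not from the thread, but
it matches standard ntpd behavior): .05% is 500 ppm, ntpd's maximum slew
rate, so a clock drifting faster than that is beyond what the daemon can
discipline. In absolute terms:

    drift = 0.05 / 100.0   # the .05% accuracy requirement
    print(drift * 1e6)     # 500.0 ppm, ntpd's maximum correctable frequency error
    print(drift * 86400)   # 43.2 seconds gained or lost per day at that rate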
>>>>>>>>>>>>>>>> I don't have an overnight run with SYNC. I plan to
>>>>>>>>>>>>>>>> leave SYNC running over the weekend. If you'd rather,
>>>>>>>>>>>>>>>> I can leave MIXED running instead. It may be too early
>>>>>>>>>>>>>>>> to pick the protocol, and I can run more overnight
>>>>>>>>>>>>>>>> tests next week.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm a bit worried about any unwanted side effects of
>>>>>>>>>>>>>>> the SYNC+run_timer approach -- e.g., whether timer
>>>>>>>>>>>>>>> wakeups will cause higher system-wide CPU contention.
>>>>>>>>>>>>>>> I find it easier to think through the implications of
>>>>>>>>>>>>>>> ASYNC. I'm surprised that MIXED loses time, and is
>>>>>>>>>>>>>>> less accurate than ASYNC. Perhaps it delivers more
>>>>>>>>>>>>>>> timer interrupts than the other approaches, and each
>>>>>>>>>>>>>>> interrupt event causes a small accumulated error?
>>>>>>>>>>>>>>> Overall I would consider MIXED and ASYNC as favourites,
>>>>>>>>>>>>>>> and if the latter is actually more accurate then I can
>>>>>>>>>>>>>>> simply revert the changeset that implemented MIXED.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Perhaps rather than running more of the same workloads
>>>>>>>>>>>>>>> you could try idle VCPUs and I/O bound VCPUs (e.g.,
>>>>>>>>>>>>>>> repeated large disc reads to /dev/null)? We don't have
>>>>>>>>>>>>>>> any data on workloads that aren't CPU bound, so that's
>>>>>>>>>>>>>>> really an obvious place to put any further effort imo.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -- Keir
>>>>>>>>>
>>>>>>>>> diff -r cfdbdca5b831 xen/arch/x86/hvm/vpt.c
>>>>>>>>> --- a/xen/arch/x86/hvm/vpt.c Thu Dec 06 15:36:07 2007 +0000
>>>>>>>>> +++ b/xen/arch/x86/hvm/vpt.c Fri Jan 04 17:58:16 2008 -0500
>>>>>>>>> @@ -58,7 +58,7 @@ static void pt_process_missed_ticks(stru
>>>>>>>>>
>>>>>>>>>      missed_ticks = missed_ticks / (s_time_t) pt->period + 1;
>>>>>>>>>      if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) )
>>>>>>>>> -        pt->do_not_freeze = !pt->pending_intr_nr;
>>>>>>>>> +        pt->do_not_freeze = 1;
>>>>>>>>>      else
>>>>>>>>>          pt->pending_intr_nr += missed_ticks;
>>>>>>>>>      pt->scheduled += missed_ticks * pt->period;
>>>>>>>>> @@ -127,7 +127,12 @@ static void pt_timer_fn(void *data)
>>>>>>>>>
>>>>>>>>>      pt_lock(pt);
>>>>>>>>>
>>>>>>>>> -    pt->pending_intr_nr++;
>>>>>>>>> +    if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) ) {
>>>>>>>>> +        pt->pending_intr_nr = 1;
>>>>>>>>> +        pt->do_not_freeze = 0;
>>>>>>>>> +    }
>>>>>>>>> +    else
>>>>>>>>> +        pt->pending_intr_nr++;
>>>>>>>>>
>>>>>>>>>      if ( !pt->one_shot )
>>>>>>>>>      {
>>>>>>>>> @@ -221,8 +226,6 @@ void pt_intr_post(struct vcpu *v, struct
>>>>>>>>>          return;
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -    pt->do_not_freeze = 0;
>>>>>>>>> -
>>>>>>>>>      if ( pt->one_shot )
>>>>>>>>>      {
>>>>>>>>>          pt->enabled = 0;
>>>>>>>>> @@ -235,6 +238,10 @@ void pt_intr_post(struct vcpu *v, struct
>>>>>>>>>          pt->last_plt_gtime = hvm_get_guest_time(v);
>>>>>>>>>          pt->pending_intr_nr = 0; /* 'collapse' all missed ticks */
>>>>>>>>>      }
>>>>>>>>> +    else if ( mode_is(v->domain, no_missed_ticks_pending) ) {
>>>>>>>>> +        pt->pending_intr_nr--;
>>>>>>>>> +        pt->last_plt_gtime = hvm_get_guest_time(v);
>>>>>>>>> +    }
>>>>>>>>>      else
>>>>>>>>>      {
>>>>>>>>>          pt->last_plt_gtime += pt->period_cycles;
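To summarize what the patch above does: in no_missed_ticks_pending mode,
ticks that pile up while a vcpu is off the cpu are never queued
(pt_timer_fn pins pending_intr_nr at 1), and guest time is resynchronized
from hvm_get_guest_time() at each delivered interrupt (pt_intr_post); in
the default delay_for_missed_ticks mode, every missed tick stays pending
and guest time advances one period per delivered interrupt. The toy model
below (a sketch with made-up load numbers, not hypervisor code) shows why
the mode has to match the guest kernel: a guest that keeps time purely by
counting tick interrupts loses a tick for every one the hypervisor drops.

    import random

    HZ = 100                  # illustrative guest tick rate
    SIM_TICKS = 3600 * HZ     # simulate one hour of host time

    def simulate(policy, seed=0):
        # Toy model: the guest keeps time by counting timer interrupts.
        # Occasionally the vcpu is descheduled for several periods (think
        # heavy LTP load); 'policy' decides what happens to missed ticks.
        rng = random.Random(seed)
        guest_ticks = host_ticks = 0
        while host_ticks < SIM_TICKS:
            if rng.random() < 0.001:          # an off-cpu stretch begins
                missed = rng.randint(1, 10)   # miss this many tick periods
                host_ticks += missed
                if policy == "delay_for_missed_ticks":
                    guest_ticks += missed     # all missed ticks stay pending
                else:                         # no_missed_ticks_pending
                    guest_ticks += 1          # the backlog collapses to one tick
            else:
                host_ticks += 1
                guest_ticks += 1
        return 100.0 * (guest_ticks - host_ticks) / SIM_TICKS

    for policy in ("delay_for_missed_ticks", "no_missed_ticks_pending"):
        print("%25s: %+.3f%% for a tick-counting guest"
              % (policy, simulate(policy)))

A tick-counting kernel therefore needs the pending ticks, while a kernel
that resynchronizes from a clocksource on each interrupt is exactly the
case no_missed_ticks_pending serves; this is consistent with Dan's runs
setting timer_mode=2 for the 64-bit guests and 0 for the 32-bit ones.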
Attachment: hvm-compare.png
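Finally, circling back to Dave's loopstats suggestion at the top of the
thread: the files ntpd writes under /var/lib/ntp/ can be scanned for the
intervals where the daemon was straining, and those timestamps lined up
against the skew graphs. A sketch, using the field layout Dave described
(the second-to-last column is the Allan deviation):

    import glob

    THRESHOLD = 1000.0   # Dave's "ntpd is working pretty hard" level

    for path in sorted(glob.glob("/var/lib/ntp/loopstats*")):
        for line in open(path):
            fields = line.split()
            # loopstats fields: MJD, seconds past midnight, offset,
            # frequency (ppm), jitter, Allan deviation, poll interval
            if len(fields) >= 7 and float(fields[-2]) > THRESHOLD:
                print("%s: Allan deviation %s at MJD %s, %ss"
                      % (path, fields[-2], fields[0], fields[1]))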