Re: [Xen-devel] [PATCH] Add a timer mode that disables pending missed ticks
Keir,

I ran a 24-hour (23hr:40min) test. The usual setup. Protocol was ASYNC. Errors:

    sles9sp3-64: -4.96 sec (-.0058%)
    rh4u4-64:    +4.42 sec (+.0052%)

So, let's leave it ASYNC unless someone produces some test cases where the error gets up close to .05%. I'll do some testing here with overnight runs or, perhaps, different loads.

thanks,
Dave

Dave Winchell wrote:

Hi Keir,

I've added comments below. See my next mail on some interesting performance numbers.

thanks,
Dave

On 7/11/07 19:38, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:

> My feeling is that we should go full SYNC. Yes, in theory the guests
> should be able to handle ASYNC, but in reality it appears that some do
> not. Since it is easy for us to give them SYNC, let's just do it and
> not stress them out.

Keir Fraser wrote:

One problem with pure SYNC is there's a fair chance you won't deliver any ticks at all for a long time, if the guest only runs in short bursts (e.g., I/O bound) and happens not to be running on any tick boundary. I'm not sure how much that matters. It could cause time to go backwards if the time extrapolation via the TSC is not perfectly accurate, or cause problems if there are any assumptions that the TSC delta since the last tick fits in 32 bits (less likely in x64 code, I suppose). Anyway, my point is that only testing VCPUs under full load may cause us to optimise in ways that have nasty unexpected effects for other workloads.

I agree that this could be a problem. I have an idea that could give us full SYNC and eliminate the long periods without clock interrupts. In pt_process_missed_ticks(), when missed_ticks > 0, set pt->run_timer = 1. In pt_save_timer():

    list_for_each_entry ( pt, head, list )
        if ( !pt->run_timer )
            stop_timer(&pt->timer);

And in pt_timer_fn(): pt->run_timer = 0. So, for a guest that misses a tick, we will interrupt him once from the descheduled state and then leave him alone in the descheduled state.
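[Editor's illustration: a minimal, compilable mock of the idea sketched above. The function names and the stop-all-timers-except-those-owed-a-tick logic come from Keir's sketch; the struct layout, the 'next' pointer (standing in for Xen's list_head iteration), and the stub stop_timer() are assumptions made so the example is self-contained, not Xen's actual vpt implementation.]

    /*
     * Standalone mock of the proposed run_timer flag.  Only the function
     * names and the control flow come from the sketch above; all types
     * here are simplified stand-ins for Xen's real vpt.c structures.
     */
    #include <stdio.h>

    struct timer { int armed; };

    static void stop_timer(struct timer *t) { t->armed = 0; }

    struct periodic_time {
        struct timer timer;
        int run_timer;               /* proposed flag */
        struct periodic_time *next;  /* stand-in for Xen's list_head */
    };

    /* On deschedule: a guest that missed ticks keeps its timer armed. */
    static void pt_process_missed_ticks(struct periodic_time *pt,
                                        int missed_ticks)
    {
        if ( missed_ticks > 0 )
            pt->run_timer = 1;
    }

    /* On deschedule: stop every timer not owed a missed tick. */
    static void pt_save_timer(struct periodic_time *head)
    {
        struct periodic_time *pt;

        for ( pt = head; pt != NULL; pt = pt->next )
            if ( !pt->run_timer )
                stop_timer(&pt->timer);
    }

    /* Timer callback: deliver one tick, then stay quiet until rescheduled. */
    static void pt_timer_fn(struct periodic_time *pt)
    {
        pt->run_timer = 0;
    }

    int main(void)
    {
        struct periodic_time pt = { .timer = { .armed = 1 } };

        pt_process_missed_ticks(&pt, 2);  /* guest missed two ticks */
        pt_save_timer(&pt);               /* timer survives deschedule */
        printf("armed after save: %d\n", pt.timer.armed);   /* 1 */

        pt_timer_fn(&pt);                 /* the one allowed interrupt */
        pt_save_timer(&pt);               /* now the timer is stopped */
        printf("armed after tick: %d\n", pt.timer.armed);   /* 0 */
        return 0;
    }

The two prints show the intended behaviour: the missed tick keeps the timer armed across one deschedule, the callback clears the flag, and the next save stops it, so a descheduled guest is interrupted exactly once.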
Dave Winchell wrote:

For the default mode, as checked into unstable now, 64-bit guests should run quite fast, as the missed-tick count is calculated and then a bunch of additional interrupts are delivered. On the other hand, 32-bit guests run very well in default mode. For the original code, before we put in the constant tsc offset business, 64-bit guests ran poorly and 32-bit guests very well, time-wise.

Keir Fraser wrote:

The default mode hasn't changed. Are you under the impression that missed-ticks-but-no-delay-of-tsc is the default mode now? I know x64 guests run badly with that because they treat every one of the missed ticks they receive as a full tick.

Dave Winchell wrote:

Sorry, I was confused. However, the default mode will still run poorly for 64-bit guests because of the pending_nr's accumulated while the guest has interrupts disabled. As I recall, the effect is quite large, on the order of 10% error. I'll get you a number later today.

Keir Fraser wrote:

Or is the lack of synchronisation of TSCs across VCPUs causing issues that you're trying to avoid?

Dave Winchell wrote:

This does cause issues, but it's not the only contributor to poor timing. Having TSCs synchronized across vcpus will help some of the time-going-backwards problems we have seen, I think.

Regards,
Dave

Keir Fraser wrote:

On 7/11/07 17:29, "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx> wrote:

> So, you can see we send an interrupt immediately (and ASYNC) if any
> ticks have been missed, but then successive ticks are delivered 'on
> the beat'. A possible middleground? Or perhaps we should just go with
> SYNC after all...

How do these Linux x64 guests fare with the original and default timer mode, by the way? I would expect that time should be accounted pretty accurately in that mode, albeit with more interrupts than you'd like. Or is the lack of synchronisation of TSCs across VCPUs causing issues that you're trying to avoid?

-- Keir
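[Editor's note on the arithmetic behind the figures at the top of the thread: the error percentages are simply drift divided by elapsed test time. A minimal check, taking the 23hr:40min duration from Dave's message:]

    /* Sanity check of the reported error percentages: drift in seconds
     * over a 23hr:40min run, expressed as a percentage of elapsed time. */
    #include <stdio.h>

    int main(void)
    {
        const double elapsed = 23 * 3600 + 40 * 60;              /* 85200 s */
        printf("sles9sp3-64: %+.4f%%\n", -4.96 / elapsed * 100); /* -0.0058% */
        printf("rh4u4-64:    %+.4f%%\n", +4.42 / elapsed * 100); /* +0.0052% */
        return 0;
    }

Both results sit well below the .05% threshold Dave suggests, consistent with his conclusion to leave the protocol ASYNC.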