Re: [Xen-devel] [PATCH] Add a timer mode that disables pending missed ticks
Keir,

I ran a 24-hour (23hr:40min) test. The usual setup. Protocol was ASYNC. Errors:

    sles9sp3-64: -4.96 sec (-.0058%)
    rh4u4-64:    +4.42 sec (+.0052%)

So, let's leave it ASYNC unless someone produces some test cases where the error gets up close to .05%. I'll do some testing here with overnight runs or, perhaps, different loads.

thanks,
Dave

Dave Winchell wrote:

Hi Keir,

I've added comments below. See my next mail on some interesting performance numbers.

thanks,
Dave

On 7/11/07 19:38, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:

> My feeling is that we should go full SYNC. Yes, in theory the guests
> should be able to handle ASYNC, but in reality it appears that some do
> not. Since it is easy for us to give them SYNC, let's just do it and
> not stress them out.

Keir Fraser wrote:

One problem with pure SYNC is there's a fair chance you won't deliver any ticks at all for a long time, if the guest only runs in short bursts (e.g., I/O bound) and happens not to be running on any tick boundary. I'm not sure how much that matters. It could cause time to go backwards if the time extrapolation via the TSC is not perfectly accurate, or cause problems if there are any assumptions that the TSC delta since the last tick fits in 32 bits (less likely in x64 code, I suppose). Anyway, my point is that only testing VCPUs under full load may cause us to optimise in ways that have nasty unexpected effects for other workloads.

I agree that this could be a problem. I have an idea that could give us full SYNC and eliminate the long periods without clock interrupts. In pt_process_missed_ticks(), when missed_ticks > 0, set pt->run_timer = 1. In pt_save_timer():

    list_for_each_entry ( pt, head, list )
        if ( !pt->run_timer )
            stop_timer(&pt->timer);

And in pt_timer_fn(): pt->run_timer = 0. So, for a guest that misses a tick, we will interrupt him once from the descheduled state and then leave him alone in the descheduled state.
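[Editor's illustration: a minimal, compilable mock of the idea sketched above. The function names and the stop-all-timers-except-those-owed-a-tick logic come from Keir's sketch; the struct layout, the 'next' pointer (standing in for Xen's list_head iteration), and the stub stop_timer() are assumptions made so the example is self-contained, not Xen's actual vpt implementation.]

    /*
     * Standalone mock of the proposed run_timer flag.  Only the function
     * names and the control flow come from the sketch above; all types
     * here are simplified stand-ins for Xen's real vpt.c structures.
     */
    #include <stdio.h>

    struct timer { int armed; };

    static void stop_timer(struct timer *t) { t->armed = 0; }

    struct periodic_time {
        struct timer timer;
        int run_timer;               /* proposed flag */
        struct periodic_time *next;  /* stand-in for Xen's list_head */
    };

    /* On deschedule: a guest that missed ticks keeps its timer armed. */
    static void pt_process_missed_ticks(struct periodic_time *pt,
                                        int missed_ticks)
    {
        if ( missed_ticks > 0 )
            pt->run_timer = 1;
    }

    /* On deschedule: stop every timer not owed a missed tick. */
    static void pt_save_timer(struct periodic_time *head)
    {
        struct periodic_time *pt;

        for ( pt = head; pt != NULL; pt = pt->next )
            if ( !pt->run_timer )
                stop_timer(&pt->timer);
    }

    /* Timer callback: deliver one tick, then stay quiet until rescheduled. */
    static void pt_timer_fn(struct periodic_time *pt)
    {
        pt->run_timer = 0;
    }

    int main(void)
    {
        struct periodic_time pt = { .timer = { .armed = 1 } };

        pt_process_missed_ticks(&pt, 2);  /* guest missed two ticks */
        pt_save_timer(&pt);               /* timer survives deschedule */
        printf("armed after save: %d\n", pt.timer.armed);   /* 1 */

        pt_timer_fn(&pt);                 /* the one allowed interrupt */
        pt_save_timer(&pt);               /* now the timer is stopped */
        printf("armed after tick: %d\n", pt.timer.armed);   /* 0 */
        return 0;
    }

The two prints show the intended behaviour: the missed tick keeps the timer armed across one deschedule, the callback clears the flag, and the next save stops it, so a descheduled guest is interrupted exactly once.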
Dave Winchell wrote:

For the default mode, as checked into unstable now, 64-bit guests should run quite fast, as the missed-tick count is calculated and then a bunch of additional interrupts are delivered. On the other hand, 32-bit guests run very well in default mode. For the original code, before we put in the constant tsc offset business, 64-bit guests ran poorly and 32-bit guests very well, time-wise.

Keir Fraser wrote:

The default mode hasn't changed. Are you under the impression that missed-ticks-but-no-delay-of-tsc is the default mode now? I know x64 guests run badly with that because they treat every one of the missed ticks they receive as a full tick.

Dave Winchell wrote:

Sorry, I was confused. However, the default mode will still run poorly for 64-bit guests because of the pending_nr's accumulated while the guest has interrupts disabled. As I recall, the effect is quite large, on the order of 10% error. I'll get you a number later today.

Keir Fraser wrote:

Or is the lack of synchronisation of TSCs across VCPUs causing issues that you're trying to avoid?

Dave Winchell wrote:

This does cause issues, but it's not the only contributor to poor timing. Having TSCs synchronized across vcpus will help some of the time-going-backwards problems we have seen, I think.

Regards,
Dave

Keir Fraser wrote:

On 7/11/07 17:29, "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx> wrote:

> So, you can see we send an interrupt immediately (and ASYNC) if any
> ticks have been missed, but then successive ticks are delivered 'on
> the beat'. A possible middleground? Or perhaps we should just go with
> SYNC after all...

How do these Linux x64 guests fare with the original and default timer mode, by the way? I would expect that time should be accounted pretty accurately in that mode, albeit with more interrupts than you'd like. Or is the lack of synchronisation of TSCs across VCPUs causing issues that you're trying to avoid?

-- Keir
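[Editor's note on the arithmetic behind the figures at the top of the thread: the error percentages are simply drift divided by elapsed test time. A minimal check, taking the 23hr:40min duration from Dave's message:]

    /* Sanity check of the reported error percentages: drift in seconds
     * over a 23hr:40min run, expressed as a percentage of elapsed time. */
    #include <stdio.h>

    int main(void)
    {
        const double elapsed = 23 * 3600 + 40 * 60;              /* 85200 s */
        printf("sles9sp3-64: %+.4f%%\n", -4.96 / elapsed * 100); /* -0.0058% */
        printf("rh4u4-64:    %+.4f%%\n", +4.42 / elapsed * 100); /* +0.0052% */
        return 0;
    }

Both results sit well below the .05% threshold Dave suggests, consistent with his conclusion to leave the protocol ASYNC.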