
Re: [Xen-devel] time accounting problem in pvops kernel



 On 08/17/2010 10:29 AM, Paolo Bonzini wrote:
> Hi,
>
> while experimenting a bit with time.c we found a bug in time
> accounting.  Basically, /proc/stat counts idle time twice for PV guests
> running a pvops kernel.

What version?  Upstream and stable kernels contain the changeset "xen:
drop xen_sched_clock in favour of using plain wallclock time" which
should fix a lot of timekeeping/scheduling problems.

Thanks,
    J

>
> To reproduce, try this command in an unloaded guest:
>
> grep cpu0 /proc/stat; sleep 20 ; grep cpu0 /proc/stat
>
> and watch the fourth number in /proc/stat (idle) increase by approximately
> 4000 for a kernel with USER_HZ == 100, twice the 2000 ticks that 20 seconds
> should produce. If you instead try these commands (you need an otherwise
> unloaded machine for these):
>
> grep cpu0 /proc/stat; timeout 20s yes > /dev/null ; grep cpu0 /proc/stat
> grep cpu0 /proc/stat; timeout 20s dd if=/dev/urandom > /dev/null ; grep cpu0 /proc/stat
>
> the first and third numbers in /proc/stat (user and system) increase by only 2000.
>
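
For anyone who wants to compare the two grep outputs without doing the subtraction by hand, here is a minimal sketch in Python. It assumes the usual /proc/stat column order (user, nice, system, idle, iowait, irq, softirq, steal); the helper names and the sample lines in the usage note are hypothetical and only mirror the numbers reported above.

```python
# Column names for the first eight counters on a "cpuN" line of /proc/stat.
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]

def parse_cpu_line(line):
    """Turn one 'cpuN ...' line from /proc/stat into a dict of tick counters."""
    parts = line.split()
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))

def tick_delta(before, after):
    """Per-counter difference between two samples of the same cpu line."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    return {name: a[name] - b[name] for name in a if name in b}

def expected_ticks(seconds, user_hz=100):
    """Ticks one CPU should accumulate over the interval at the given USER_HZ."""
    return seconds * user_hz
```

With hypothetical samples "cpu0 100 0 50 1000 0 0 0 0" and "cpu0 100 0 50 5000 0 0 0 0" taken 20 seconds apart, tick_delta reports an idle delta of 4000 while expected_ticks(20) is 2000, which is the doubling described in the report.
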
> The reason for this seems to be that in xen_timer_interrupt Linux's
> normal timer accounting is called (via evt->event_handler) and this
> calls account_idle_time. However, idle ticks are also added from
> do_stolen_accounting, so that overall they're counted twice.
>
> Related to this, it looks like stolen tick accounting is subtly
> wrong. Even if only part of a tick is stolen by the hypervisor, Linux's
> time accounting will add a whole tick to the user/system/idle time. In
> a dynticks kernel (or maybe even if the scheduling quanta have some
> kind of resonance with the guest's timer interrupt?) the sum of the
> four components user+sys+idle+steal will then be larger than the wall
> time. In fact, I found experimentally that steal time is usually 20%
> off from wall-user-sys-idle when the machine is under moderate load
> (e.g. 5 domains at 100% CPU usage, on a 4-CPU machine). Of course I used
> the correct, divided-by-2 idle time to do this computation.
>
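
The comparison described above can be written out as a small calculation. This is only a sketch of the arithmetic, not of the kernel code: the function names are hypothetical, idle is halved to undo the double counting, and the figures in the usage note are invented to illustrate a 20% discrepancy rather than measured.

```python
def implied_steal(wall_ticks, user, system, idle_reported):
    """Steal time implied by wall - user - sys - idle, in ticks.

    idle_reported is the raw /proc/stat value; it is divided by 2 here
    on the assumption that idle ticks were counted twice.
    """
    idle = idle_reported / 2
    return wall_ticks - user - system - idle

def relative_error(reported_steal, implied):
    """How far the reported steal counter is from the implied value."""
    return (reported_steal - implied) / implied
```

For example, with 2000 wall ticks, 500 user, 100 system and a reported idle of 1600, the implied steal is 600 ticks; a reported steal of 720 would then be 20% too high.
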
> Paolo
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>

