
Re: [Xen-devel] CPU Usage Discrepancies


  • To: "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>
  • From: "Pradeep Vincent" <pradeep.vincent@xxxxxxxxx>
  • Date: Tue, 6 Mar 2007 17:11:23 -0800
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 06 Mar 2007 17:10:27 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

With a trivial workload like "ls -R /" I see as much as a 30% difference,
and with other workloads xm reports twice what /proc/stat reports. That
seems too high to me.
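
For reference, the comparison can be sketched roughly as follows. This is
only an illustration, not the exact procedure used for the numbers above:
the helper names are made up, the choice of which /proc/stat columns count
as "busy" is an assumption, and USER_HZ is assumed to be 100.

```python
# Hypothetical helpers for comparing guest-side and Xen-side CPU usage.
USER_HZ = 100  # assumed tick rate of the /proc/stat counters


def busy_seconds(cpu_line):
    """Return busy CPU seconds from the aggregate 'cpu' line of /proc/stat.

    Assumption: busy = user + nice + system + irq + softirq.
    Idle, iowait and steal are deliberately left out.
    """
    fields = cpu_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Pad so that kernels reporting fewer columns still parse.
    values += [0] * (7 - len(values))
    user, nice, system, _idle, _iowait, irq, softirq = values[:7]
    return (user + nice + system + irq + softirq) / USER_HZ


def relative_diff(xen_seconds, guest_seconds):
    """Fraction by which the Xen-side figure exceeds the guest-side figure."""
    if guest_seconds <= 0:
        raise ValueError("guest_seconds must be positive")
    return (xen_seconds - guest_seconds) / guest_seconds
```

Both inputs should be deltas over the same wall-clock interval: the change
in xm's cpu_time for the domain, and the change in busy_seconds() computed
from two /proc/stat samples inside the guest.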

Linux counts all the nanoseconds that the hypervisor does not report as
"stolen" or "blocked" as its own usage. This should include all the
time spent in the hypervisor in the context of a particular Vcpu, because
the hypervisor counts nanoseconds as "stolen" or "blocked" only after the
Vcpu's state changes (from running to something else). So most of the
hypervisor's CPU usage should be accounted for the same way by xm and
by /proc/stat on guests, since both rely on the same "stolen" and
"blocked" nanosecond counters maintained by the hypervisor.

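A toy model of this accounting, as I understand it, is sketched below. It
is a simplification under stated assumptions: the state labels ("running",
"runnable", "blocked") and the function names are illustrative, "runnable"
time stands in for "stolen", and real kernels work from per-Vcpu counters
rather than a list of segments.

```python
def split_runstate_ns(segments):
    """Sum nanoseconds per Vcpu state from (state, ns) segments."""
    totals = {"running": 0, "runnable": 0, "blocked": 0}
    for state, ns in segments:
        totals[state] += ns
    return totals


def guest_usage_ns(elapsed_ns, stolen_ns, blocked_ns):
    """Guest view: whatever is not stolen or blocked counts as used."""
    return elapsed_ns - stolen_ns - blocked_ns


# Hypothetical timeline for one Vcpu. Time spent inside the hypervisor
# before the state change is still part of a "running" segment.
segments = [
    ("running", 400),
    ("blocked", 300),
    ("runnable", 100),
    ("running", 200),
]
totals = split_runstate_ns(segments)
elapsed = sum(ns for _, ns in segments)
usage = guest_usage_ns(elapsed, totals["runnable"], totals["blocked"])
```

In this model the guest-side usage equals the hypervisor's "running" total
by construction, which is why a 30% or 2x gap looks wrong to me.
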
As you said, context switch overhead isn't accounted for accurately,
but the hypervisor's CPU usage accounting suffers from the same problem
to the same extent. Even if that isn't the case, context switch
CPU usage can't account for this big a difference.

- Pradeep Vincent

On 3/4/07, Keir Fraser <Keir.Fraser@xxxxxxxxxxxx> wrote:
Does your /proc/stat analysis include time spent in the kernel?

Another possibility here is that, if your guest blocks a lot, you will see
that Linux counts the guest as 'running' for less of the context-switch path
than Xen does. This will cause Linux's estimate of time used to be less than
Xen's. There's not much to be done about that: in general Xen has more
knowledge of what is actually going on, including precisely when a switch of
control happens, and the numbers from xentop will be more accurate than
numbers generated by the guest itself (particularly with frequently-blocking
workloads). Although it depends on what you're interested in measuring -- if
you care about the amount of time spent doing useful application work (as
opposed to context switching) then you might be more interested in the Linux
stats because Xen will include more time spent in the Linux and Xen context
switch paths.

 -- Keir

On 2/3/07 23:42, "Pradeep Vincent" <pradeep.vincent@xxxxxxxxx> wrote:

> I see serious discrepancies between Cpu usage as reported by /proc/stat on
> Xen3 virts and Cpu usage as reported by the hypervisor via "xm" tool
> (cpu_time). The problem exists on Intel and AMD platforms - 1 Vcpu and
> multiple Vcpu slots - 1 Physical CPU and multiple Physical CPU hosts.
>
> The skew is pronounced with workloads that "sleep-wake-sleep-wake" at
> a high frequency while workloads that hog the CPU don't exhibit this
> problem as much.
>
> Anybody seen this ? Any insights ?
>
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=882 has all the
> details.
>
> - Pradeep Vincent
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel



