
RE: Vanilla Xen total CPU %



Hi George,

 

Thank you very much for taking the time to respond to me.

 

What I have been trying to do (and I think it’s the same for the other abandoned projects I came across) is come up with an equivalent to something like the Hyper-V % Total Runtime counter, which gives a useful, if probably not very precise, account of the total CPU ‘load’ of a hypervisor.

 

This is the sort of metric which can be useful for spotting overall trends, or when the sum of all virtual machine CPU usage passes an alerting trigger. I appreciate that at this point it would probably be necessary to look at other metrics to determine what was actually happening.

 

What I was trying to do was stream some of the xentop counters into a time series database (InfluxDB) so I could graph this. Other people have attempted the same. For example, there are projects doing this with Graphite, and some people were using the old xm Python bindings to do essentially what you describe in your mail and return their own usage % for graphing.

 

Being able to do this with Go in 4.13 is very interesting, and I did not know such bindings existed. I have some experience with the language so will look at this now. I am, however, saddled with some older versions of Xen, and had got as far as building a parser and a.) pushing the CPU seconds value for each domU directly into a database and treating it as if it were a network interface counter, and b.) taking samples over a time interval, performing the calculation just as you describe, and inserting the results directly into a database with a timestamp.
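As an illustration of approach b.), here is a minimal sketch of turning two samples of the cumulative CPU(sec) counter into a per-domain rate. The function name, the dictionary layout and the rule for counters that go backwards are assumptions made for this example; they are not taken from xentop or from any existing project.

```python
def per_domain_cpu_rate(prev, curr, interval_seconds):
    """Return CPU seconds consumed per wall-clock second for each domain.

    prev and curr map a domain name to its cumulative CPU(sec) reading.
    A result of 1.5 means the domain kept 1.5 CPUs busy on average.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    rates = {}
    for name, now in curr.items():
        before = prev.get(name)
        if before is None or now < before:
            # New domain, or the counter went backwards (assumed here to
            # mean the domain was restarted): no reliable delta, so skip it.
            continue
        rates[name] = (now - before) / interval_seconds
    return rates


if __name__ == "__main__":
    # Illustrative readings only, taken 100 seconds apart.
    earlier = {"web": 100.0, "db": 50.0}
    later = {"web": 250.0, "db": 75.0, "new-guest": 5.0}
    print(per_domain_cpu_rate(earlier, later, 100))
```

Skipping a domain whose counter has reset loses one sample for that domain, which is usually preferable to graphing a large negative spike.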

 

One thing I was unsure on - and I think this is my ignorance on how such things are calculated - is how the total CPU capacity of the hypervisor impacts this. For instance, if I have a hyperthreaded hypervisor with 20 real cores and a 100 second interval, I guess as an oversimplification there are 4000 CPU seconds (20 cores x 2 threads x 100 seconds) available for execution in that interval? That is when I started to get confused about how to determine a total % from the info in xentop.
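To make the arithmetic concrete, here is a small sketch of normalising consumed CPU seconds against the capacity of the host. It assumes that every hardware thread counts as one full CPU, which is the same oversimplification as above; the function name and the consumed figure in the example are made up for illustration.

```python
def host_cpu_percent(consumed_cpu_seconds, logical_cpus, interval_seconds):
    """Express consumed CPU seconds as a percentage of host capacity.

    Capacity is taken to be logical_cpus * interval_seconds, i.e. each
    hardware thread is assumed to offer one CPU second per second.
    """
    if logical_cpus <= 0 or interval_seconds <= 0:
        raise ValueError("logical_cpus and interval_seconds must be positive")
    capacity = logical_cpus * interval_seconds
    return 100.0 * consumed_cpu_seconds / capacity


if __name__ == "__main__":
    logical_cpus = 20 * 2  # 20 real cores, 2 threads per core
    # Capacity is 40 * 100 = 4000 CPU seconds; suppose all domains
    # together consumed 1000 CPU seconds in that interval.
    print(host_cpu_percent(1000, logical_cpus, 100))
```

On this scale 100% means every hardware thread was busy for the whole interval, so the figure stays comparable between hosts of different sizes.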

 

Many thanks.

 

 

From: Xen-users <xen-users-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of George Dunlap
Sent: 08 June 2020 11:59
To: Nick Calvert <nick.calvert@xxxxxxxxx>
Cc: George Dunlap <george.dunlap@xxxxxxxxxx>; xen-users <xen-users@xxxxxxxxxxxxxxxxxxxx>
Subject: Re: Vanilla Xen total CPU %

 

 

 

On Thu, Jun 4, 2020 at 8:34 PM Nick Calvert <nick.calvert@xxxxxxxxx> wrote:

Hi everyone,

I am interested in calculating the approximate total CPU runtime % of
a vanilla Xen project host (without any of the bells and whistles of
XCP-ng or Xen Server). What I have at my disposal is Ubuntu, Xen and
the xl tool stack.

 

Just to be clear: you mean you want to add up the time all VMs are running?  (i.e., if you have one VM at 150%, another at 25%, and another at 25%, the total would be 200%?)

 

I have been experimenting with writing a parser for xentop output in
batch mode, this is a fairly easy task and I can see other attempts at
parsers across dead and dying github projects... my issue around this
is the precise meaning of the 'CPU(sec)' metric given by xentop and how
I could use it to infer a total cpu time.

 

If you don't mind me asking, what are you (and the projects you mention) using this information for?

 

Xen is an open-source project, so rather than having dozens of projects trying to work around the fact that the core tools don't tell them what they want to know, it seems like it would be better to either modify xentop to tell you what you want to know, or add a new tool to do the same thing.

 


The docs for xentop say "CPU(sec) CPU time which the guest OS has
consumed(cumulated)".

My confusion is around how CPU 'seconds' actually relate to vCPUs, real
cores etc in this context. I can also see a couple of attempts at
figuring out total CPU %, but none look quite right.

If I were able to derive both the CPU seconds for each domU in an
interval, the aggregate CPU seconds in this interval and both total
vCPUs and physical cores what would be the correct formula for
approximating a total CPU runtime %?

Also if I am missing a trick and there is an easier way of calculating
this I would be extremely happy to hear it, as simple is nice :)

 

I think if I were writing a program, I'd probably use libxl to get the raw data, rather than trying to parse xentop.  libxl_list_domain() will return a list of libxl_dominfo, which has a field "cpu_time", which is (I believe) the number of nanoseconds of cpu time that domain has consumed ever in its lifetime.

 

So what you'd do is take a timestamp (t1), call libxl_list_domain(), and go through the resulting list, adding up `cpu_time` (c1).  Then at some point later, take a timestamp (t2) and do another sum (c2).  Your total host utilization between t1 and t2 would then be (c2 - c1) / (t2 - t1), with both differences expressed in the same unit.  A result of 2.0 would correspond to the 200% in the example above.
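The calculation above can be sketched as follows. This example does not call libxl at all: the per-domain `cpu_time` values are passed in as plain lists, and it assumes they are in nanoseconds, as believed above. The function and parameter names are hypothetical.

```python
NS_PER_SECOND = 1_000_000_000


def total_utilization(cpu_times_t1_ns, cpu_times_t2_ns, t1_seconds, t2_seconds):
    """Return (c2 - c1) / (t2 - t1) in units of CPUs.

    cpu_times_t1_ns and cpu_times_t2_ns hold the per-domain cumulative
    cpu_time readings (assumed nanoseconds) taken at t1 and t2.
    """
    elapsed_ns = (t2_seconds - t1_seconds) * NS_PER_SECOND
    if elapsed_ns <= 0:
        raise ValueError("t2 must be later than t1")
    c1 = sum(cpu_times_t1_ns)
    c2 = sum(cpu_times_t2_ns)
    return (c2 - c1) / elapsed_ns


if __name__ == "__main__":
    # Illustrative numbers: three domains sampled 10 seconds apart,
    # running at 150%, 25% and 25% of one CPU respectively.
    at_t1 = [0, 0, 0]
    at_t2 = [15 * NS_PER_SECOND, 2_500_000_000, 2_500_000_000]
    print(total_utilization(at_t1, at_t2, 0, 10))
```

Multiplying the result by 100 gives the summed percentage; dividing it by the number of CPUs on the host gives the fraction of total capacity in use.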

 

If you're using at least 4.13, you could use the golang bindings instead, if you didn't want to use C.

 

 -George

 

 

 

 


 

