Re: [Xen-devel] CPU Utilization
Dave Thompson (davetho) wrote:

> -----Original Message-----
> From: Andrew Theurer [mailto:habanero@xxxxxxxxxx]
> Sent: Monday, December 12, 2005 9:24 PM
> To: Dave Thompson (davetho)
> Cc: Anthony Liguori; xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] CPU Utilization
>
> > If xend is started, you may have the software bridge running which can use as much as 10% cpu.
>
> But what else is running? In this case I only have dom0 configured, there is no domU. The only other possibility would be the hypervisor and I hope the hypervisor is not accounting for the other 30%.
>
> But I would think that the bridge activity should be showing up in the top CPU summary as well. It is running on domain 0 after all. I know one person suggested that kernel activity is not represented in the top CPU util output. But I don't see how that can be right. If so, where else is that time accounted for? It seems to be all there (in the sy, hi, and si values).
>
> > Also, I don't see soft ints in that top output. That could also be another ~7% cpu.
>
> Soft interrupt time is accounted for in the si field (15%) of the summary. I believe that is where most (if not all) of the TCP processing is performed. Here is the top CPU summary display again:
>
>   Cpu(s): 1.0% us, 7.3% sy, 0.0% ni, 73.3% id, 0.0% wa, 3.3% hi, 15.0% si

Sorry, I overlooked the si. I wonder if this is working under all situations. This problem seems familiar. Before the kernel accounted for si and hi properly, we had a very similar situation with this type of workload: lots of cpu time unaccounted for because the interrupt processing happened mostly when the system was idle, and the timer tick did not account for this properly. I wonder if we have a similar problem in xen/linux. If lost ticks are "queued up" but accounted for in just one type of mode, then I think we could be way off in some situations like this.

> > Also xen is doing some work, receiving the real interrupts and generating virtual interrupts to dom0, so with all this, it is possible that you are using another 30% unseen in top.
>
> But aren't the hypervisor calls actually still being accounted for by the domain, since clock ticks are not lost but made up for in the timer_interrupt() function of arch/xen/i386/kernel/time.c? The only issue is really when a domain is preempted by another domain by the xen scheduler, and this is actually a problem in the other direction. The swapped out domain will still account for the time in whichever time bucket it was using when the domain was preempted (so the same time is accounted for by both domains). Basically the aggregated CPU time for all domains on a CPU could add up to more than 100% because of this. If the domain is re-scheduled because of a SCHEDOP_block in the idle loop, the time will be properly accounted for as idle time.

I guess I was hoping to find a smoking gun in xen :). The only other thing I think we could do is count the number of total samples we got over x seconds and compare this with the number of samples we would get in the same time period on a 100% busy system. We should then be able to figure out what % of the time the cpu was halted.

> However, none of this really matters for my case since I am only running domain 0, there is no guest domain. I just want a good explanation why 'xm top' is reporting 30% more CPU utilization than top in this case.
>
> > Best way to confirm this would be to use xenoprofile.
>
> Xenoprof is great for seeing which kernel functions are taking the majority of time, but does it really help with CPU utilization? It counts (in the default case) unhalted clock cycles, and in the xen idle loop the processor is halted (to save power) so the clock cycles are not accounted for. Is this right or am I missing something?
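
The sample-counting comparison suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the profiler samples at a fixed interval of unhalted cycles, the clock frequency stays constant, and both runs cover the same wall-clock period. The function name and the sample counts are hypothetical and are not measurements from this thread.

```python
def estimate_halted_percent(observed_samples: int, busy_samples: int) -> float:
    """Estimate the percentage of time the CPU was halted.

    observed_samples: profiler samples collected during the measured run.
    busy_samples: samples collected over the same wall-clock period on a
        100% busy system (the calibration run).
    """
    if busy_samples <= 0:
        raise ValueError("busy_samples must be positive")
    if observed_samples < 0:
        raise ValueError("observed_samples must not be negative")
    # Samples are only taken on unhalted cycles, so the ratio approximates
    # the unhalted (busy) fraction. Clamp to 1.0 to absorb sampling jitter.
    busy_fraction = min(observed_samples / busy_samples, 1.0)
    return 100.0 * (1.0 - busy_fraction)


# Hypothetical counts, for illustration only.
halted = estimate_halted_percent(observed_samples=2500, busy_samples=10000)
print(f"estimated halted time: {halted:.1f}%")
```

The resulting busy share (100 minus the halted estimate) could then be compared against both the top summary and the 'xm top' figure to see which one it agrees with.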
-Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel