
Re: [Xen-users] 99% iowait on one core in 8 core processor



Hello,

On 07/13/12 10:14, Matthias wrote:
> Some updates:
>
> - Deactivating CONFIG_HOTPLUG_CPU and suspend to RAM in the kernel didn't
> change anything.
> - I tried to change the smp_affinity to just a single other cpu and
> this worked. So basically I was wrong that there is something changing
> the settings back constantly but linux is simply discarding every
> assignment to more than one cpu (1,2,4, etc works, 3,5,etc is
> discarded)
>
> so the problem is that my linux can not assign irqs to multiple cpus.
> I think this is also why irqbalance does not make a difference because
> it tries the same.
>
> I will check if I can reproduce this behaviour with a stock kernel
> without xen or if this is really xen related in the evening. But if
> you have any other idea what could cause this, I'm open for
> suggestions
>
> @Rajesh can you check with your setup if you have the same case or if
> this is a different problem? simply do a 'cat /proc/interrupts' to
> check the cpu affinity and if everything is done on cpu0 try a 'echo
> "3" > /proc/irq/<some irq number from the other command>/smp_affinity'
> and afterwards check if the smp_affinity now really has '3' as
> content.

I have some experience with IRQ affinity problems, but I have never seen
this behaviour.

What I have seen:
- one old server that really does balance a single IRQ across all CPUs.
(No idea how, I am just happy that it does. I could not reproduce it on
the other servers I tried, and no irqbalance is involved.)
- all my other servers (VM or not) do NOT balance an IRQ across multiple
CPUs, even when configured to.

I can set smp_affinity on a single IRQ to a mask that *should* send it
to multiple CPUs, but it just goes to one of them. On some servers it
was the first CPU in the mask, on others the last. The main difference
at the time was Intel vs. AMD CPUs; I am not sure that still holds for
more recent servers.
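
For reference, the values discussed in this thread (1, 2, 4 for a single
CPU; 3, 5 for several) are hexadecimal bitmasks with one bit per CPU. A
minimal Python sketch of the conversion follows. The helper names are
made up for illustration, and nothing here touches /proc.

```python
def cpus_to_mask(cpus):
    """Turn CPU indexes, e.g. [0, 1], into an integer bitmask (0b11 == 3)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def mask_to_cpus(mask):
    """List the CPU indexes whose bit is set in the mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def format_mask(mask):
    """Format as hex, in comma-separated 32-bit groups for large masks."""
    digits = format(mask, "x")
    groups = []
    while digits:
        groups.insert(0, digits[-8:])
        digits = digits[:-8]
    return ",".join(groups)


def parse_mask(text):
    """Parse the content of an smp_affinity file back into an integer."""
    return int(text.strip().replace(",", ""), 16)
```

With this, cpus_to_mask([0, 1]) is the "3" from the quoted test, and
mask_to_cpus(parse_mask(...)) shows which CPUs a mask read back from
smp_affinity actually names.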

From what irqbalance seems to do on our servers, it just checks CPU load
and IRQ load, and explicitly re-balances them every so often. The idea
is nice, and it would probably help on a server whose behaviour changes
often, but it is not that great on servers that always have the same
IRQ activity.

For our servers with high IRQ activity, we wrote a script that explicitly
sets each IRQ's smp_affinity to a static layout we chose: each heavy
interrupt gets a CPU to itself, and the many light IRQs stay on the
default.
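
This is not our actual script, just a rough Python sketch of the same
idea. The IRQ numbers and CPUs in STATIC_PLAN are invented; take the
real ones from /proc/interrupts. It assumes fewer than 32 CPUs and must
run as root. After each write it reads the file back, since the write
may not be kept (as in the quoted test above).

```python
import os

# Hypothetical plan: IRQ number -> the single CPU that should service it.
STATIC_PLAN = {24: 1, 25: 2, 26: 3}


def apply_plan(plan, proc_irq="/proc/irq"):
    """Write a one-CPU mask per IRQ; return {irq: True if the mask was kept}."""
    results = {}
    for irq, cpu in plan.items():
        path = os.path.join(proc_irq, str(irq), "smp_affinity")
        wanted = 1 << cpu  # one bit per CPU, e.g. CPU 2 -> 0x4
        try:
            with open(path, "w") as f:
                f.write(format(wanted, "x"))
            # Read back: the value may have been rejected or changed.
            with open(path) as f:
                actual = int(f.read().strip().replace(",", ""), 16)
        except OSError:
            # IRQ does not exist, or the kernel refused the write.
            results[irq] = False
            continue
        results[irq] = actual == wanted
    return results
```

Calling apply_plan(STATIC_PLAN) at boot, with irqbalance stopped so it
does not rewrite the masks, would give the static layout described above.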

My suggestions would be:
- try to observe CPU usage for each kind of IRQ on your server
- it is probably a good idea to pin your domUs away from the CPUs that
dom0 uses heavily (and maybe pin dom0 to just a few CPUs, where you
would never schedule a domU?)
- balance your IRQs statically
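
For the first point, a starting place is the per-CPU counters in
/proc/interrupts. The Python sketch below (function names are mine)
parses that table and ranks the numbered IRQs by total count. Counts are
not CPU time, so treat the ranking only as a hint about which IRQs are
worth a closer look.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq: ([count per CPU], description)}."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        # Keep only numbered IRQs; skip NMI, LOC, ERR and similar rows.
        if not sep or not label.isdigit():
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[int(label)] = (counts, " ".join(fields[len(counts):]))
    return table


def busiest(table, top=5):
    """Return (irq, total count, description) for the most active IRQs."""
    ranked = sorted(table.items(), key=lambda kv: sum(kv[1][0]), reverse=True)
    return [(irq, sum(counts), name) for irq, (counts, name) in ranked[:top]]
```

Usage would be something like
busiest(parse_interrupts(open("/proc/interrupts").read())). The counters
are cumulative since boot, so to see the current rate, take two samples
a few seconds apart and subtract them.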

Now, if you do get massive CPU usage from just ONE IRQ, you have a
problem: you need to clearly identify what is generating that load, and
find a way to spread the same work over more distinct IRQs.

We actually had that kind of problem with our gateway: it was the rx/tx
of our network cards. We changed servers and got network cards that
provide more IRQs per interface.


Regards,
-- 
Adrien Urban

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users


 

