
[Xen-devel] cpuidle causing Dom0 soft lockups



On large systems where Dom0 boots with (significantly) more than
32 vCPU-s, we have received multiple reports that C-state management,
now enabled by default, is causing soft lockups, usually preventing
the boot from completing.

The observations are:

Sufficiently reducing the number of vCPU-s (or pCPU-s) makes
the systems work.

max_cstate=0 makes the systems work.

max_cstate=1 makes the problem less severe on one (bigger) system,
and eliminates it completely on another (smaller) one.

When appearing to hang, all vCPU-s are in Dom0's timer_interrupt(),
and all (sometimes all but one) are attempting to acquire xtime_lock.
However, due to our use of ticket locks we can verify that this is not
a deadlock (repeatedly sending '0' shows forward progress, as the
tickets [visible on the stack] continue to increase). Additionally, there
is always one vCPU that has its polling event channel (used for
waking the next waiting vCPU when a lock becomes available)
signaled.
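
For readers unfamiliar with ticket locks, the reasoning above can be
illustrated with a minimal sketch. This is not the Dom0 kernel's actual
implementation: the struct, the function names and the use of C11
atomics are all assumptions made for illustration. It only shows why
ticket values that keep increasing between two register dumps imply
that the lock is still being handed from one waiter to the next.

```c
#include <stdatomic.h>

/* Hypothetical ticket lock, for illustration only. */
struct ticket_lock {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket currently being served */
};

static inline void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

/* Take a ticket and wait until it is served; returns the ticket taken. */
static inline unsigned ticket_lock_acquire(struct ticket_lock *l)
{
    unsigned my = atomic_fetch_add_explicit(&l->next, 1,
                                            memory_order_relaxed);

    /*
     * Plain spinning here. A paravirtualized kernel would rather block
     * the vCPU (e.g. by polling an event channel) until it is woken.
     */
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != my)
        ;
    return my;
}

/* Hand the lock to the holder of the next ticket. */
static inline void ticket_lock_release(struct ticket_lock *l)
{
    atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}

/*
 * Compare two ticket samples taken at different times. A positive
 * (wrap-safe) difference means the lock changed hands in between,
 * i.e. the system is slow but not deadlocked.
 */
static inline int ticket_lock_progressed(unsigned earlier, unsigned later)
{
    return (int)(later - earlier) > 0;
}
```

With such a lock, a waiter's ticket is an ordinary value held in a
register or on the stack, which is why it can be read off successive
dumps and compared.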

In one case (but not in the other) it is always the same vCPU that
is apparently taking very long to wake up from the polling request.
This may be a coincidence, but the output after sending 'c' also indicates
a significantly higher (about 3 times) usage value for C2 than the
second highest one; the duration printed is roughly the same for
all CPUs.

While I don't know this code well, it would seem that we're suffering
from extremely long wakeup times. This suggests that there likely is
a (performance) problem even for smaller numbers of vCPU-s.
Hence, unless it can be fixed before 4.0 is released, I would suggest
disabling C-state management by default again.

I can provide full logs if needed.

Jan

