
[Xen-users] Random I/O deadlocks in multiple clouds



Greetings,

I have two clouds, one running XCP 0.5 and the other running XCP 1.0.

For the past few weeks I have been having problems on both of them:

At seemingly random moments, one or more of the domUs write this to
the console:

INFO: task [random program] blocked for more than 120 seconds.

The log, meanwhile, fills up with stack traces:

http://pastebin.com/ziyyWEXP

From then on, the VM becomes extremely laggy, if not entirely unreachable.

The trace suggests it's an I/O problem, but the crashes don't seem to
follow a pattern: they happen under high as well as low I/O traffic,
high/low CPU load, and high/low memory usage.
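
To check whether the hangs cluster around particular tasks, a minimal
sketch along these lines could tally the warnings from a saved domU
kernel log. It assumes the usual hung-task message format, in which the
task name is followed by a colon and the PID. The function names are
purely illustrative and not part of any Xen or XCP tool.

```python
import re
from collections import Counter

# Kernel hung-task warning, assumed to look like:
#   INFO: task kjournald:1234 blocked for more than 120 seconds.
# The task name may itself contain colons (e.g. "kworker/0:1"), so the
# PID is taken to be the last colon-separated number before " blocked".
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<task>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return one (task, pid, seconds) tuple per hung-task warning."""
    events = []
    for line in lines:
        match = HUNG_TASK_RE.search(line)
        if match:
            events.append(
                (match.group("task"),
                 int(match.group("pid")),
                 int(match.group("secs")))
            )
    return events


def count_by_task(events):
    """Count how many warnings each task name produced."""
    return Counter(task for task, _pid, _secs in events)
```

Feeding it the lines of a domU's kernel log (wherever that is stored on
the guest) and printing count_by_task(...).most_common() would show
whether the same few tasks are always the ones that block.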

The same thing happens on all dom0s of both my clouds.

The domUs are all running pvops-enabled kernels (2.6.32+), in a mix of
vanilla+grsec, Debian stock and Debian backports (lenny/squeeze).

I'm monitoring the dom0s, but nothing specific seems to happen during
the domU crashes: nothing in xe host-dmesg, nothing in the graphs.

At this point I'm quite lost; I have no idea how to debug the issue
further.

Does anyone have any suggestions?

Thank you in advance

Sincerely,

--
Alessandro





 

