
Re: [Xen-users] Xen system hang or freeze


  • To: xen-users@xxxxxxxxxxxxxxxxxxx
  • From: thomas morgan <tm@xxxxxxxxxx>
  • Date: Sun, 5 Apr 2009 14:33:36 -0600
  • Delivery-date: Sun, 05 Apr 2009 14:57:16 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

Over the last year, I've experienced a couple of sources of lockups.

The first was resolved by going to the stock xen 2.6.18.8 kernel compiled from source (had been using the Debian etch kernel; found commentary online describing the same symptoms on Ubuntu, Redhat, and CentOS though, each with their distro-specific kernel).

This one tended to result in kernel oops messages (soft IRQ lockups, as I recall). The lockup would start with a domU and within a few minutes would kill the dom0 too. The fastest way to trigger it was to create and shut down domUs, although I don't recall that being the only way.

The second, with the stock kernel, was an errant USB hub attached to a xen host. Removing the hub resolved the issue. These were complete, sudden lockups of the dom0 and all domUs -- basically everything. Higher traffic over the USB port would trigger this lockup.

So, for those who haven't tried the stock xen kernel, and are able to try it (based on driver support, etc.), it might help.

--t


On Apr 5, 2009, at 1:20 PM, Martin Fernau wrote:

I tried this before. I ran your kernel for a few months, but it changed nothing.
I had freezes with this kernel too, in the same way.


On Sunday, 5 April 2009 14:29:20, Andrew Lyon wrote:
On Sat, Apr 4, 2009 at 4:00 PM, Martin Fernau <m.fernau@xxxxxxxxxx> wrote:
Hi,

I just want to tell you that I have the same issue on one server!
Hardware:
Fujitsu Siemens PRIMERGY TX200 S4
CPU: Intel Xeon Dual Quad E5405 2 GHz
Hardware Raid: LSI Logic / Symbios Logic MegaRAID SAS 1078 with SAS HDDs
I'm running 4 guests on it:
- Win2003
- Win2003
- Gentoo Linux
- Windows XP Prof

Xen 3.3.0 is running on Gentoo with a 2.6.18-xen-r12 kernel.

I had the same problem with the 2.6.18-xen-rX Gentoo kernels, so I
made my own ebuild and patches from the openSUSE Xen patches. You can
get them from http://code.google.com/p/gentoo-xen-kernel/downloads/list

Andy

The system hangs roughly every 3-4 weeks, as far as I can tell. This server is quite new (from Nov 2008) and ServerView doesn't report
any hardware problems, so the hardware appears to be OK.
When the server hangs, it does not respond to any kind of input:
the network is down (no ping reply from dom0 or any of the guests),
and the keyboard and monitor attached to the server do not respond
either. Black screen, nothing more. A hard reset is the only way to
bring the system back to life.

/var/log/messages shows nothing. It's like disconnecting the power
cable. I have no idea and no hints about this problem.
At the moment I have a cron job running that collects some system
information from dom0 every minute. I hope that the very last run
(just before the next crash) will show something that points me to
the problem. However, I currently have no clue what kind of information would be helpful for this purpose.
I currently log the following things every minute:
- dmesg
- free
- netstat -lnp
- ps aux
- w
- vgdisplay
- lvdisplay
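
A per-minute collection like the one listed above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the script actually used on this server: the function names, the output file naming, and the 20-second timeout are hypothetical. The one deliberate design choice is the fsync call, which asks the kernel to push each snapshot to disk so that the last run before a hard hang has a better chance of surviving the reset.

```python
import os
import subprocess
import time

# Commands taken from the list above; everything else here is an assumption.
COMMANDS = [
    ["dmesg"],
    ["free"],
    ["netstat", "-lnp"],
    ["ps", "aux"],
    ["w"],
    ["vgdisplay"],
    ["lvdisplay"],
]


def run_command(argv, timeout=20):
    """Return the combined stdout/stderr of one command, or a failure note."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
        )
        return result.stdout.decode("utf-8", errors="replace")
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A missing or stuck command must not abort the whole snapshot.
        return "[failed: %s]\n" % exc


def write_snapshot(log_dir, commands=COMMANDS):
    """Write one timestamped snapshot file and force it to disk."""
    os.makedirs(log_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    path = os.path.join(log_dir, "snapshot-%s.log" % stamp)
    with open(path, "w", encoding="utf-8") as fh:
        for argv in commands:
            fh.write("===== %s =====\n" % " ".join(argv))
            fh.write(run_command(argv))
            fh.write("\n")
        # Flush Python's buffer, then ask the kernel to write the file out,
        # so the data is not left only in the page cache when the box hangs.
        fh.flush()
        os.fsync(fh.fileno())
    return path
```

A small script that calls write_snapshot("/var/log/hang-snapshots") (a hypothetical directory) could then be run from cron every minute. Old snapshot files would need to be pruned separately, since this sketch never deletes anything.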

Any hints about other information that could be helpful? Any Xen-related
commands?

Interesting that you use LVM too. I also use LVM for my guests and use the snapshot functionality daily to back up the server to
tape. dom0 is running on a normal partition. I use LVM 2.02.36.

Regards,
Martin

On Friday, 3 April 2009 16:56:28, Paraic Gallagher wrote:
Hi all,

This is my first post to the list, I hope someone out there can help!

I am running xen 3.0.3, with CentOS 5.2 based Dom0
(kernel-xen-2.6.18-92.1.22.el5)

Recently I have noticed some complete system lockups on a few different servers. Neither Dom0 nor any of the guests responds to pings, and connecting a keyboard and monitor to the system only shows a blank screen. Nothing
is written to the logs at the time of the lockup.

The problem is very difficult to reproduce and seems very random by
nature. Sometimes it happens after a system has been left running for a few weeks; other times it happens right after a reboot. I have tried taxing
the system by running various scripts, rebooting numerous times, and
creating/destroying a few guests, etc., but with no luck. It seems like a
hardware issue, but it has been reproduced on a few different machines.

For a while (clutching at straws) I thought it was due to changes in the
clock (from daylight savings) so tried changing time backwards and
forwards but this had no effect.

Has anyone else out there seen a problem like this? Is there any way to
diagnose it when it does happen? (It is very frustrating to have a
hung system that you cannot access for any information.)

If anyone wants any further info or ideas on what I could try please let
me know.

Regards,
Paraic.

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users




 

