
[Xen-users] Problems with cloned OpenSuse 10.3 guest

  • To: xen-users@xxxxxxxxxxxxxxxxxxx
  • From: Glen <gb2@xxxxxxx>
  • Date: Sat, 20 Nov 2010 16:08:29 -0800
  • Delivery-date: Sat, 20 Nov 2010 16:09:57 -0800
  • List-id: Xen user discussion <xen-users.lists.xensource.com>


I'm experiencing some really strange behavior with an OpenSuse 10.3 guest
running in Xen.  Every 48-72 hours, the machine starts running at a very
high load average and dumping tons of messages into the message log, and
finally becomes completely inaccessible.  When the guest becomes unusable,
the host "xm top" display shows 399% CPU utilization and constant NET
and VBD activity, but the host cannot even "shutdown" the guest - I have
to destroy it to make it stop.

The host machine is a Dell Poweredge 2950 III server, running OpenSuse 11.1,
64-bit, kernel, and Xen package xen-3.3.1_18546_24-0.4.13.
It has 20GB of RAM, a quad-core 2GHz Intel CPU, and a Dell Perc5 RAID.  It
runs other guest machines with no problem.

The guest machine is running OpenSuse 10.3, kernel, in
32-bit mode, with Xen package xen-3.1.0_15042-51.3.

The guest machine is a clone of a running physical machine that I'm trying to
virtualize.  I created the drive, attached it, and so forth on the Xen host,
then rsynced the 10.3 physical machine's filesystems onto the 11.1 host.  I
removed and reinstalled the Xen kernel package as suggested on the net, and,
against even my own predictions, got the guest to boot.  And it works
great... for a few days or so.

But, then, what happens is that the guest starts to go crazy.  I see rapidly
repeating messages like this start to appear in the syslog /var/log/messages:

Nov 20 15:35:55 guestc kernel: b_state=0x00000029, b_size=4096
Nov 20 15:35:55 guestc kernel: device blocksize: 4096
Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=210137505, 

Occasionally these messages show up garbled, like this:

Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. 
tate=0x00000029, b_size=4096
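
For anyone who wants to quantify the flood, here is a minimal Python sketch
that counts the "__find_get_block_slow() failed" lines per minute and tallies
the block numbers they mention.  The regular expression is an assumption
based only on the sample lines quoted above, and the function name
summarize() is made up for illustration; garbled lines that lost their
prefix are simply skipped.

```python
import re
from collections import Counter

# Layout assumed from the sample syslog lines quoted above:
#   "Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=N, "
FAILED_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+kernel:\s+"
    r"__find_get_block_slow\(\) failed\.(?:\s+block=(?P<block>\d+))?"
)


def summarize(lines):
    """Return (failures per minute, occurrences per block number)."""
    per_minute = Counter()
    blocks = Counter()
    for line in lines:
        match = FAILED_RE.match(line)
        if match is None:
            continue  # unrelated or garbled line
        per_minute[match.group("minute")] += 1
        if match.group("block") is not None:
            blocks[int(match.group("block"))] += 1
    return per_minute, blocks


if __name__ == "__main__":
    sample = [
        "Nov 20 15:35:55 guestc kernel: b_state=0x00000029, b_size=4096",
        "Nov 20 15:35:55 guestc kernel: device blocksize: 4096",
        "Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=210137505, ",
    ]
    minutes, blocks = summarize(sample)
    print(dict(minutes), dict(blocks))
```

Feeding it the lines of /var/log/messages instead of the sample list would
show whether the failures cluster on a few blocks or spread across the disk.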

And then, of course, I can't even get in to the guest at all, via network
or xm console.  xm shutdown does nothing, and I must xm destroy the guest.

After re-creating the guest, everything runs fine again, until another few
days have passed.

Today I was actually in the guest when this happened.  An rsync was running,
and that process was pegged: the guest reported a load average of 5.0, and
"xm top" on the host showed a usage of 199% (2 of the 4 CPUs?).
I couldn't kill the rsync process, and the messages above were flooding into
the syslog.  The guest could not shut all the way down even with "init 0",
and, eventually, I had to destroy it again.

Here is the machine config:

extra=" "
disk=[ 'file:/a/disks/guestc/disk0,xvda,w', 'phy:sdc1,sdc1,w', ]
vif=[ 'mac=00:16:3e:52:f9:96,bridge=br0', ]
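
To make the disk line easier to read, here is a small Python sketch that
splits each entry into its parts, assuming the usual
"backend:source,guest-device,mode" layout of xm disk entries.  The names
DiskSpec and parse_disk are made up for illustration and are not part of
any Xen tool.

```python
from typing import NamedTuple


class DiskSpec(NamedTuple):
    backend: str    # "file" = image file on the host, "phy" = host block device
    source: str     # path or device name on the host
    guest_dev: str  # device name the guest sees
    mode: str       # "w" = read-write, "r" = read-only


def parse_disk(spec: str) -> DiskSpec:
    """Split one 'backend:source,guest_dev,mode' entry into its fields."""
    target, guest_dev, mode = spec.split(",")
    backend, source = target.split(":", 1)
    return DiskSpec(backend, source, guest_dev, mode)


if __name__ == "__main__":
    # The two entries from the config above.
    for entry in ["file:/a/disks/guestc/disk0,xvda,w", "phy:sdc1,sdc1,w"]:
        print(parse_disk(entry))
```

Read this way, the guest gets a file-backed image as xvda and the host's
sdc1 partition passed through as sdc1, both writable.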

Now, I get that I'm doing some unorthodox things here.  Cloning a physical
machine into a virtual machine.  Running 10.3 as a guest under an 11.1 host.
A 32-bit guest on a 64-bit host.  But the thing DOES run, and I feel like
I'm SO CLOSE to making this work, so I'm really hopeful that someone can
recognize these symptoms and help me find a solution, rather than just
pointing out the obvious edge-case aspects of this situation.

Any ideas or guidance would be greatly appreciated!

Thank you!
Glen Barney
