
[Xen-users] HVM Live Migrations Failing 90% Of The Time



I'm deploying a two-node Pacemaker/DRBD-backed Xen cluster to run a
mixture of Linux PVM and Windows HVM VMs. I have this up and running on
a pair of development machines, with both automatic and manual failover
working perfectly. Live migrations work every time there, for both the
PVM and the HVM VMs.

I've replicated the setup onto a pair of high-end live machines, but
there the live migrations succeed only around 10% of the time for the
HVM VMs. PVM live migrations complete every time. The configurations on
the development and live machines are identical in every way except for
the physical hardware.

When a migration fails, the source host (the one migrating the VM away)
logs the following:

[2010-04-07 14:42:45 6211] DEBUG (XendCheckpoint:103) [xc_save]:
/usr/lib64/xen/bin/xc_save 30 18 0 0 5
[2010-04-07 14:42:45 6211] INFO (XendCheckpoint:403) xc_save: could not
read suspend event channel
[2010-04-07 14:42:45 6211] WARNING (XendDomainInfo:1617) Domain has
crashed: name=migrating-web id=18.
[2010-04-07 14:42:45 6211] DEBUG (XendDomainInfo:2389)
XendDomainInfo.destroy: domid=18
[2010-04-07 14:42:45 6211] DEBUG (XendDomainInfo:2406)
XendDomainInfo.destroyDomain(18)
[2010-04-07 14:42:48 6211] DEBUG (XendDomainInfo:1939) Destroying device
model
[2010-04-07 14:42:48 6211] INFO (XendCheckpoint:403) Saving memory
pages: iter 1  10%ERROR Internal error: Error peeking shadow bitmap
[2010-04-07 14:42:48 6211] INFO (XendCheckpoint:403) Warning - couldn't
disable shadow modeSave exit rc=1
[2010-04-07 14:42:48 6211] ERROR (XendCheckpoint:157) Save failed on
domain web (18) - resuming.
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/xen/xend/XendCheckpoint.py",
line 125, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib/python2.5/site-packages/xen/xend/XendCheckpoint.py",
line 391, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib64/xen/bin/xc_save 30 18 0 0 5 failed


The following is also logged in /var/log/xen/qemu-dm-web.log:

xenstore_process_logdirty_event: key=000000006b8b4567 size=335816
Log-dirty: mapped segment at 0x7fb56c136000
Triggered log-dirty buffer switch


The destination host (the one being migrated to) logs the following:

[2010-04-07 14:42:45 6227] INFO (XendCheckpoint:403) Reloading memory
pages:   0%
[2010-04-07 14:42:48 6227] INFO (XendCheckpoint:403) ERROR Internal
error: Error when reading batch size
[2010-04-07 14:42:48 6227] INFO (XendCheckpoint:403) Restore exit with rc=1
[2010-04-07 14:42:48 6227] DEBUG (XendDomainInfo:2389)
XendDomainInfo.destroy: domid=26
[2010-04-07 14:42:48 6227] DEBUG (XendDomainInfo:2406)
XendDomainInfo.destroyDomain(26)
[2010-04-07 14:42:48 6227] ERROR (XendDomainInfo:2418)
XendDomainInfo.destroy: xc.domain_destroy failed.
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/xen/xend/XendDomainInfo.py",
line 2413, in destroyDomain
    xc.domain_destroy(self.domid)
Error: (3, 'No such process')
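
To put a firmer number on the failure rate than "around 10%", the xend.log
entries shown above could be tallied with a short script. The following is
only a sketch: it assumes each entry sits on a single line in the real log
file (the excerpts above were wrapped by the mail client), and the function
names parse_entries and count_save_failures are made up for illustration.

```python
import re
from collections import Counter

# One unwrapped xend.log entry looks like:
# [2010-04-07 14:42:48 6211] ERROR (XendCheckpoint:157) Save failed on domain web (18) - resuming.
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<pid>\d+)\] "
    r"(?P<level>[A-Z]+) \((?P<module>\w+):(?P<line>\d+)\) (?P<msg>.*)$"
)


def parse_entries(lines):
    """Yield one dict per log entry.

    Lines that do not start a new entry (for example traceback lines)
    are appended to the message of the entry before them.
    """
    entry = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = LOG_LINE.match(line)
        if match:
            if entry is not None:
                yield entry
            entry = match.groupdict()
        elif entry is not None:
            entry["msg"] += "\n" + line
    if entry is not None:
        yield entry


def count_save_failures(lines):
    """Count 'Save failed on domain <name>' errors per domain name."""
    failures = Counter()
    for entry in parse_entries(lines):
        if entry["level"] != "ERROR":
            continue
        match = re.match(r"Save failed on domain (\S+) \(\d+\)", entry["msg"])
        if match:
            failures[match.group(1)] += 1
    return failures


# Two entries from the source-host log above, unwrapped onto single lines.
SAMPLE = [
    "[2010-04-07 14:42:45 6211] WARNING (XendDomainInfo:1617) Domain has crashed: name=migrating-web id=18.\n",
    "[2010-04-07 14:42:48 6211] ERROR (XendCheckpoint:157) Save failed on domain web (18) - resuming.\n",
    "Traceback (most recent call last):\n",
]

print(dict(count_save_failures(SAMPLE)))
```

Against a real log, the list could be replaced by an open file object, since
parse_entries only needs an iterable of lines.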


Some basic config details:

Xen version:    3.3.0
Kernel:         2.6.24-27-xen
dom0 OS:        Ubuntu 8.04 64-bit
domU OS:        Windows 2008 64-bit


VM config for the above example:

name = "web"
kernel = "/usr/lib/xen/boot/hvmloader"
builder='hvm'
memory = 10240
shadow_memory = 8
vif = [ 'bridge=eth1' ]
acpi = 1
apic = 1
disk = [ 'phy:/dev/drbd0,hda,w', 'phy:/dev/drbd1,hdb,w' ]
device_model = '/usr/lib64/xen/bin/qemu-dm'
boot="dc"
sdl=0
vnc=1
vncconsole=1
vncpasswd='XXXXXXXXXXXX'
serial='pty'
usbdevice='tablet'
vcpus=8
on_poweroff = 'destroy'
on_reboot   = 'restart'
on_crash    = 'destroy'
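
One value in this config that may be worth checking is shadow_memory. The
comment in the sample xmexample.hvm config that ships with Xen suggests, as
far as I recall, "at least 2KB per MB of domain memory, plus a few MB per
vcpu". The sketch below just does that arithmetic for this VM. The figure of
2 MB per vcpu is an assumption standing in for "a few MB", and the function
name is hypothetical. Whether this setting has anything to do with the failed
migrations is not established here.

```python
def min_shadow_memory_mb(memory_mb, vcpus, per_vcpu_mb=2):
    """Rough lower bound for shadow_memory in MB.

    Uses the rule of thumb of 2 KB per MB of guest memory, plus
    per_vcpu_mb for each vcpu (per_vcpu_mb=2 is an assumed value).
    """
    page_table_mb = memory_mb * 2 / 1024.0
    return page_table_mb + vcpus * per_vcpu_mb


configured_mb = 8  # shadow_memory from the config above
suggested_mb = min_shadow_memory_mb(memory_mb=10240, vcpus=8)

print("configured: %d MB, rule-of-thumb minimum: %.0f MB"
      % (configured_mb, suggested_mb))
# prints "configured: 8 MB, rule-of-thumb minimum: 36 MB"
```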


The DRBD resources are handled by Jefferson Ogata's qemu-dm.drbd wrapper
(http://www.antibozo.net/xen/qemu-dm.drbd) and a slightly modified
version of DRBD's block-drbd script.

Each dom0 is allocated 1GB of memory, and the two live machines are
identical to each other in both software and hardware configuration.
Each machine has a total of 24GB of memory.



Thanks


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

