
Re: [Xen-users] Live migration on xen 4.0.1 fails


  • To: dokter@xxxxxxxxxxxxx
  • From: Todd Deshane <todd.deshane@xxxxxxx>
  • Date: Wed, 13 Apr 2011 05:13:39 +0000
  • Cc: xen-users@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 12 Apr 2011 22:15:19 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

On Tue, Apr 12, 2011 at 10:39 AM, Mark Dokter <dokter@xxxxxxxxxxxxx> wrote:
> Dear xen-users!
>
> I have been running Xen 4.0.1 with 2.6.32.x pvops dom0 kernels from the
> git stable branch for several months now.
> Every now and then I give the migration feature a try, but so far I have
> never managed to migrate a machine. I have not had time to look into the
> errors closely, but since I have not read about migration being
> impossible on this list, I must obviously be doing something wrong :(
>
> My setup:
> - 2 identical servers running Ubuntu 10.04 server with a self-packaged
> kernel and Xen (inspired by [1])
> - the storage for the domU is a DRBD volume on top of LVM2
> - the servers are connected via a dedicated GBit link
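>
> One DRBD detail that commonly bites live migration (a sketch only,
> assuming DRBD 8.x and that drbd9 is the resource backing the guest on
> both hosts): during the handover both nodes briefly need the volume in
> Primary at the same time, which requires dual-primary mode in the
> resource's net section, roughly:

```
resource r9 {            # hypothetical resource name for /dev/drbd9
  net {
    allow-two-primaries; # both hosts may promote while the guest moves
  }
}
```

> Without this, the destination host cannot open the disk read-write
> while the source still holds it, and the restored guest can hang.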
>
> The domU config:
> name    = "test1";
> memory  = 1024;
> vcpus   = 1;
> builder = 'hvm'
> kernel = "/usr/lib/xen/boot/hvmloader"
> boot = "c"
> pae = 1
> acpi = 1
> apic = 1
> localtime = 1
> on_poweroff = "destroy"
> on_reboot = "restart"
> on_crash = "destroy"
> device_model = "/usr/lib/xen/bin/qemu-dm"
> vnc = 1
> vfb = [ "type=vnc,vncunused=1,vnclisten=0.0.0.0,keymap=en-us" ]
> disk = [ 'phy:/dev/drbd9,xvda,w']
> vif = [ "bridge=br0, mac=00:16:3e:00:01:0d" ];
> serial = "pty"
>
> I start the VM on server xen02 and run:
> xm migrate --live test1 xen01
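>
> (For context: xm live migration requires the relocation server to be
> enabled on the receiving host in /etc/xen/xend-config.sxp. The values
> below are illustrative defaults, not the actual config from this setup;
> the hostname pattern in particular is a made-up example:)

```
(xend-relocation-server yes)
(xend-relocation-port 8002)
(xend-relocation-hosts-allow '^localhost$ ^xen02$')
```

> Since xen01's log below shows xc_restore starting, the relocation
> channel itself appears to be working here; the settings are listed only
> to rule out the usual first suspect.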
>
> What I get in the xen02 xend.log:
>
> [2011-04-11 12:58:04 2612] DEBUG (XendCheckpoint:124) [xc_save]:
> /usr/lib/xen/bin/xc_save 29 7 0 0 5
> [2011-04-11 12:58:04 2612] INFO (XendCheckpoint:423) xc_save: failed to
> get the suspend evtchn port
> [2011-04-11 12:58:04 2612] INFO (XendCheckpoint:423)
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) Saving memory
> pages: iter 1  0% ... 95%
> 1: sent 265216, skipped 1355, delta 13794ms, dom0 0%, target 0%, sent
> 630Mb/s, dirtied 3Mb/s 1431 pages
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) Saving memory
> pages: iter 2  0% ... 23%
> 2: sent 1416, skipped 15, delta 40ms, dom0 0%, target 0%, sent
> 1159Mb/s, dirtied 123Mb/s 151 pages
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) Saving memory
> pages: iter 3  0%
> 3: sent 151, skipped 0, delta 12ms, dom0 0%, target 0%, sent 412Mb/s,
> dirtied 49Mb/s 18 pages
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) Saving memory
> pages: iter 4  0%
> 4: sent 18, skipped 0, Start last iteration
> [2011-04-11 12:58:18 2612] DEBUG (XendCheckpoint:394) suspend
> [2011-04-11 12:58:18 2612] DEBUG (XendCheckpoint:127) In
> saveInputHandler suspend
> [2011-04-11 12:58:18 2612] DEBUG (XendCheckpoint:129) Suspending 7 ...
> [2011-04-11 12:58:18 2612] DEBUG (XendDomainInfo:519)
> XendDomainInfo.shutdown(suspend)
> [2011-04-11 12:58:18 2612] DEBUG (XendDomainInfo:1891)
> XendDomainInfo.handleShutdownWatch
> [2011-04-11 12:58:18 2612] DEBUG (XendDomainInfo:1891)
> XendDomainInfo.handleShutdownWatch
> [2011-04-11 12:58:18 2612] INFO (XendDomainInfo:2088) Domain has
> shutdown: name=migrating-test1 id=7 reason=suspend.
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:135) Domain 7 suspended.
> [2011-04-11 12:58:18 2612] INFO (image:538) signalDeviceModel:restore dm
> state to running
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) SUSPEND shinfo 00001d11
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) delta 260ms, dom0
> 3%, target 1%, sent 2Mb/s, dirtied 43Mb/s 343 pages
> [2011-04-11 12:58:18 2612] DEBUG (XendCheckpoint:144) Written done
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) Saving memory
> pages: iter 5  0%
> 5: sent 343, skipped 0, delta 10ms, dom0 0%, target 0%, sent 1123Mb/s,
> dirtied 1123Mb/s 343 pages
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) Total pages sent=
> 267144 (0.25x)
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) (of which 0 were
> fixups)
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) All memory is saved
> [2011-04-11 12:58:18 2612] INFO (XendCheckpoint:423) Save exit rc=0
> [2011-04-11 12:58:18 2612] DEBUG (XendDomainInfo:3053)
> XendDomainInfo.destroy: domid=7
> [2011-04-11 12:58:18 2612] DEBUG (XendDomainInfo:2411) Destroying device
> model
> [2011-04-11 12:58:18 2612] INFO (image:615) migrating-test1 device model
> terminated
> [2011-04-11 12:58:18 2612] DEBUG (XendDomainInfo:2418) Releasing devices
> [2011-04-11 12:58:18 2612] DEBUG (XendDomainInfo:2424) Removing vif/0
>
>
>
> The migration fails somewhere halfway through: after the procedure the
> receiving host xen01 has a domU test1, but this domU doesn't respond.
> That's the log output of xen01:
>
>
> [2011-04-11 12:58:05 2550] INFO (image:822) Need to create platform
> device.[domid:28]
> [2011-04-11 12:58:05 2550] DEBUG (XendCheckpoint:286)
> restore:shadow=0x9, _static_max=0x40000000, _static_min=0x0,
> [2011-04-11 12:58:05 2550] DEBUG (XendCheckpoint:305) [xc_restore]:
> /usr/lib/xen/bin/xc_restore 27 28 2 3 1 1 1 0
> [2011-04-11 12:58:05 2550] INFO (XendCheckpoint:423) xc_domain_restore
> start: p2m_size = 100000
> [2011-04-11 12:58:05 2550] INFO (XendCheckpoint:423) Reloading memory
> pages:   0%
> [2011-04-11 12:58:18 2550] INFO (XendCheckpoint:423) Read 5792 bytes of
> QEMU data
> [2011-04-11 12:58:18 2550] INFO (XendCheckpoint:423) Read 888 bytes of
> QEMU data
> [2011-04-11 12:58:18 2550] INFO (XendCheckpoint:423) ERROR Internal
> error: Error when reading batch size
> [2011-04-11 12:58:18 2550] INFO (XendCheckpoint:423) ERROR Internal
> error: error when buffering batch, finishing
> [2011-04-11 12:58:18 2550] INFO (XendCheckpoint:423)
> [2011-04-11 12:58:18 2550] INFO (XendCheckpoint:423) Writing 6680 bytes
> of QEMU data
> [2011-04-11 12:58:18 2550] INFO (XendCheckpoint:423) Restore exit with rc=0
> [2011-04-11 12:58:18 2550] DEBUG (XendCheckpoint:394) store-mfn 1044476
> [2011-04-11 12:58:18 2550] DEBUG (XendDomainInfo:2992)
> XendDomainInfo.completeRestore
>
>
>
> Since I have not read about other people having this issue, I did not
> file a bug report right away. Does somebody have any advice?
>

Can you reproduce this on a newer version of Xen (4.1 or xen-unstable)?

Thanks,
Todd

> Thanks,
> Mark
>
> [1]
> http://bderzhavets.wordpress.com/2010/06/02/setup-libvirt-0-8-0-xen-4-0-on-top-of-ubuntu-10-04-server-via-daniel-baumann-virtualization-ppa/


-- 
Todd Deshane
http://www.linkedin.com/in/deshantm
http://www.xen.org/products/cloudxen.html

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

