
[Xen-users] Live migration fails when source machine has multiple domUs



Hi list,

I seem to have encountered a bug that has been reported a few times on this list, but there is no entry for it in the bugzilla and no one seems to have posted a resolution.

I have a three-node RHEL cluster running some paravirtualised virtual machines, each using a CLVM logical volume block device as its storage. There are no cluster file systems involved, and the block device for each virtual machine is accessible on all three dom0 servers.

All dom0 and all domU are x86_64 RHEL 5.2 (also tried CentOS 5.2).

Live migration works perfectly when there's only one virtual machine involved. However, if two virtual machines are running on one server and I try to migrate one of them to another server, xend starts to migrate the state (copies all the memory, etc.) and then I get this error on the domU console:

WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
netif_release_rx_bufs: 0 xfer, 62 noxfer, 194 unused
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!


Apologies for the long email, but I've also included the xend.log output from the source dom0 server below. I've seen this reported on the list before, and it always involves network-based shared storage, whether that's iSCSI, DRBD or GNBD (my case). As far as I can tell, the migration works fine and the VM's state transfers completely, but xend then has a problem trying to relinquish device 51712 (which is the xvda disk). The 'exception looking up device number for xvda' message also makes me suspicious.
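
For anyone wondering how 51712 maps to xvda: a minimal sketch, assuming the classic 16-bit Linux device encoding (major in the high byte, minor in the low byte), the usual xvd major of 202 and 16 minors per disk. The function names are hypothetical and are not part of xend.

```python
XVD_MAJOR = 202        # assumption: major number used for xvd block devices
MINORS_PER_DISK = 16   # assumption: whole disk plus 15 partitions per xvd disk


def decode_vbd(devnum):
    """Split a device number into (major, minor) using the classic
    16-bit encoding: major in the high byte, minor in the low byte."""
    return devnum >> 8, devnum & 0xFF


def vbd_name(devnum):
    """Return the xvd name (e.g. 'xvda', 'xvda1') for a device number."""
    major, minor = decode_vbd(devnum)
    if major != XVD_MAJOR:
        raise ValueError("not an xvd device: major %d" % major)
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "xvd" + chr(ord("a") + disk)
    # Partition 0 means the whole disk, so no numeric suffix is added.
    return name + (str(part) if part else "")


print(decode_vbd(51712))
print(vbd_name(51712))
```

Under those assumptions 51712 is 202 * 256 + 0, i.e. the whole first xvd disk, which matches the 'dev': 'xvda' entry in the log.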

Any help is much appreciated!

Regards,
Tom

xend.log output follows:

[2008-08-24 00:27:43 xend 5252] DEBUG (balloon:127) Balloon: 26652 KiB free; need 25600; done.
[2008-08-24 00:27:43 xend 5252] DEBUG (XendCheckpoint:89) [xc_save]: /usr/lib64/xen/bin/xc_save 22 9 0 0 1
[2008-08-24 00:27:43 xend 5252] INFO (XendCheckpoint:351) ERROR Internal error: Couldn't enable shadow mode
[2008-08-24 00:27:43 xend 5252] INFO (XendCheckpoint:351) Save exit rc=1
[2008-08-24 00:27:43 xend 5252] ERROR (XendCheckpoint:133) Save failed on domain nodea (9).
Traceback (most recent call last):
File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 110, in save
   forkHelper(cmd, fd, saveInputHandler, False)
File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 339, in forkHelper
   raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib64/xen/bin/xc_save 22 9 0 0 1 failed
[2008-08-24 00:27:43 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:1601) XendDomainInfo.resumeDomain(9)
[2008-08-24 00:27:43 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:1614) XendDomainInfo.resumeDomain: devices released
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:791) Storing domain details: {'console/ring-ref': '2057005', 'console/port': '2', 'name': 'migrating-nodea', 'console/limit': '1048576', 'vm': '/vm/b845f914-33a3-e1cf-551e-01b6d346b92b', 'domid': '9', 'cpu/0/availability': 'online', 'memory/target': '6144000', 'store/ring-ref': '2049294', 'store/port': '1'}
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:110) DevController: writing {'backend-id': '0', 'mac': '00:16:3e:6c:ae:9f', 'handle': '0', 'state': '1', 'backend': '/local/domain/0/backend/vif/9/0'} to /local/domain/9/device/vif/0.
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:112) DevController: writing {'bridge': 'br102', 'domain': 'migrating-nodea', 'handle': '0', 'script': '/etc/xen/scripts/vif-bridge', 'state': '1', 'frontend': '/local/domain/9/device/vif/0', 'mac': '00:16:3e:6c:ae:9f', 'online': '1', 'frontend-id': '9'} to /local/domain/0/backend/vif/9/0.
[2008-08-24 00:27:44 xend 5252] DEBUG (blkif:24) exception looking up device number for xvda: [Errno 2] No such file or directory: '/dev/xvda'
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:110) DevController: writing {'backend-id': '0', 'virtual-device': '51712', 'device-type': 'disk', 'state': '1', 'backend': '/local/domain/0/backend/vbd/9/51712'} to /local/domain/9/device/vbd/51712.
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:112) DevController: writing {'domain': 'migrating-nodea', 'frontend': '/local/domain/9/device/vbd/51712', 'format': 'raw', 'dev': 'xvda', 'state': '1', 'params': '/dev/int_vg/os_nodea', 'mode': 'w', 'online': '1', 'frontend-id': '9', 'type': 'phy'} to /local/domain/0/backend/vbd/9/51712.
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:1626) XendDomainInfo.resumeDomain: devices created
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] ERROR (XendDomainInfo:1631) XendDomainInfo.resume: xc.domain_resume failed on domain 9.
Traceback (most recent call last):
File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1628, in resumeDomain
   xc.domain_resume(self.domid, fast)
Error: (1, 'Internal error', "Couldn't map start_info")
[2008-08-24 00:27:44 xend 5252] DEBUG (XendCheckpoint:136) XendCheckpoint.save: resumeDomain
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:45 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
-------many repeats-------
[2008-08-24 00:28:14 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1728) Dev still active but hit max loop timeout
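
If anyone wants to compare their own logs against this one, here is a rough sketch for counting the "still active, looping" messages per device and checking whether the timeout was hit. It assumes the message wording shown in the log above; the function name is hypothetical and not part of xend.

```python
import re
from collections import Counter

# Assumption: message wording as it appears in the xend.log excerpt above.
LOOPING = re.compile(r"Dev (\d+) still active, looping")
TIMEOUT = "Dev still active but hit max loop timeout"


def summarise_looping(log_text):
    """Return (loops, timed_out): a Counter of looping messages keyed by
    device number, and whether the max loop timeout message appears."""
    loops = Counter(int(dev) for dev in LOOPING.findall(log_text))
    return loops, TIMEOUT in log_text


sample = (
    "[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) "
    "Dev 51712 still active, looping...\n"
    "[2008-08-24 00:27:45 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) "
    "Dev 51712 still active, looping...\n"
    "[2008-08-24 00:28:14 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1728) "
    "Dev still active but hit max loop timeout\n"
)
loops, timed_out = summarise_looping(sample)
print(dict(loops), timed_out)
```

The regular expression is searched across the whole text rather than line by line, so it also works on logs where several entries have been wrapped onto one line.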
