[Xen-devel] remus failure -xen 4.0.1: xc_restore failed only at some heavy workload
I have done some experiments with remus and had some problems with its failover.
I set up dom0 and domU as below; the backup server is set up the same as the primary.

Ubuntu 9.10
Xen 4.0.1-rc2
kernel for dom0: 2.6.32.18
kernel for domU: 2.6.18.8

With an idle guest running on dom0, I run remus on the primary server and then destroy the guest or kill remus. In that case remus failover works, and the guest moves from the primary server to the backup server.

For the workload experiments, I run specweb or a kernel compile on the guest while the primary server runs remus. When the guest is destroyed or remus is killed, the guest does not survive on the backup server, even though it was checkpointing before that. While checkpointing, the guest showed the 'p' state on the backup server, but then it disappeared.

xend.log on the backup server shows this error:

----
[XXXX-XX-XX 13:56:50 6038] ERROR (XendCheckpoint:357) /usr/lib/xen/bin/xc_restore 36 92 1 2 0 0 0 0 failed
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 309, in restore
    forkHelper(cmd, fd, handler.handler, True)
  File "/usr/lib/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 411, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_restore 36 92 1 2 0 0 0 0 failed
[XXXX-XX-XX 13:56:50 6038] ERROR (XendDomain:1175) Restore failed
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/xen/xend/XendDomain.py", line 1159, in domain_restore_fd
    dominfo = XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
  File "/usr/lib/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 358, in restore
    raise exn
XendError: /usr/lib/xen/bin/xc_restore 36 92 1 2 0 0 0 0 failed
----

This looks much the same as the earlier question from Shriram Rajagopalan:
http://lists.xensource.com/archives/html/xen-devel/2010-09/msg00369.html

This error also seems to have appeared with Xen live migration in the past. Remus shares functions with live migration, and the error shows up in a Xen live migration function.
Has anyone had a similar experience with either remus or Xen live migration? Has anyone found a cause or a solution for this? I would appreciate any help.

Thank you.
Kyungjin.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel