[Xen-users] "xm save" only works once...
Hi,

I am using Xen 2.0.7 on a dual Intel Xeon 2.8GHz system with 4GB of RAM. Domain 0 runs a 2.6.11 kernel on Debian Sarge with a backported Xen 2.0.7 package (only small changes to the Debian 2.0.6 package, nothing important enough to mention). All kernels were compiled from vanilla kernel sources with the Xen patch applied. The domain Us run 2.6.11 or 2.4.30 (Debian, SUSE).

I have no problems within the domains and everything runs very smoothly, except for one thing (which also did not work correctly for me in Xen 2.0.6): I can save a domain once with "xm save <domainname> <suspendfile>" and restore it again, but a second "xm save ..." simply seems to hang. Nothing happens, and the last entries in the logs are these lines:

==> /var/log/xend.log <==
[2005-08-15 20:12:27 xend] INFO (XendMigrate:380) Save BEGIN: ['save', ['id', '1'], ['state', 'begin'], ['domain', '5'], ['file', '/suspend/vm-ralph']]
[2005-08-15 20:12:27 xend] INFO (XendRoot:113) EVENT> xend.domain.save ['vm-ralph', '5', 'begin', ['save', ['id', '1'], ['state', 'begin'], ['domain', '5'], ['file', '/suspend/vm-ralph']]]

==> /var/log/xfrd.log <==
3808 [INF] XFRD> Accepted connection from 127.0.0.1:3905 on 2
4165 [INF] XFRD> Xfr service for 127.0.0.1:3905
[DEBUG] Conn_init> flags=1
[DEBUG] Conn_init> write stream...
[DEBUG] stream_init>mode=w flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_init> read stream...
[DEBUG] stream_init>mode=r flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_sxpr> (xfr.hello 1 0)[DEBUG] Conn_sxpr< err=0
[DEBUG] Conn_sxpr> (xfr.save 5 "(domain (id 5) (name vm-ralph) (memory 127) (maxmem 128) (state -b---) (cpu 3) (cpu_time 1.583158713) (up_time 1401.25794005) (start_time 1124128146.12) (console (status listening) (id 12) (domain 5) (local_port 12) (remote_port 1) (console_port 9605)) (devices (vif (idx 0) (vif 0) (mac aa:00:00:00:00:22) (vifname vif5.0) (ip 212.79.XXX.XXX/32) (evtchn 17 4) (index 0)) (vbd (idx 0) (vdev 2049) (device 65030) (mode w) (dev sda1) (uname phy:xen-volumes/vm-ralph) (node xen-volumes/vm-ralph) (index 0)) (vbd (idx 1) (vdev 2050) (device 65031) (mode w) (dev sda2) (uname phy:xen-volumes/swap-ralph) (node xen-volumes/swap-ralph) (index 1))) (config (vm (name vm-ralph) (memory 128) (cpu 3) (image (linux (kernel /boot/xen-linux-2.6.11-domu-tops1) (ramdisk /boot/xen-linux-2.6.11-domu-tops1-modules) (root '/dev/sda1 ro'))) (device (vbd (uname phy:xen-volumes/vm-ralph) (dev sda1) (mode w))) (device (vbd (uname phy:xen-volumes/swap-ralph) (dev sda2) (mode w))) (device (vif (mac aa:00:00:00:00:22) (ip 212.79.XXX.XXX/32))))))" /suspend/vm-ralph)
[DEBUG] Conn_sxpr< err=0
[1124129547.387983] xc_linux_save start 5
xc_linux_save start 5

I can strace the "xm save" process, but there is not much action:

xen:/var/log# ps fax |grep xm
 4164 pts/0 S+ 0:00 | \_ python /usr/sbin/xm save vm-ralph /suspend/vm-ralph
xen:/var/log# strace -p 4164
Process 4164 attached - interrupt to quit
recv(3,

An xfrd thread also seems to have been spawned, but it shows more or less the same thing as the "xm save" process:

xen:/var/log# ps fax |grep xfrd
 3808 ? S 0:00 xfrd
 4165 ? SL 0:00 \_ xfrd
xen:/var/log# strace -p 4165
Process 4165 attached - interrupt to quit
read(3,

I can press Ctrl-C and "xm save" aborts with the following error (I waited over 3 minutes):

Traceback (most recent call last):
  File "/usr/sbin/xm", line 9, in ?
    main.main(sys.argv)
  File "/usr/lib/python2.3/site-packages/xen/xm/main.py", line 808, in main
    xm.main(args)
  File "/usr/lib/python2.3/site-packages/xen/xm/main.py", line 106, in main
    self.main_call(args)
  File "/usr/lib/python2.3/site-packages/xen/xm/main.py", line 124, in main_call
    p.main(args[1:])
  File "/usr/lib/python2.3/site-packages/xen/xm/main.py", line 276, in main
    server.xend_domain_save(dom, savefile)
  File "/usr/lib/python2.3/site-packages/xen/xend/XendClient.py", line 244, in xend_domain_save
    {'op' : 'save',
  File "/usr/lib/python2.3/site-packages/xen/xend/XendClient.py", line 148, in xendPost
    return self.client.xendPost(url, data)
  File "/usr/lib/python2.3/site-packages/xen/xend/XendProtocol.py", line 79, in xendPost
    return self.xendRequest(url, "POST", args)
  File "/usr/lib/python2.3/site-packages/xen/xend/XendProtocol.py", line 143, in xendRequest
    resp = conn.getresponse()
  File "/usr/lib/python2.3/httplib.py", line 781, in getresponse
    response.begin()
  File "/usr/lib/python2.3/httplib.py", line 273, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.3/httplib.py", line 231, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.3/socket.py", line 323, in readline
    data = recv(1)
KeyboardInterrupt

After that, it makes no difference whether I shut down and recreate the domain before trying to save it a second time. The hang happens every time after the first successful save and restore, and sometimes even on the first "xm save" attempt.

It also seems that Xen leaves the "half-saved" domain in a broken state, because I cannot shut the domain down correctly after the second "xm save" attempt. I can ssh into it and type "halt", and it shuts down, but Xen ("xm list") still thinks the domain is running. Even "xm destroy <domainname>" doesn't help. I have to reboot the physical machine to get the domain working correctly again.

Because this is supposed to become a production system very soon, I would very much appreciate any help.
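For anyone who wants to reproduce this, the save/restore cycle described above can be scripted so that a hang is reported after a timeout instead of blocking the terminal. This is only a rough sketch: the helper names (cycle_commands, run_with_timeout, repeat_cycles) are made up for illustration, the 180 second timeout simply mirrors the "over 3 minutes" I waited, and it assumes a Python new enough to have subprocess.run (3.5 or later), not the Python 2.3 that xm itself runs on here.

```python
import subprocess
import sys


def cycle_commands(domain, suspend_file):
    """Return the two xm command lines that make up one save/restore cycle."""
    return [
        ["xm", "save", domain, suspend_file],
        ["xm", "restore", suspend_file],
    ]


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome.

    Returns "ok" on exit status 0, "failed" on a non-zero exit status or if
    the command cannot be started, and "hang" if it exceeds timeout_s seconds
    (subprocess.run kills the child process in that case).
    """
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "hang"
    except OSError:
        return "failed"
    return "ok" if result.returncode == 0 else "failed"


def repeat_cycles(domain, suspend_file, cycles=2, timeout_s=180,
                  runner=run_with_timeout):
    """Run several save/restore cycles in a row.

    Returns (cycle_number, command, status) for the first step that is not
    "ok", or None if every step succeeded.
    """
    for n in range(1, cycles + 1):
        for cmd in cycle_commands(domain, suspend_file):
            status = runner(cmd, timeout_s)
            if status != "ok":
                return (n, cmd, status)
    return None


# Example call with the names from this report:
#   repeat_cycles("vm-ralph", "/suspend/vm-ralph")
```

Note that the timeout only kills the "xm" client process. Judging by what I see after pressing Ctrl-C, the domain may still be left in the broken state afterwards, so this helps to detect the hang but does not recover from it.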
More information (like xm dmesg) available on request... ;-PP

--Ralph

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users