
[Xen-devel] save/restore/save hangs



Hi,

I am running Xen 3.4.2 with a 2.6.31.5 pv_ops kernel from
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=xen/master
in both dom0 and domU. Both dom0 and domU are 64-bit, kernel and
userland.

-----8<----
# xm info
host                   : host1
release                : 2.6.31.5x_xen0nogrsecurity-BL5.3
version                : #1 SMP Thu Nov 12 02:43:22 UTC 2009
machine                : x86_64
nr_cpus                : 2
nr_nodes               : 1
cores_per_socket       : 1
threads_per_core       : 1
cpu_mhz                : 2192
hw_caps                : 078bf3ff:e1d3fbff:00000000:00000010:00000000:00000000:00000000:00000000
virt_caps              :
total_memory           : 4031
free_memory            : 2052
node_to_cpu            : node0:0-1
node_to_memory         : node0:2052
xen_major              : 3
xen_minor              : 4
xen_extra              : .2
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
cc_compiler            : gcc version 4.4.2 20091026 (release) (PLD-Linux)
cc_compile_by          :
cc_compile_domain      : ics.p.lodz.pl
cc_compile_date        : Thu Nov 12 09:31:48 UTC 2009
xend_config_format     : 4
-----8<----

After one save and restore, a subsequent save hangs the domU:

-----8<----
[root@host1]# xm create /etc/xen/machines/guest4.xen
Using config file "/etc/xen/machines/guest4.xen".
Started domain guest4 (id=3)
[root@host1]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  3728     2     r-----    688.0
guest4                                       3   256     1     -b----      4.1
[root@host1]# xm save 3 /var/tmp/guest4-3.xen-save
[root@host1]# ls -l /var/tmp/*save
-rwxr-xr-x 1 root root 268977193 11-13 00:50 /var/tmp/guest4-3.xen-save
[root@host1]# xm restore /var/tmp/guest4-3.xen-save
[root@host1]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  3728     2     r-----    701.5
guest4                                       4   256     1     r-----     45.6
-----8<----
[me@workstation]$ ssh me@guest4
Last login: Fri Nov 13 00:33:07 2009 from grey.cluster.turystyka.com.pl
[me@guest4]$ 
-----8<----

Now I try to save the guest again, but only about 1.5 kB of data is saved and
the guest hangs:

-----8<----
[root@host1]# xm save 4 /var/tmp/guest4-4.xen-save
Error: /usr/lib64/xen/bin/xc_save 52 4 0 0 0 failed
Usage: xm save [-c] <Domain> <CheckpointFile>

Save a domain state to restore later.
  -c, --checkpoint               Leave domain running after creating
                                 snapshot

[root@host1]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  3728     2     r-----    704.3
guest4                                       4   256     1     -b----    208.0
[root@host1]# ls -l /var/tmp/*save
-rwxr-xr-x 1 root root 268977193 11-13 00:50 /var/tmp/guest4-3.xen-save
-rwxr-xr-x 1 root root      1558 11-13 00:56 /var/tmp/guest4-4.xen-save
[root@host1]#
-----8<----

I cannot log in to the guest from either the console or ssh. The only thing I
can do is destroy it. By the way, dom0 seems to be robust: guest save/restore
problems have no impact on dom0 stability.
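
For anyone who wants to repeat the save/restore/save sequence without typing
it by hand, here is a minimal sketch in Python. It only wraps the same xm
commands shown in the transcript above. The function name, the `run`
parameter, and the use of the domain name instead of the numeric ID (which
changes after restore) are my own assumptions for illustration, not part of
the Xen tools.

```python
import subprocess


def save_restore_save(domain, first_image, second_image, run=subprocess.call):
    """Run `xm save`, `xm restore`, `xm save` and collect the exit codes.

    `domain` is assumed to be a domain name that xm accepts in place of
    the numeric ID. `run` is a hypothetical hook so the sequence can be
    exercised without a Xen host; by default it executes the commands.
    """
    steps = [
        ["xm", "save", domain, first_image],
        ["xm", "restore", first_image],
        ["xm", "save", domain, second_image],
    ]
    results = []
    for cmd in steps:
        rc = run(cmd)
        results.append((cmd, rc))
        if rc != 0:
            # Stop at the first failing step; in this report that is
            # expected to be the second save.
            break
    return results
```

Called as save_restore_save("guest4", "/var/tmp/a.xen-save",
"/var/tmp/b.xen-save") on dom0, the last entry of the returned list shows
which step failed and with what exit code.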

As you might guess, migration behaves the same way:

[root@host1]# xm create /etc/xen/machines/guest4.xen
Using config file "/etc/xen/machines/guest4.xen".
Started domain guest4 (id=1)
[root@host1]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  3728     2     r-----    666.4
guest4                                       1   256     1     -b----      4.1
[root@host1]# xm migrate 1 host2
[root@host1]#
-----8<----
[root@host2]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   909     2     r-----    740.0
guest4                                       5   256     1     r-----     27.4
[root@host2]# xm console 5
guest4 login: 
-----8<----
[me@workstation]$ ssh me@guest4
Last login: Thu Nov 12 16:23:13 2009 from xxxx
[me@guest4]$ 
-----8<----

Everything looks fine up to this point, but when I try to migrate guest4 back
to host1 I get:

[root@host2]# xm migrate 5 host1
Error: /usr/lib64/xen/bin/xc_save 52 5 0 0 0 failed
Usage: xm migrate <Domain> <Host>
[---]

-----8<----
[me@workstation]$ ssh me@guest4
ssh: connect to host guest4 port 22: No route to host
[me@workstation]$
-----8<----

These scenarios are 100% reproducible.

When the domU hangs during save, the xend log shows:

-----8<----
[2009-11-13 00:56:54 3695] DEBUG (XendCheckpoint:110) [xc_save]: /usr/lib64/xen/bin/xc_save 52 4 0 0 0
[2009-11-13 00:56:54 3695] INFO (XendCheckpoint:417) xc_save: failed to get the suspend evtchn port
[2009-11-13 00:56:54 3695] DEBUG (XendCheckpoint:388) suspend
[2009-11-13 00:56:54 3695] INFO (XendCheckpoint:417)
[2009-11-13 00:56:54 3695] DEBUG (XendCheckpoint:113) In saveInputHandler suspend
[2009-11-13 00:56:54 3695] DEBUG (XendCheckpoint:115) Suspending 4 ...
[2009-11-13 00:56:54 3695] DEBUG (XendDomainInfo:511) XendDomainInfo.shutdown(suspend)
[2009-11-13 00:56:54 3695] DEBUG (XendDomainInfo:1709) XendDomainInfo.handleShutdownWatch
[2009-11-13 00:56:54 3695] DEBUG (XendDomainInfo:1709) XendDomainInfo.handleShutdownWatch
[2009-11-13 00:56:54 3695] INFO (XendDomainInfo:1903) Domain has shutdown: name=migrating-test44 id=4 reason=suspend.
[2009-11-13 00:56:54 3695] INFO (XendCheckpoint:121) Domain 4 suspended.
[2009-11-13 00:56:54 3695] INFO (XendCheckpoint:417) ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
[2009-11-13 00:56:54 3695] INFO (XendCheckpoint:417) ERROR Internal error: entry 0: p2m_frame_list[0] is 0x2033343af6353a30, max 0xfbff0
[2009-11-13 00:56:55 3695] INFO (XendCheckpoint:417) ERROR Internal error: Failed to map/save the p2m frame list
[2009-11-13 00:56:55 3695] INFO (XendCheckpoint:417) Save exit rc=1
[2009-11-13 00:56:55 3695] DEBUG (XendCheckpoint:130) Written done
[2009-11-13 00:56:55 3695] ERROR (XendCheckpoint:164) Save failed on domain test44 (4) - resuming.
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 132, in save
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 405, in forkHelper
XendError: /usr/lib64/xen/bin/xc_save 52 4 0 0 0 failed
[2009-11-13 00:56:55 3695] DEBUG (XendDomainInfo:2788) XendDomainInfo.resumeDomain(4)
[2009-11-13 00:56:55 3695] DEBUG (XendDomainInfo:2829) XendDomainInfo.resumeDomain: completed
-----8<----
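
The p2m_frame_list entry reported in the log is far larger than the reported
max. To make that comparison explicit, here is a small Python sketch that
pulls the two numbers out of the error line (with the wrapped log line joined
back into one) and compares them. The regular expression and the function
name are my own; this is only a comparison of the two values the log prints,
not a copy of the check xc_save performs internally.

```python
import re

# Matches the "entry N: p2m_frame_list[N] is 0x..., max 0x..." message
# as it appears in the xend log above.
P2M_LINE = re.compile(
    r"p2m_frame_list\[(\d+)\] is (0x[0-9a-fA-F]+), max (0x[0-9a-fA-F]+)"
)


def parse_p2m_error(line):
    """Return (index, frame, limit, out_of_range), or None if no match."""
    match = P2M_LINE.search(line)
    if match is None:
        return None
    index = int(match.group(1))
    frame = int(match.group(2), 16)
    limit = int(match.group(3), 16)
    return index, frame, limit, frame > limit


line = ("ERROR Internal error: entry 0: p2m_frame_list[0] is "
        "0x2033343af6353a30, max 0xfbff0")
index, frame, limit, out_of_range = parse_p2m_error(line)
print("entry %d: frame %#x, max %#x, out of range: %s"
      % (index, frame, limit, out_of_range))
```

For the line from my log, the frame value exceeds the max by many orders of
magnitude, so entry 0 of the list does not look like a valid frame number at
all.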

Regards,

-- 
Bartosz Lis @ Inst. of Information Technology, Technical Univ. of Lodz
   bartoszl @ ics.p.lodz.pl



 

