
Re: [Xen-devel] soft lockups during live migrate..



On Fri, 23 Oct 2009 15:16:51 -0700
Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> wrote:

> On Fri, 23 Oct 2009 11:09:36 +0100
> Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:
> 
> > At 05:21 +0100 on 23 Oct (1256275309), Mukesh Rathor wrote:
> > > Trying to migrate a 64bit PV guest with 64GB running medium to
> > > heavy load on xen 3.4.0, it is showing lot of soft lockups. The
> > > softlockups are causing dom0 reboot by the cluster FS. The
> > > hardware has 256GB and 32 CPUs.
> > > 
> > > Looking into the hypervisor thru kdb, I see one cpu in
> > > sh_resync_all() while all other 31 appear spinning on the
> > > shadow_lock.
> > 
> > How many vcpus does the guest have?  Scalability issues in the OOS
> > shadow code are more related to number of VCPUs than amount of RAM.
> 
> Actually, things are fine with 32GB/32vcpus. Problem happens with
> 64GB/32vcpus. Trying the unstable version now.

Nah, with c/s 20365 and oos=0 in vm.cfg, it fails right away:

[root@OVM_EL5U3_X86_64_PVM_4GB]# xm migrate -l 3 vega7183
Error: /usr/lib/xen/bin/xc_save 82 3 0 0 1 failed


On source xend.log:

[2009-09-23 16:22:33 16993] DEBUG (balloon:181) Balloon: 199147540 KiB free; 
need 16384; done.
[2009-09-23 16:22:34 16993] DEBUG (XendCheckpoint:110) [xc_save]: 
/usr/lib/xen/bin/xc_save 82 3 0 0 1
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) xc_save: failed to get 
the suspend evtchn port
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) 
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) ERROR Internal error: 
xc_get_m2p_mfns
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) ERROR Internal error: 
Failed to map live M2P table
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) Save exit rc=1
[2009-09-23 16:22:34 16993] ERROR (XendCheckpoint:164) Save failed on domain 
OVM_EL5U3_X86_64_PVM_4GB (3) - resuming.
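
The xend.log records above are wrapped by the mailer, which makes the error lines harder to pick out. A minimal sketch, assuming each record starts with a bracketed timestamp and pid as in the excerpt, that rejoins the wrapped lines and keeps only the error records. The helper name `unwrap` and the `sample` list are illustrative, not part of xend:

```python
import re

# Assumed record prefix, matching the excerpt: "[YYYY-MM-DD HH:MM:SS pid] "
RECORD_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \d+\] ")

def unwrap(lines):
    """Join mail-wrapped continuation lines onto the record they belong to."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines between records
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuation: append to the previous record with one space.
            records[-1] = records[-1].rstrip() + " " + line.strip()
    return records

# Lines copied from the source-side excerpt above.
sample = [
    "[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) ERROR Internal error: ",
    "xc_get_m2p_mfns",
    "[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) ERROR Internal error: ",
    "Failed to map live M2P table",
    "[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) Save exit rc=1",
]

errors = [r for r in unwrap(sample) if "ERROR" in r]
for r in errors:
    print(r)
```

Run against the full log, this shows the xc_get_m2p_mfns failure followed directly by the "Failed to map live M2P table" error.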


On the target, xend.log looks pretty screwy:

domain', ['domid', '3'], ['on_crash', 'restart'], ['uuid', 
'b990db11-57f4-a553-5ee0-c022234f3dd5'], ['bootloader_args', '-q'], ['vcpus', 
'32'], ['name', 'OVM_EL5U3_X86_64_PVM_4GB'], ['on_poweroff', 'destroy'], 
['on_reboot', 'restart'], ['cpus', [['0', '1', '2', '3', '4', '5', '6', '7', 
'8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', 
'21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31'], ['0', '1', 
'2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', 
'16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', 
'29', '30', '31'], ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', 
'11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', 
'24', '25', '26', '27', '28', '29', '30', '31'], ['0', '1', '2', '3', '4', '5', 
'6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', 
'20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31'], ['0', 
'1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', 
'15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', 
'28', '29', '30', '31'], ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 
'10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', 
'23', '24', '25', '26', '27', '28', '29', '30', '31'], ['0', '1', '2', '3', 
'4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', 
'18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', 
'31'], ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', 
'13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', 
'26', '27', '28', '29', '30', '31'] .......
..........

[2009-10-23 16:21:15 11795] DEBUG (image:319) No VNC passwd configured for vfb 
access
[2009-10-23 16:21:15 11795] DEBUG (XendCheckpoint:261) restore:shadow=0x0, 
_static_max=0xfa0000000, _static_min=0x0,
[2009-10-23 16:21:15 11795] DEBUG (balloon:181) Balloon: 264942044 KiB free; 
need 65536000; done.
[2009-10-23 16:21:15 11795] DEBUG (XendCheckpoint:278) [xc_restore]: 
/usr/lib/xen/bin/xc_restore 4 3 1 2 0 0 0
[2009-10-23 16:21:15 11795] INFO (XendCheckpoint:418) ERROR Internal error: 
read: p2m_size
[2009-10-23 16:21:15 11795] INFO (XendCheckpoint:418) Restore exit with rc=1
[2009-10-23 16:21:15 11795] DEBUG (XendDomainInfo:2748) XendDomainInfo.destroy: 
domid=3
[2009-10-23 16:21:15 11795] ERROR (XendDomainInfo:2762) XendDomainInfo.destroy: 
domain destruction failed.
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 
2755, in destroy
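
As a sanity check on the target-side numbers, the balloon request agrees with _static_max if that value is read as bytes. The unit is an assumption; the log does not state it. A short sketch of the arithmetic, with both values copied from the DEBUG lines above:

```python
# Values copied from the target xend.log excerpt.
static_max_bytes = 0xfa0000000   # _static_max, assumed to be in bytes
balloon_need_kib = 65536000      # "need" from the balloon DEBUG line, in KiB

static_max_kib = static_max_bytes // 1024
static_max_mib = static_max_kib // 1024

print(static_max_kib)                      # 65536000
print(static_max_mib)                      # 64000
print(static_max_kib == balloon_need_kib)  # True
```

So the target did reserve the full 64000 MiB for the guest before xc_restore failed on the p2m_size read.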









 

