
Re: [Xen-devel] xen/stable-2.6.32.x xen-4.1.1 live migration fails with kernels 2.6.39, 3.0.3 and 3.1-rc2



On 09/08/2011 09:50 PM, Konrad Rzeszutek Wilk wrote:
On Thu, Sep 08, 2011 at 02:12:27PM -0400, Konrad Rzeszutek Wilk wrote:
On Thu, Sep 08, 2011 at 01:32:12PM -0400, Konrad Rzeszutek Wilk wrote:
On Wed, Sep 07, 2011 at 09:50:47AM -0400, Konrad Rzeszutek Wilk wrote:
On Wed, Aug 31, 2011 at 03:07:22PM +0200, Andreas Olsowski wrote:
A little update: I now have all machines running on xen-4.1-testing
with xen/stable-2.6.32.x.
That gave me the possibility to run additional tests.

(I also tested xm/xend in addition to xl/libxl, to make sure it's not
an xl/libxl problem.)

I took the liberty to create a new test result matrix that should
provide a better overview (in case someone else wants to get the
whole picture):

So.. I don't think the issue I am seeing is exactly the same. This is
what 'xl' gives me:

Scratch that. I am seeing the error below if I:

1) Create guest on 4GB machine
2) Migrate it to the 32GB box (guest still works)
3) Migrate it to the 4GB box (guest dies - error below shows up and
guest is dead).

With 3.1-rc5 virgin - both Dom0 and DomU. Also Xen 4.1-testing on top of this.

I tried just creating a guest on the 32GB box and migrating it - and while
it did migrate, it was either stuck in a hypercall_page call or crashed later on.

Andreas,

Thanks for reporting this.

Oh wait. At some point you said that 2.6.32.43 worked for you.. Is that still
the case?
(Ignore e-mail from a few minutes ago, accidentally did not reply-all)

Did I? I will have to check my sent emails, but I'm pretty sure that if I had found a way that works, I would normally use it.

But I can try an older version later today.

Btw, although you get the same error as I do, the circumstances are slightly different.

This does not necessarily have something to do with the amount of memory.
I also see this between hosts that have the same amount of RAM but are different hardware platforms.


Can you please try one thing for me - can you make sure the boxes have the exact
same amount of memory? You can do 'mem=X' on the Xen hypervisor line to set that.
I am running with mem=8G and have turned ballooning of dom0 off.

        multiboot       /boot/xen.gz placeholder dom0_mem=8192M
        module /boot/vmlinuz-2.6.32.45-xen0 placeholder root=UUID=216ff902-b505-45c4-9bcb-9d63b4cb8992 ro mem=8G nomodeset console=tty0 console=ttyS1,57600 earlyprintk=xen


For some reason though, the two r610s show:
root@netcatarina:~# cat /proc/meminfo
MemTotal:        8378236 kB
root@netcatarina:~#  xl list |grep Domain-0
Domain-0 0 7445 8 r----- 124304.7

root@memoryana:~# cat /proc/meminfo
MemTotal:        8378236 kB
root@memoryana:~# xl list |grep Domain-0
Domain-0 0 7445 8 r----- 132125.0

whereas the r710 shows:
root@tarballerina:~# cat /proc/meminfo
MemTotal:        7886716 kB
root@tarballerina:~#  xl list |grep Domain-0
Domain-0 0 7221 8 r----- 64497.0
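
To put numbers on that gap, here is a minimal Python sketch that parses the MemTotal and `xl list` lines quoted above. The helper names (mem_total_kb, xl_dom0_mib) are made up for illustration and are not part of any Xen tool; the input strings are copied from the output above, and the `xl list` column order (Name, ID, Mem, ...) is assumed to match what is shown there.

```python
import re

def mem_total_kb(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    m = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if m is None:
        raise ValueError("no MemTotal line found")
    return int(m.group(1))

def xl_dom0_mib(xl_list_line):
    """Return the Mem column (MiB) from one 'xl list' row.

    Assumes the column order Name, ID, Mem, VCPUs, State, Time(s).
    """
    return int(xl_list_line.split()[2])

# Values copied from the transcripts above.
r610_meminfo = "MemTotal:        8378236 kB"
r710_meminfo = "MemTotal:        7886716 kB"
r610_xl = "Domain-0 0 7445 8 r----- 124304.7"
r710_xl = "Domain-0 0 7221 8 r----- 64497.0"

memtotal_gap_kb = mem_total_kb(r610_meminfo) - mem_total_kb(r710_meminfo)
xl_gap_mib = xl_dom0_mib(r610_xl) - xl_dom0_mib(r710_xl)

print("MemTotal gap:", memtotal_gap_kb, "kB =", memtotal_gap_kb // 1024, "MiB")
print("xl list gap: ", xl_gap_mib, "MiB")
```

With the numbers above, the r710 dom0 reports 491520 kB (480 MiB) less MemTotal and 224 MiB less in `xl list` than the r610s.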

On a side note:

root@tarballerina:~# xl mem-set Domain-0 8192
libxl: error: libxl.c:2119:libxl_set_memory_target cannot get memory info from /local/domain/0/memory/static-max
: No such file or directory

The two r610s can set their memory with xl just fine.


I think the problem you are running into is that you are migrating between
different CPU families... Is the /proc/cpuinfo drastically different between
the boxes?
diff:
< model              : 26
< model name : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
< stepping   : 5
< cpu MHz            : 2261.074
< cache size : 8192 KB
---
> model              : 44
> model name : Intel(R) Xeon(R) CPU           E5640  @ 2.67GHz
> stepping   : 2
> cpu MHz            : 2660.050
> cache size : 12288 KB
13,14c13,14
< flags : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf pni est ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm ida
< bogomips   : 4522.14
---
> flags : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall lm constant_tsc rep_good nonstop_tsc aperfmperf pni pclmulqdq est ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm ida arat
> bogomips   : 5320.10

The different flags are: nx, pclmulqdq, aes and arat.
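
For anyone who would rather not compare the two flags lines by eye, here is a minimal Python sketch that does it with set differences. The function name flag_diff is hypothetical; the two strings are copied from the diff output above (E5520 on the "<" side, E5640 on the ">" side).

```python
def flag_diff(flags_a, flags_b):
    """Return (only_in_a, only_in_b) for two /proc/cpuinfo 'flags' lines."""
    def parse(line):
        # Accept either the raw "flags : fpu de ..." line or a bare flag list.
        _, _, rest = line.partition(":")
        return set((rest if rest else line).split())

    a, b = parse(flags_a), parse(flags_b)
    return sorted(a - b), sorted(b - a)

# Copied from the diff above.
E5520 = ("flags : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat "
         "clflush acpi mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc "
         "rep_good nonstop_tsc aperfmperf pni est ssse3 cx16 sse4_1 sse4_2 "
         "popcnt hypervisor lahf_lm ida")
E5640 = ("flags : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat "
         "clflush acpi mmx fxsr sse sse2 ss ht syscall lm constant_tsc "
         "rep_good nonstop_tsc aperfmperf pni pclmulqdq est ssse3 cx16 "
         "sse4_1 sse4_2 popcnt aes hypervisor lahf_lm ida arat")

only_e5520, only_e5640 = flag_diff(E5520, E5640)
print("only on E5520:", only_e5520)
print("only on E5640:", only_e5640)
```

On the two lines above this reports nx as present only on the E5520 side, and aes, arat and pclmulqdq as present only on the E5640 side.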

And that's just the r610 and the r710. The CPU in the 2950 is older: a completely different platform, a different chipset, and no on-chip memory controller.

--
Andreas Olsowski
Leuphana Universität Lüneburg
Rechen- und Medienzentrum
Scharnhorststraße 1, C7.015
21335 Lüneburg

Tel: ++49 4131 677 1309


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

