
[Xen-users] Crash *always* after approximately 1024 migrations



Hi,

We are running Xen 3 guests on NFS. One of our stress tests involves repeated live migration of a domU between 2 Xen hosts. Here are the basic steps:

* start domU on hostA
* start script on hostB that will live migrate to hostA once it sees
  running domU
* start script on hostA that will live migrate to hostB once it sees
  running domU (this starts the sequence)
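
For anyone wanting to reproduce this, the per-host script could look roughly like the sketch below. It is not our actual script. It assumes Python 3.7+ is available in dom0, that the domU is visible through `xm list <name>`, and that "running" means "listed and not paused". The helper names (`domain_state`, `migrate_when_present`, `run_loop`) and the 5 second polling interval are made up for illustration.

```python
import subprocess
import time


def domain_state(name, run=subprocess.run):
    """Return the State column from `xm list <name>`, or None if not listed here."""
    result = run(["xm", "list", name], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    for line in result.stdout.splitlines():
        fields = line.split()
        # Expected columns: Name ID Mem(MiB) VCPUs State Time(s)
        if len(fields) >= 6 and fields[0] == name:
            return fields[4]
    return None


def migrate_when_present(name, target, run=subprocess.run,
                         poll_seconds=5, max_polls=None):
    """Poll until `name` is listed and not paused, then live migrate it to `target`.

    Returns True if `xm migrate` exited with status 0, otherwise False
    (including when max_polls is exhausted without seeing the domain).
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        state = domain_state(name, run)
        # Assumption: a 'p' flag means the incoming migration is not finished yet.
        if state is not None and "p" not in state:
            result = run(["xm", "migrate", "--live", name, target],
                         capture_output=True, text=True)
            return result.returncode == 0
        polls += 1
        time.sleep(poll_seconds)
    return False


def run_loop(name, target, limit=None, **kwargs):
    """Keep migrating `name` away to `target`; return the number of successful migrations."""
    count = 0
    while limit is None or count < limit:
        if not migrate_when_present(name, target, **kwargs):
            break
        count += 1
        print(f"migrations away from this host: {count}", flush=True)
    return count
```

On hostA this would be started as `run_loop("toxenc03", "hostB")` and on hostB as `run_loop("toxenc03", "hostA")`. The printed counter corresponds to the "migrations away" number quoted below.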

The test always fails at close to 1024 migrations away from a host (i.e. the guest itself will have been migrated about 2048 times in total). I say close because it never quite reaches 1024: earlier runs crashed at 1018, 1020 and 1023, and tonight's run crashed at 1021.

The domU can be under heavy load or no load, it makes no difference.

One of the hosts freezes solid. The guest has been migrated to the other host but is still in the paused state:

Name                              ID Mem(MiB) VCPUs State  Time(s)
Domain-0                           0     1000     1 r----- 11728.2
toxenc03                         1022      512     1 --p---     0.0
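
To catch this state automatically between migrations, the listing can be parsed and checked for the `p` flag. This is only a sketch: it assumes the six-column `xm list` layout shown above and domain names without spaces, and the function names `parse_xm_list` and `paused_domains` are hypothetical.

```python
def parse_xm_list(text):
    """Parse `xm list` output into a list of dicts, skipping the header line."""
    domains = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) != 6:
            continue  # ignore blank or unexpected lines
        name, dom_id, mem, vcpus, state, cpu_time = fields
        domains.append({
            "name": name,
            "id": int(dom_id),
            "mem_mib": int(mem),
            "vcpus": int(vcpus),
            "state": state,
            "time_s": float(cpu_time),
        })
    return domains


def paused_domains(text):
    """Return the names of domains whose State column contains the 'p' flag."""
    return [d["name"] for d in parse_xm_list(text) if "p" in d["state"]]


sample = """\
Name                              ID Mem(MiB) VCPUs State  Time(s)
Domain-0                           0     1000     1 r----- 11728.2
toxenc03                         1022      512     1 --p---     0.0
"""
```

Calling `paused_domains(sample)` on the listing above returns `['toxenc03']`, and the parsed domain ID could be logged alongside the migration counter.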

I am hoping this is a known problem. If it is not, I will gather more information before and after each migration, document the exact sequence of events, and repeat the test twice to ensure some consistency. If there are any special points of interest that I should gather, please advise.

We are using beta releases of both SLES10 and FC5. Kernels have been both vendor-supplied and custom-compiled; the behaviour is the same across the board. Server pairs were either 2x HP-DL380G4 or 1x IBM-x366 and 1x HP-DL380G4.

Thanks

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
