[Xen-devel] Live migration fails when available memory exactly equal to required memory on target system
In diagnosing live migration failures (with 3.0.testing), I have noticed that a common failure mode is a lack of memory on the target system, and that this only seems to happen when the available memory at the time of the migration is exactly what is required for the VM being migrated. For example, here is a xend.log extract from a failed case:

  [2006-07-17 14:38:56 xend] DEBUG (balloon:128) Balloon: free 265; need 265; done.
  [2006-07-17 14:38:56 xend] DEBUG (XendCheckpoint:148) [xc_restore]: /usr/lib/xen/bin/xc_restore 10 4 112 67584 1 2
  [2006-07-17 14:38:57 xend] ERROR (XendCheckpoint:242) xc_linux_restore start: max_pfn = 10800
  [2006-07-17 14:38:57 xend] ERROR (XendCheckpoint:242) Failed allocation for dom 112: 67584 pages order 0 addr_bits 0
  [2006-07-17 14:38:57 xend] ERROR (XendCheckpoint:242) Failed to increase reservation by 42000 KB: 12
  [2006-07-17 14:38:57 xend] ERROR (XendCheckpoint:242) Restore exit with rc=1

The nr_pfns parameter to xc_restore (67584 pages) shows that we need 264MB; balloon.py added a slop of 1MB to that to come up with the 265 number. Immediately following this failed attempt, I tried again:

  [2006-07-17 14:38:58 xend] DEBUG (balloon:134) Balloon: free 264; need 265; retries: 10.
  [2006-07-17 14:38:58 xend] DEBUG (balloon:143) Balloon: setting dom0 target to 1235.
  [2006-07-17 14:38:58 xend.XendDomainInfo] DEBUG (XendDomainInfo:945) Setting memory target of domain Domain-0 (0) to 1235 MiB.
  [2006-07-17 14:38:58 xend] DEBUG (balloon:128) Balloon: free 265; need 265; done.
  [2006-07-17 14:38:58 xend] DEBUG (XendCheckpoint:148) [xc_restore]: /usr/lib/xen/bin/xc_restore 10 4 113 67584 1 2
  [2006-07-17 14:38:59 xend] ERROR (XendCheckpoint:242) xc_linux_restore start: max_pfn = 10800
  [2006-07-17 14:38:59 xend] ERROR (XendCheckpoint:242) Increased domain reservation by 42000 KB

This time there was only 264MB free, so we had to kick the balloon driver to free up 1MB. Once that was done (and we again had exactly 265MB free), we were able to increase the reservation for the target DomU to the requested amount.

The above is fairly reproducible, but I'm not sure where to go next to figure out where the issue really is, or indeed whether there really is an issue; maybe this is just one of those inherently racy things. However, I find it odd that it only seems to happen when the initial amount of free memory is exactly equal to the amount required. I have plenty of other cases with far more or far less memory available, all of which work just fine.

Any suggestions?

/simgr
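P.S. For anyone who wants to look at the logic in question, the check in balloon.py behaves roughly like the sketch below. This is a paraphrase reconstructed from the log messages above, not the real source: the function names (free_memory_mib, dom0_memory_mib, set_dom0_target, free_for_domain) are hypothetical placeholders, and the constants are inferred from the 1MB slop and the "retries: 10" log line.

  import time

  SLOP_MIB = 1        # balloon.py pads the request by ~1MB of slop
  RETRY_LIMIT = 10    # inferred from the "retries: 10" log line
  SLEEP_SECS = 1

  def free_memory_mib():
      # Placeholder: would query the hypervisor for free host memory,
      # in whole MiB.
      raise NotImplementedError

  def dom0_memory_mib():
      # Placeholder: would return dom0's current allocation in MiB.
      raise NotImplementedError

  def set_dom0_target(target_mib):
      # Placeholder: would set Domain-0's balloon target, as in the
      # "Setting memory target of domain Domain-0" log line.
      raise NotImplementedError

  def free_for_domain(need_mib):
      # Ensure need_mib (plus slop) MiB is free, shrinking dom0 if not.
      need = need_mib + SLOP_MIB
      for _ in range(RETRY_LIMIT):
          free = free_memory_mib()
          if free >= need:
              # "Balloon: free 265; need 265; done." -- the test passes
              # with zero headroom when free == need, which is exactly
              # the case that later fails inside xc_restore.
              return
          # "Balloon: free 264; need 265; retries: 10." -- balloon dom0
          # down by the shortfall and check again.
          set_dom0_target(dom0_memory_mib() - (need - free))
          time.sleep(SLEEP_SECS)
      raise MemoryError("could not free %d MiB for incoming domain" % need)

If the problem really is zero headroom, one quick experiment would be to increase the slop (or require strictly more free memory than needed) and see whether the failures disappear; that would at least tell us whether the restore path needs slightly more than nr_pfns worth of pages.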