
RE: Re: [Xen-users] Live migration problem


  • To: <xen-users@xxxxxxxxxxxxxxxxxxx>
  • From: "Cole, Ray" <Ray_Cole@xxxxxxx>
  • Date: Wed, 24 Aug 2005 18:01:32 -0500
  • Delivery-date: Wed, 24 Aug 2005 22:59:46 +0000
  • List-id: Xen user discussion <xen-users.lists.xensource.com>
  • Thread-index: AcWoyWDtMCeV8k9BQ1akWlkHp16ukQAMIikQAAE2KsA=
  • Thread-topic: Re: [Xen-users] Live migration problem

One thing I don't understand: when I look at the suspend_and_state function 
in xc_linux_save.c, it appears to do the following:

  xcio_suspend_domain(ioctxt);

retry:
 
  ... stuff tries to see if xcio_suspend_domain worked - is that correct?

Should the xcio_suspend_domain() call be after the retry: label?  Or does that 
xcio_suspend_domain call guarantee the message is delivered and the code under 
retry: is just waiting for the state to now change?
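
To make the question concrete, here is a minimal sketch of the "request once, 
then poll" pattern that the code appears to follow. It is not the actual 
xc_linux_save.c code: suspend_and_wait, request_fn, is_suspended_fn and the 
fake_domain helpers are hypothetical names invented for illustration. The 
sketch assumes the request is delivered reliably, so only the state check 
sits inside the retry loop.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical callbacks standing in for the real suspend request and
 * the real domain-state query; these are NOT the libxc/xfrd API. */
typedef void (*request_fn)(void *ctx);
typedef bool (*is_suspended_fn)(void *ctx);

/* Send the suspend request exactly once, then poll the state up to
 * max_tries times, sleeping sleep_us microseconds after each failed check.
 * Returns 0 once the domain reports suspended, -1 if the tries run out. */
int suspend_and_wait(void *ctx, request_fn request,
                     is_suspended_fn is_suspended,
                     int max_tries, unsigned int sleep_us)
{
    request(ctx);                 /* issued once, before the retry loop */

    for (int i = 0; i < max_tries; i++) {
        if (is_suspended(ctx))
            return 0;             /* state changed: done */
        usleep(sleep_us);         /* wait before checking again */
    }
    return -1;                    /* never saw the suspended state */
}

/* Toy domain for trying the pattern out: it reports "not suspended"
 * for the first pending_polls checks, then "suspended". */
struct fake_domain {
    int requests;       /* how many suspend requests were sent */
    int pending_polls;  /* checks left that will report "not suspended" */
};

static void fake_request(void *ctx)
{
    ((struct fake_domain *)ctx)->requests++;
}

static bool fake_is_suspended(void *ctx)
{
    struct fake_domain *d = ctx;
    if (d->pending_polls > 0) {
        d->pending_polls--;
        return false;
    }
    return true;
}
```

If the request could be lost, the request(ctx) call would have to move inside 
the loop so that each retry re-sends it; that is the alternative the question 
above is asking about.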

Second, I see it tries 100 times with a 10,000-microsecond sleep in between, 
so it waits only about 1 second in total for the domain to suspend.  I realize 
1 second is a really long time in computing terms, but I'm wondering which 
conditions must all be true before a domain can be suspended.  Could there 
legitimately be times when suspension takes longer than a second?
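
The arithmetic behind that 1-second figure can be sketched as below. The 
helper names wait_budget_us and tries_for_timeout are hypothetical, and the 
calculation assumes the sleep interval is greater than zero and ignores the 
time spent in each state check, so the real wait is at least this long.

```c
#include <stdint.h>

/* Worst-case time spent sleeping, in microseconds, for a polling loop
 * that sleeps sleep_us after each of `tries` failed checks. */
uint64_t wait_budget_us(uint32_t tries, uint32_t sleep_us)
{
    return (uint64_t)tries * sleep_us;
}

/* Number of tries needed to cover timeout_us, rounding up.
 * Assumes sleep_us > 0. */
uint32_t tries_for_timeout(uint64_t timeout_us, uint32_t sleep_us)
{
    return (uint32_t)((timeout_us + sleep_us - 1) / sleep_us);
}
```

With 100 tries and a 10,000-microsecond sleep, wait_budget_us gives 1,000,000 
microseconds, i.e. 1 second.  Allowing 5 seconds at the same interval would 
take 500 tries.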

-- Ray

>  -----Original Message-----
> From:         Cole, Ray  
> Sent: Wednesday, August 24, 2005 5:33 PM
> To:   'xen-users@xxxxxxxxxxxxxxxxxxx'
> Subject:      RE: Re: [Xen-users] Live migration problem
> 
> This is a snippet from the originating host's xfrd.log when the failure 
> happened.  Meanwhile the other side's xfrd.log looked good until it got an 
> 'Error when reading from state file'.
> 
> 
> -- Ray
> 
> 
> -----
> 
> 
>  1: sent 45431, skipped 3721, 
> 
>  1: sent 45431, skipped 3721, delta 16818ms, dom0 23%, target 9%, sent 
> 88Mb/s, dirtied 8Mb/s 4557 pages
> [1124922400.612070] Saving memory pages: iter 2   0%
> Saving memory pages: iter 2   0% 22%
>  22% 47%
>  47% 71%
>  71%
>  2: sent 4054, skipped 500, 
> 
>  2: sent 4054, skipped 500, delta 1571ms, dom0 22%, target 21%, sent 84Mb/s, 
> dirtied 64Mb/s 3116 pages
> [1124922402.183740] Saving memory pages: iter 3   0%
> Saving memory pages: iter 3   0% 31%
>  31% 73%
>  73%
>  3: sent 2693, skipped 420, 
> 
>  3: sent 2693, skipped 420, delta 1022ms, dom0 23%, target 19%, sent 86Mb/s, 
> dirtied 65Mb/s 2053 pages
> [1124922403.205810] Saving memory pages: iter 4   0%
> Saving memory pages: iter 4   0% 48%
>  48%
>  4: sent 1874, skipped 176, 
> 
>  4: sent 1874, skipped 176, delta 705ms, dom0 23%, target 17%, sent 87Mb/s, 
> dirtied 61Mb/s 1315 pages
> [1124922403.911777] Saving memory pages: iter 5   0%
> Saving memory pages: iter 5   0% 87%
>  87%
>  5: sent 1107, skipped 201, 
> 
>  5: sent 1107, skipped 201, delta 470ms, dom0 21%, target 40%, sent 77Mb/s, 
> dirtied 128Mb/s 1846 pages
> [1124922404.382522] Saving memory pages: iter 6   0%
> Saving memory pages: iter 6   0% 63%
>  63%
>  6: sent 1491, skipped 349, 
> 
>  6: sent 1491, skipped 349, delta 609ms, dom0 22%, target 46%, sent 80Mb/s, 
> dirtied 142Mb/s 2647 pages
> [1124922404.992153] Saving memory pages: iter 7   0%
> Saving memory pages: iter 7   0% 38%
>  38% 79%
>  79%
>  7: sent 2348, skipped 295, 
> 
>  7: sent 2348, skipped 295, delta 890ms, dom0 23%, target 18%, sent 86Mb/s, 
> dirtied 102Mb/s 2797 pages
> [1124922405.882768] Saving memory pages: iter 8   0%
> Saving memory pages: iter 8   0% 38%
>  38% 84%
>  84%
>  8: sent 2409, skipped 384, 
> 
>  8: sent 2409, skipped 384, delta 860ms, dom0 24%, target 6%, sent 91Mb/s, 
> dirtied 27Mb/s 713 pages
> [1124922406.742903] Saving memory pages: iter 9   0%
> Saving memory pages: iter 9   0%
>  9: sent 624, skipped 83, 
> 
>  9: sent 624, skipped 83, delta 230ms, dom0 23%, target 9%, sent 88Mb/s, 
> dirtied 58Mb/s 410 pages
> [1124922406.973505] Saving memory pages: iter 10   0%
> Saving memory pages: iter 10   0%
>  10: sent 404, skipped 0, 
> 
>  10: sent 404, skipped 0, delta 142ms, dom0 26%, target 6%, sent 93Mb/s, 
> dirtied 51Mb/s 223 pages
> [1124922407.118014] Saving memory pages: iter 11   0%
> Saving memory pages: iter 11   0%
>  11: sent 127, skipped 89, 
> 
>  11: sent 127, skipped 89, delta 47ms, dom0 29%, target 6%, sent 88Mb/s, 
> dirtied 150Mb/s 216 pages
> [1124922407.163792] Saving memory pages: iter 12   0%
> Saving memory pages: iter 12   0%
>  12: sent 210, skipped 0, 
> 
>  12: sent 210, skipped 0, delta 78ms, dom0 25%, target 10%, sent 88Mb/s, 
> dirtied 132Mb/s 315 pages
> [1124922407.242383] Saving memory pages: iter 13   0%
> Saving memory pages: iter 13   0%
>  13: sent 309, skipped 0, 
> 
>  13: sent 309, skipped 0, delta 113ms, dom0 25%, target 7%, sent 89Mb/s, 
> dirtied 91Mb/s 317 pages
> [1124922407.355431] Saving memory pages: iter 14   0%
> Saving memory pages: iter 14   0%
>  14: sent 310, skipped 0, 
> 
>  14: sent 310, skipped 0, delta 113ms, dom0 25%, target 7%, sent 89Mb/s, 
> dirtied 82Mb/s 283 pages
> [1124922407.468703] Saving memory pages: iter 15   0%
> Saving memory pages: iter 15   0%
>  15: sent 277, skipped 0, 
> 
>  15: sent 277, skipped 0, delta 99ms, dom0 26%, target 7%, sent 91Mb/s, 
> dirtied 94Mb/s 287 pages
> [1124922407.568408] Saving memory pages: iter 16   0%
> Saving memory pages: iter 16   0%
>  16: sent 281, skipped 0, 
> 
>  16: sent 281, skipped 0, delta 102ms, dom0 26%, target 8%, sent 90Mb/s, 
> dirtied 107Mb/s 334 pages
> [1124922407.671120] Saving memory pages: iter 17   0%
> Saving memory pages: iter 17   0%
>  17: sent 243, skipped 86, 
> 
>  17: sent 243, skipped 86, delta 93ms, dom0 25%, target 13%, sent 85Mb/s, 
> dirtied 135Mb/s 385 pages
> [1124922407.764443] Saving memory pages: iter 18   0%
> Saving memory pages: iter 18   0%
>  18: sent 378, skipped 0, 
> 
>  18: sent 378, skipped 0, delta 144ms, dom0 24%, target 11%, sent 86Mb/s, 
> dirtied 82Mb/s 363 pages
> [1124922407.908636] Saving memory pages: iter 19   0%
> Saving memory pages: iter 19   0%
>  19: sent 355, skipped 0, 
> 
>  19: sent 355, skipped 0, delta 130ms, dom0 26%, target 6%, sent 89Mb/s, 
> dirtied 71Mb/s 283 pages
> [1124922408.038797] Saving memory pages: iter 20   0%
> Saving memory pages: iter 20   0%
>  20: sent 185, skipped 92, 
> 
>  20: sent 185, skipped 92, delta 106ms, dom0 17%, target 81%, sent 57Mb/s, 
> dirtied 218Mb/s 707 pages
> [1124922408.145378] Saving memory pages: iter 21   0%
> Saving memory pages: iter 21   0%
>  21: sent 619, skipped 83, 
> 
>  21: sent 619, skipped 83, delta 229ms, dom0 23%, target 13%, sent 88Mb/s, 
> dirtied 94Mb/s 657 pages
> [1124922408.375295] Saving memory pages: iter 22   0%
> Saving memory pages: iter 22   0%
>  22: sent 651, skipped 0, 
> 
>  22: sent 651, skipped 0, delta 231ms, dom0 25%, target 2%, sent 92Mb/s, 
> dirtied 18Mb/s 130 pages
> [1124922408.606305] Saving memory pages: iter 23   0%
> Saving memory pages: iter 23   0%
>  23: sent 124, skipped 0, 
> 
>  23: sent 124, skipped 0, delta 45ms, dom0 31%, target 6%, sent 90Mb/s, 
> dirtied 120Mb/s 166 pages
> [1124922408.651758] Saving memory pages: iter 24   0%
> Saving memory pages: iter 24   0%
>  24: sent 159, skipped 0, 
> 
>  24: sent 159, skipped 0, delta 57ms, dom0 28%, target 8%, sent 91Mb/s, 
> dirtied 143Mb/s 249 pages
> [1124922408.709624] Saving memory pages: iter 25   0%
> Saving memory pages: iter 25   0%
>  25: sent 243, skipped 0, 
> 
>  25: sent 243, skipped 0, delta 102ms, dom0 21%, target 79%, sent 78Mb/s, 
> dirtied 57Mb/s 178 pages
> [1124922408.812289] Saving memory pages: iter 26   0%
> Saving memory pages: iter 26   0%
>  26: sent 168, skipped 4, 
> 
>  26: sent 168, skipped 4, delta 70ms, dom0 21%, target 78%, sent 78Mb/s, 
> dirtied 20Mb/s 43 pages
> [1124922408.883198] Saving memory pages: iter 27   0%
> Saving memory pages: iter 27   0%
>  27: sent 39, skipped 4, 
> 
>  27: sent 39, skipped 4, [DEBUG] Conn_sxpr>
> (xfr.err 22)[DEBUG] Conn_sxpr< err=0
> Retry suspend domain (0)
> ...repeated many times...
> Retry suspend domain (0)
> Retry suspend domain (0)
> Unable to suspend domain. (0)
> Unable to suspend domain. (0)
> Domain appears not to have suspended: 0
> Domain appears not to have suspended: 0
> 6638 [WRN] XFRD> Transfer errors:
> 6638 [WRN] XFRD> state=XFR_STATE    err=1
> 6638 [INF] XFRD> Xfr service err=1
> 
> -- Ray
> 
>        -----Original Message-----
>       From:   Cole, Ray  
>       Sent:   Wednesday, August 24, 2005 11:32 AM
>       To:     'xen-users@xxxxxxxxxxxxxxxxxxx'
>       Subject:        Re: [Xen-users] Live migration problem
> 
>       I'm having the exact same issue here with 2.0.7.  I'm using RedHat AS 4 
> on one machine, Fedora Core 4 on the other using xen 2.0.7 built from source. 
>  The guest OS is RH AS 4.  I'm also able to migrate a number of times in a 
> row without a problem, but periodically get the exact same errors that Steven 
> Yelton posted.  If my guest OS is more or less idle, it seems to be less 
> likely to happen.  But if I start a big compile job and then try to 
> migrate, it is more likely to fail.
> 
>       -- Ray

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users