
Re: [Xen-devel] [Patch v2] tools/migrate: Fix regression when migrating from older version of Xen



On Thu, Jul 18, 2013 at 12:27 PM, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
On 18/07/13 17:20, Shriram Rajagopalan wrote:
On Tue, Jul 16, 2013 at 6:52 AM, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
  "libxc: provide notification of final checkpoint to restore end"
broke migration from any version of Xen using tools from before that commit.

Older tools have no idea about XC_SAVE_ID_LAST_CHECKPOINT, causing the newer
tools' xc_domain_restore() to start reading the qemu save record, as
ctx->last_checkpoint is 0.

The failure looks like:
  xc: error: Max batch size exceeded (1970103633). Giving up.
where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
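
For reference, that constant is just the first four bytes of the QEMU save
record signature reinterpreted as an integer. A standalone snippet (assuming a
little-endian host, as on x86) reproduces it:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char sig[4] = { 'Q', 'e', 'm', 'u' };
        uint32_t batch;

        /* memcpy() rather than a pointer cast avoids strict-aliasing
         * problems; the value matches the bogus "batch size" above. */
        memcpy(&batch, sig, sizeof(batch));
        printf("%" PRIu32 " = 0x%08" PRIx32 "\n", batch, batch);
        /* prints: 1970103633 = 0x756d6551 */
        return 0;
    }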

Sadly, the simple fix of just setting ctx->last_checkpoint = 1 would cause a
functional regression in the opposite direction for anyone relying on the
current behaviour of save_callbacks->checkpoint().

The only safe fix is to rely on the toolstack to provide this information.

Passing 0 results in unchanged behaviour, while passing nonzero means "the
other end of the migration stream does not know about
XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate".

Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

---
Changes since v1:
 * Rename "last_checkpoint_unaware" to "checkpointed_stream" to be more
   generic yet still accurate as to the meaning and fix for the issue
---
 tools/libxc/xc_domain_restore.c |    3 ++-
 tools/libxc/xc_nomigrate.c      |    2 +-
 tools/libxc/xenguest.h          |    3 ++-
 tools/libxl/libxl_save_helper.c |    2 +-
 tools/xcutils/xc_restore.c      |    2 +-
 5 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
index 63d36cd..e50b185 100644
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -1401,7 +1401,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       domid_t store_domid, unsigned int console_evtchn,
                       unsigned long *console_mfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
-                      int no_incr_generationid,
+                      int no_incr_generationid, int checkpointed_stream,
                       unsigned long *vm_generationid_addr,
                       struct restore_callbacks *callbacks)
 {
@@ -1472,6 +1472,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,

     ctx->superpages = superpages;
     ctx->hvm = hvm;
+    ctx->last_checkpoint = !!checkpointed_stream;


Shouldn't it be a single unary NOT?
i.e., ctx->last_checkpoint = !checkpointed_stream;

It should, which reminds me why I had it named the way it was before.  I didn't think enough before running sed to change the name as per Ian's suggestion.


So the !! is a bug, then ..
ctx->last_checkpoint = !checkpointed_stream;
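
To spell out the difference (a standalone illustration, not the libxc code):

    #include <stdio.h>

    int main(void)
    {
        /* 0 is what the patch has xc_restore pass for a plain migrate. */
        int checkpointed_stream = 0;

        /* As committed: '!!' only normalises the value to 0/1, so a
         * caller passing 0 still gets last_checkpoint == 0, and the
         * restore keeps waiting for further checkpoints. */
        printf("!!checkpointed_stream = %d\n", !!checkpointed_stream); /* 0 */

        /* With a single '!', the polarity is inverted: a caller passing
         * 0 gets last_checkpoint == 1, so the first completed pass is
         * treated as the final one. */
        printf("!checkpointed_stream  = %d\n", !checkpointed_stream);  /* 1 */
        return 0;
    }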

I know xend is deprecated, but xend+Remus is currently the only working version of Remus,
and this patch, especially setting the value to 0 in xc_restore.c, breaks xend-Remus.
We could at least mandate that migration/Remus in xend will only work between the same version of the toolstack.

With the current naming, you have to set checkpointed_stream=1 to get the current behaviour, which means "possibly checkpointed, or possibly not, but new enough that a non-checkpointed stream sends a LAST_CHECKPOINT anyway".



Otherwise, I am a bit confused. Basically, there is a memset(ctx, 0) right before this,
and you seem to be passing checkpointed_stream = 0 via both xc_restore and
libxl, which basically sets it back to ctx->last_checkpoint = 0.

Also, after getting pages from the last iteration, the code goes like this:
    if (ctx->last_checkpoint)
        goto finish;
    get pagebuf, or goto finish on error
    get tailbuf, or goto finish on error
    goto loadpages;

 finish:
    ...

and your setting ctx->last_checkpoint = 0 basically means that you are
banking on the far end (with an older version of the tools) closing the socket, causing
"get pagebuf" to fail and subsequently land on finish.
IIRC, this was the behaviour before XC_SAVE_ID_LAST_CHECKPOINT was introduced
by Ian Campbell to get rid of this benign error message.
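
For illustration, the termination the old tools relied on looked roughly like
this (a sketch only; read_next_pagebuf() and apply_batch() are hypothetical
stand-ins for the libxc internals):

    struct pagebuf;                                       /* opaque here */
    int read_next_pagebuf(int fd, struct pagebuf **buf);  /* hypothetical */
    void apply_batch(struct pagebuf *buf);                /* hypothetical */

    static int restore_loop(int io_fd)
    {
        struct pagebuf *buf;

        for ( ;; )
        {
            /* With no LAST_CHECKPOINT marker, the only way out is a
             * failed read once the sender has closed the socket. */
            if ( read_next_pagebuf(io_fd, &buf) <= 0 )
                goto finish;
            apply_batch(buf);
        }

     finish:
        /* apply the tailbuf and complete the restore */
        return 0;
    }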

That might have been 'fine' for PV guests, but it causes active breakage for HVM domains, where the qemu save record immediately follows in the migration stream.


Just to clarify: the code flows like this, IIRC.

loadpages:
    while (1)
        if !completed
            get pagebufs
            if 0 pages sent, break
        endif
        apply batch (pagebufs)
    endwhile

    if !completed
        get tailbuf            [this is where the QEMU record would be obtained]
        completed = 1
    endif

    if last_checkpoint
        goto finish
    endif

    get pagebuf, or goto finish on error   ---> this is where the old code used to exit
    get tailbuf
    goto loadpages

finish:
    apply tailbuf              [tailbuf obtained inside the 'if !completed' block]
    do the rest of the restore

So are you sure that the max batch size error is a result of that XC_SAVE_ID_LAST_CHECKPOINT?
After all, it's an "option" that's received from the remote end, not a mandatory parameter.
If we revert to the old style of control flow, like the patch below, do you still get the error?


diff -r a90bf2537141 tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c   Mon Jun 10 21:06:47 2013 -0400
+++ b/tools/libxc/xc_domain_restore.c   Thu Jul 18 14:44:04 2013 -0400
@@ -1628,7 +1628,6 @@
          * If more checkpoints are expected then shift into
          * nonblocking mode for the remainder.
          */
-        if ( !ctx->last_checkpoint )
             fcntl(io_fd, F_SETFL, orig_io_fd_flags | O_NONBLOCK);
 
         /*


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

