Re: LAST_CHECKPOINT and save/restore/migrate compatibility (was Re: [Xen-devel] [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end)
On Fri, 2011-04-01 at 12:02 +0100, George Dunlap wrote:
> On Mon, Sep 6, 2010 at 11:03 AM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
> > # HG changeset patch
> > # User Ian Campbell <ian.campbell@xxxxxxxxxx>
> > # Date 1283766891 -3600
> > # Node ID bdf8ce09160d715451e1204babe5f80886ea6183
> > # Parent  5f96f36feebdb87eaadbbcab0399f32eda86f735
> > libxc: provide notification of final checkpoint to restore end
> >
> > When the restore code sees this notification it will restore the
> > currently in-progress checkpoint when it completes.
> >
> > This allows the restore end to finish up without waiting for a
> > spurious timeout on the receive fd and thereby avoids unnecessary
> > error logging in the case of a successful migration or restore.
> >
> > In the normal migration or restore case the first checkpoint is always
> > the last. For a rolling checkpoint (such as Remus) the notification is
> > currently unused but could be used in the future, for example to
> > provide a controlled failover for reasons other than error.
>
> Unfortunately, this introduces a backwards-compatibility problem,
> since older versions of the tools never send LAST_CHECKPOINT.

Out of interest, what actually happens? (I think you mentioned something
about failing when restoring from a file?)

My intention at the time was that in the absence of a LAST_CHECKPOINT
things would continue to behave as before, e.g. you would see the
spurious timeout on the receive fd but still restore correctly. Perhaps
the associated changes to the O_NONBLOCK'ness of the fd at various
stages are what broke this for non-Remus migrations?

> Would it make sense to invert the logic on this -- i.e., default to
> only one checkpoint, and send "MORE_CHECKPOINTS" if it expects this
> *not* to be the last (either by expecting the checkpoint() function to
> do it, or by sending it from xc_domain_save)?

I'd quite like to understand what I broke and investigate the
trivialness (or otherwise) of a fix, but this change would probably
make sense independently too.
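To make that inversion concrete, on the save side it would be something
along these lines (untested sketch only; XC_SAVE_ID_MORE_CHECKPOINTS is
a placeholder name and value, not an existing chunk ID):

    /* Placeholder ID for the sketch below; not an existing chunk type. */
    #define XC_SAVE_ID_MORE_CHECKPOINTS  -10

    /* In xc_domain_save(): only a checkpointing (e.g. Remus) save
     * announces that further checkpoints will follow; everyone else
     * writes nothing extra and the restore side defaults to treating
     * the first complete image as the last. */
    if ( callbacks->checkpoint )
    {
        i = XC_SAVE_ID_MORE_CHECKPOINTS;
        if ( wrexact(io_fd, &i, sizeof(int)) )
        {
            PERROR("Error when writing more-checkpoints chunk");
            goto out;
        }
    }

That way an old sender, which never writes any such chunk, automatically
gets the single-checkpoint behaviour on a newer receiver.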
> Now that 4.1 is out, we have to consider version compatibilities. But
> we should be able to make it so that there would only be a problem if:
> 1) You're running Remus

Is it the case that Remus only cares about checkpointing between like
versions of the toolstack?

> 2) One of your toolstacks is 4.1.0, and one is not.
>
> That seems like it shouldn't be too bad.
>
> Thoughts?
>  -George
>
> > Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> > Acked-by: Brendan Cully <brendan@xxxxxxxxx>
> >
> > diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xc_domain_restore.c
> > --- a/tools/libxc/xc_domain_restore.c   Mon Sep 06 10:54:51 2010 +0100
> > +++ b/tools/libxc/xc_domain_restore.c   Mon Sep 06 10:54:51 2010 +0100
> > @@ -42,6 +42,7 @@ struct restore_ctx {
> >      xen_pfn_t *p2m; /* A table mapping each PFN to its new MFN. */
> >      xen_pfn_t *p2m_batch; /* A table of P2M mappings in the current region. */
> >      int completed; /* Set when a consistent image is available */
> > +    int last_checkpoint; /* Set when we should commit to the current checkpoint when it completes. */
> >      struct domain_info_context dinfo;
> >  };
> >
> > @@ -765,6 +766,11 @@ static int pagebuf_get_one(xc_interface
> >          // DPRINTF("console pfn location: %llx\n", buf->console_pfn);
> >          return pagebuf_get_one(xch, ctx, buf, fd, dom);
> >
> > +    case XC_SAVE_ID_LAST_CHECKPOINT:
> > +        ctx->last_checkpoint = 1;
> > +        // DPRINTF("last checkpoint indication received");
> > +        return pagebuf_get_one(xch, ctx, buf, fd, dom);
> > +
> >      default:
> >          if ( (count > MAX_BATCH_SIZE) || (count < 0) ) {
> >              ERROR("Max batch size exceeded (%d). Giving up.", count);
> > @@ -1296,10 +1302,23 @@ int xc_domain_restore(xc_interface *xch,
> >              goto out;
> >          }
> >          ctx->completed = 1;
> > -        /* shift into nonblocking mode for the remainder */
> > -        if ( (flags = fcntl(io_fd, F_GETFL,0)) < 0 )
> > -            flags = 0;
> > -        fcntl(io_fd, F_SETFL, flags | O_NONBLOCK);
> > +
> > +        /*
> > +         * If more checkpoints are expected then shift into
> > +         * nonblocking mode for the remainder.
> > +         */
> > +        if ( !ctx->last_checkpoint )
> > +        {
> > +            if ( (flags = fcntl(io_fd, F_GETFL,0)) < 0 )
> > +                flags = 0;
> > +            fcntl(io_fd, F_SETFL, flags | O_NONBLOCK);
> > +        }
> > +    }
> > +
> > +    if ( ctx->last_checkpoint )
> > +    {
> > +        // DPRINTF("Last checkpoint, finishing\n");
> > +        goto finish;
> >      }
> >
> >      // DPRINTF("Buffered checkpoint\n");
> > diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xc_domain_save.c
> > --- a/tools/libxc/xc_domain_save.c      Mon Sep 06 10:54:51 2010 +0100
> > +++ b/tools/libxc/xc_domain_save.c      Mon Sep 06 10:54:51 2010 +0100
> > @@ -1616,6 +1616,20 @@ int xc_domain_save(xc_interface *xch, in
> >          }
> >      }
> >
> > +    if ( !callbacks->checkpoint )
> > +    {
> > +        /*
> > +         * If this is not a checkpointed save then this must be the first and
> > +         * last checkpoint.
> > +         */
> > +        i = XC_SAVE_ID_LAST_CHECKPOINT;
> > +        if ( wrexact(io_fd, &i, sizeof(int)) )
> > +        {
> > +            PERROR("Error when writing last checkpoint chunk");
> > +            goto out;
> > +        }
> > +    }
> > +
> >      /* Zero terminate */
> >      i = 0;
> >      if ( wrexact(io_fd, &i, sizeof(int)) )
> > diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xg_save_restore.h
> > --- a/tools/libxc/xg_save_restore.h     Mon Sep 06 10:54:51 2010 +0100
> > +++ b/tools/libxc/xg_save_restore.h     Mon Sep 06 10:54:51 2010 +0100
> > @@ -131,6 +131,7 @@
> >  #define XC_SAVE_ID_TMEM_EXTRA         -6
> >  #define XC_SAVE_ID_TSC_INFO           -7
> >  #define XC_SAVE_ID_HVM_CONSOLE_PFN    -8 /* (HVM-only) */
> > +#define XC_SAVE_ID_LAST_CHECKPOINT    -9 /* Commit to restoring after completion of current iteration. */
> >
> >  /*
> >  ** We process save/restore/migrate in batches of pages; the below
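For reference, the backward-compatible behaviour I had in mind on the
restore side was roughly the following (sketch only, not against any
particular tree; "read_next_chunk_failed" is a placeholder for whatever
the real read/pagebuf test turns out to be):

    /* Once at least one consistent image has been received
     * (ctx->completed), a failed or empty read on the now non-blocking
     * fd from a sender which never wrote XC_SAVE_ID_LAST_CHECKPOINT
     * should mean "no more checkpoints", not a fatal error. */
    if ( !ctx->last_checkpoint && ctx->completed )
    {
        if ( read_next_chunk_failed )  /* placeholder for the real test */
            goto finish;               /* commit to the image we have */
    }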