[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore
On Thu, 2015-05-14 at 14:17 +0100, Andrew Cooper wrote: > On 14/05/15 14:05, Ian Campbell wrote: > > On Thu, 2015-05-14 at 18:06 +0800, Yang Hongyang wrote: > >> With Remus, the restore flow should be: > >> the first full migration stream -> { periodically restore stream } > >> > >> Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx> > >> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > >> CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx> > >> CC: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> > >> CC: Wei Liu <wei.liu2@xxxxxxxxxx> > >> --- > >> tools/libxc/xc_sr_common.h | 14 ++++++ > >> tools/libxc/xc_sr_restore.c | 113 > >> ++++++++++++++++++++++++++++++++++++++++---- > >> 2 files changed, 117 insertions(+), 10 deletions(-) > >> > >> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h > >> index f8121e7..3bf27f1 100644 > >> --- a/tools/libxc/xc_sr_common.h > >> +++ b/tools/libxc/xc_sr_common.h > >> @@ -208,6 +208,20 @@ struct xc_sr_context > >> /* Plain VM, or checkpoints over time. */ > >> bool checkpointed; > >> > >> + /* Currently buffering records between a checkpoint */ > >> + bool buffer_all_records; > >> + > >> +/* > >> + * With Remus, we buffer the records sent by the primary at checkpoint, > >> + * in case the primary will fail, we can recover from the last > >> + * checkpoint state. > >> + * This should be enough because primary only send dirty pages at > >> + * checkpoint. > > I'm not sure how it then follows that 1024 buffers is guaranteed to be > > enough, unless there is something on the sending side arranging it to be > > so? > > > >> + */ > >> +#define MAX_BUF_RECORDS 1024 > >> + struct xc_sr_record *buffered_records; > >> + unsigned buffered_rec_num; > >> + > >> /* > >> * Xenstore and Console parameters. > >> * INPUT: evtchn & domid > >> diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c > >> index 9ab5760..8468ffc 100644 > >> --- a/tools/libxc/xc_sr_restore.c > >> +++ b/tools/libxc/xc_sr_restore.c > >> @@ -468,11 +468,69 @@ static int handle_page_data(struct xc_sr_context > >> *ctx, struct xc_sr_record *rec) > >> return rc; > >> } > >> > >> +static int process_record(struct xc_sr_context *ctx, struct xc_sr_record > >> *rec); > >> +static int handle_checkpoint(struct xc_sr_context *ctx) > >> +{ > >> + xc_interface *xch = ctx->xch; > >> + int rc = 0; > >> + unsigned i; > >> + > >> + if ( !ctx->restore.checkpointed ) > >> + { > >> + ERROR("Found checkpoint in non-checkpointed stream"); > >> + rc = -1; > > Is it usual in migrv2 to set errno as well? > > If a relevant errno is to be had. EINVAL or ENOSYS perhaps? > > There are a lot of cases which are waiting for some real libxc error > codes before they can propagate numeric error information, although in > all cases the log messages will be accurate (and hopefully helpful). > > ~Andrew > > > > >> + goto err; > >> + } > >> + > >> + if ( ctx->restore.buffer_all_records ) > >> + { > >> + IPRINTF("All records buffered"); > >> + > >> + /* > >> + * We need to set buffer_all_records to false in > >> + * order to process records instead of buffer records. > >> + * buffer_all_records should be set back to true after > >> + * we successfully processed all records. > >> + */ > >> + ctx->restore.buffer_all_records = false; > > I'm not personally a fan of changing global state in order to simulate > > the action of what should be a parameter to a function. > > > > Preferable IMHO would be to have process_record gain a parameter to > > override the ctx state but become an internal helper (perhaps with a > > name change) and then have API function process_record and > > process_buffered_records or some such which call it in the right way. > > > > Andy may have a differing opinion though. > > Hmm yes - it would be nice to split the buffering logic away from the > processing logic. > > However, the two are slightly related. > > Perhaps a process_or_buffer_record() helper, and removing all buffering > logic from process_record(). That seems like it would work. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |