[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore

To: Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
From: Ian Campbell <ian.campbell@xxxxxxxxxx>
Date: Thu, 14 May 2015 14:05:16 +0100
Cc: wei.liu2@xxxxxxxxxx, eddie.dong@xxxxxxxxx, wency@xxxxxxxxxxxxxx, andrew.cooper3@xxxxxxxxxx, yunhong.jiang@xxxxxxxxx, ian.jackson@xxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxx, guijianfeng@xxxxxxxxxxxxxx, rshriram@xxxxxxxxx
Delivery-date: Thu, 14 May 2015 13:06:05 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Thu, 2015-05-14 at 18:06 +0800, Yang Hongyang wrote:
> With Remus, the restore flow should be:
> the first full migration stream -> { periodically restore stream }
> 
> Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
> CC: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
> ---
>  tools/libxc/xc_sr_common.h  |  14 ++++++
>  tools/libxc/xc_sr_restore.c | 113 
> ++++++++++++++++++++++++++++++++++++++++----
>  2 files changed, 117 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
> index f8121e7..3bf27f1 100644
> --- a/tools/libxc/xc_sr_common.h
> +++ b/tools/libxc/xc_sr_common.h
> @@ -208,6 +208,20 @@ struct xc_sr_context
>              /* Plain VM, or checkpoints over time. */
>              bool checkpointed;
>  
> +            /* Currently buffering records between a checkpoint */
> +            bool buffer_all_records;
> +
> +/*
> + * With Remus, we buffer the records sent by the primary at checkpoint,
> + * in case the primary will fail, we can recover from the last
> + * checkpoint state.
> + * This should be enough because primary only send dirty pages at
> + * checkpoint.

I'm not sure how it then follows that 1024 buffers is guaranteed to be
enough, unless there is something on the sending side arranging it to be
so?

> + */
> +#define MAX_BUF_RECORDS 1024
> +            struct xc_sr_record *buffered_records;
> +            unsigned buffered_rec_num;
> +
>              /*
>               * Xenstore and Console parameters.
>               * INPUT:  evtchn & domid
> diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
> index 9ab5760..8468ffc 100644
> --- a/tools/libxc/xc_sr_restore.c
> +++ b/tools/libxc/xc_sr_restore.c
> @@ -468,11 +468,69 @@ static int handle_page_data(struct xc_sr_context *ctx, 
> struct xc_sr_record *rec)
>      return rc;
>  }
>  
> +static int process_record(struct xc_sr_context *ctx, struct xc_sr_record 
> *rec);
> +static int handle_checkpoint(struct xc_sr_context *ctx)
> +{
> +    xc_interface *xch = ctx->xch;
> +    int rc = 0;
> +    unsigned i;
> +
> +    if ( !ctx->restore.checkpointed )
> +    {
> +        ERROR("Found checkpoint in non-checkpointed stream");
> +        rc = -1;

Is it usual in migrv2 to set errno as well?

> +        goto err;
> +    }
> +
> +    if ( ctx->restore.buffer_all_records )
> +    {
> +        IPRINTF("All records buffered");
> +
> +        /*
> +         * We need to set buffer_all_records to false in
> +         * order to process records instead of buffer records.
> +         * buffer_all_records should be set back to true after
> +         * we successfully processed all records.
> +         */
> +        ctx->restore.buffer_all_records = false;

I'm not personally a fan of changing global state in order to simulate
the action of what should be a parameter to a function.

Preferable IMHO would be to have process_record gain a parameter to
override the ctx state but become an internal helper (perhaps with a
name change) and then have API function process_record and
process_buffered_records or some such which call it in the right way.

Andy may have a differing opinion though.




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

Follow-Ups:
- Re: [Xen-devel] [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore
  - From: Yang Hongyang
- Re: [Xen-devel] [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore
  - From: Andrew Cooper

References:
- [Xen-devel] [PATCH Remus v5 0/2] Remus support for Migration-v2
  - From: Yang Hongyang
- [Xen-devel] [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore
  - From: Yang Hongyang

Prev by Date: [Xen-devel] Kernel panic, reboot in 5 seconds
Next by Date: [Xen-devel] [linux-3.4 test] 55878: regressions - FAIL
Previous by thread: Re: [Xen-devel] [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore
Next by thread: Re: [Xen-devel] [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.