Re: [Xen-devel] [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc

To: Wen Congyang <wency@xxxxxxxxxxxxxx>
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Mon, 25 Jan 2016 14:41:47 -0500
Cc: Lars Kurth <lars.kurth@xxxxxxxxxx>, Changlong Xie <xiecl.fnst@xxxxxxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Jiang Yunhong <yunhong.jiang@xxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, xen devel <xen-devel@xxxxxxxxxxxxx>, Dong Eddie <eddie.dong@xxxxxxxxx>, Gui Jianfeng <guijianfeng@xxxxxxxxxxxxxx>, Shriram Rajagopalan <rshriram@xxxxxxxxx>, Yang Hongyang <hongyang.yang@xxxxxxxxxxxx>
Delivery-date: Mon, 25 Jan 2016 19:42:40 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Wed, Dec 30, 2015 at 10:29:02AM +0800, Wen Congyang wrote:
> In COLO mode, both VMs are running, and are considered in sync if the
> visible network traffic is identical.  After some time, they fall out of
> sync.
> 
> At this point, the two VMs have definitely diverged.  Lets call the
> primary dirty bitmap set A, while the secondary dirty bitmap set B.
> 
> Sets A and B are different.
> 
> Under normal migration, the page data for set A will be sent form the

s/form/from/

> primary to the secondary.
> 
> However, the set difference B - A (lets call this C) is out-of-date on
> the secondary (with respect to the primary) and will not be sent by the
> primary, as it was not memory dirtied by the primary.  The secondary

s/primary/primary (to secondary)/

> needs the page data for C to reconstruct an exact copy of the primary at

s/the page data/C page data/

> the checkpoint.
> 
> The secondary cannot calculate C as it doesn't know A.  Instead, the
> secondary must send B to the primary, at which point the primary
> calculates the union of A and B (lets call this D) which is all the
> pages dirtied by both the primary and the secondary, and sends all page
> data covered by D.

You could invert this - the primary could send A to the secondary? I presume
this is non-optimal as the 'A' set is much, much bigger than the 'C' set?

It may be good to include this in the commit description.
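
For what it is worth, the set relationships described above can be
sketched in a few lines of C. This is only an illustration, assuming each
dirty bitmap is an array of unsigned long with one bit per guest page;
bitmap_union() and bitmap_diff() are hypothetical names and are not taken
from libxc or from this series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not libxc code. Each bitmap is assumed to be an
 * array of unsigned long holding one bit per guest page. */

/* D = A | B: every page dirtied by the primary or the secondary. */
static void bitmap_union(unsigned long *d, const unsigned long *a,
                         const unsigned long *b, size_t nr_words)
{
    assert(d && a && b);
    for (size_t i = 0; i < nr_words; i++)
        d[i] = a[i] | b[i];
}

/* C = B - A: pages dirtied by the secondary only. */
static void bitmap_diff(unsigned long *c, const unsigned long *a,
                        const unsigned long *b, size_t nr_words)
{
    assert(c && a && b);
    for (size_t i = 0; i < nr_words; i++)
        c[i] = b[i] & ~a[i];
}
```

Note that D is the same as A | C, so sending the page data for D covers
both what a normal migration would send (A) and the pages that are stale
only on the secondary (C).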

> 
> In the general case, D is a superset of both A and B.  Without the
> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
> copy of the primary.
> 
> We transfer the dirty bitmap on libxc side, so we need to introduce back
> channel to libxc.

> 
> Note: it is different from the paper. We change the original design to
> the current one, according to our following concerns:
> 1. The original design needs extra memory on Secondary host. When there's
>    multiple backups on one host, the memory cost is high.
> 2. The memory cache code will be another 1k+, it will make the review
>    more time consuming.

Well, that 2) is a very good reason :-)
> 
> Signed-off-by: Yang Hongyang <hongyang.yang@xxxxxxxxxxxx>
> commit message:

? Huh?

> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
> CC: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
> CC: Wei Liu <wei.liu2@xxxxxxxxxx>

.. snip..
> index 05159bb..d4dc501 100644
> --- a/tools/libxc/xc_sr_restore.c
> +++ b/tools/libxc/xc_sr_restore.c
> @@ -722,7 +722,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>                        unsigned long *console_gfn, domid_t console_domid,
>                        unsigned int hvm, unsigned int pae, int superpages,
>                        int checkpointed_stream,
> -                      struct restore_callbacks *callbacks)
> +                      struct restore_callbacks *callbacks, int back_fd)
>  {
>      struct xc_sr_context ctx =
>          {
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 8ffd71d..a49d083 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -824,7 +824,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>  int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
>                     uint32_t max_iters, uint32_t max_factor, uint32_t flags,
>                     struct save_callbacks* callbacks, int hvm,
> -                   int checkpointed_stream)
> +                   int checkpointed_stream, int back_fd)
>  {
>      struct xc_sr_context ctx =
>          {


But where is the code?

Or is that supposed to be done in another patch? If so, you may want to
mention that in the commit description.
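
To make the question concrete, here is roughly the kind of thing I would
expect to see using back_fd somewhere. It is purely an illustration and not
taken from this series: send_exact() and recv_exact() are hypothetical
helpers, and the assumption is simply that the secondary writes its bitmap
to a file descriptor and the primary reads the same number of bytes back.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes to fd, retrying on short
 * writes and EINTR. Returns 0 on success, -1 on error. */
static int send_exact(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL);
    while (len) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: read exactly len bytes from fd. Returns 0 on
 * success, -1 on error or if the peer closes the channel early. */
static int recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;

    assert(buf != NULL);
    while (len) {
        ssize_t n = read(fd, p, len);

        if (n == 0)
            return -1; /* unexpected EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

In this sketch the secondary would call send_exact(back_fd, B, size) and
the primary would call recv_exact(back_fd, B, size) before computing the
union with A.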

