[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 COLO 06/15] libxc/save: support COLO save





On 06/09/2015 03:20 PM, Andrew Cooper wrote:
On 09/06/2015 04:15, Yang Hongyang wrote:


On 06/08/2015 09:04 PM, Andrew Cooper wrote:
On 08/06/15 04:45, Yang Hongyang wrote:
call callbacks->get_dirty_pfn() after suspend primary vm to
get dirty pages on secondary vm, and send pages both dirty on
primary/secondary to secondary.

Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
Signed-off-by: Wen Congyang <wency@xxxxxxxxxxxxxx>
CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
---
   tools/libxc/xc_sr_save.c | 49
+++++++++++++++++++++++++++++++++++++++++++++++-
   1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index d63b783..cda61ed 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -515,6 +515,31 @@ static int send_memory_live(struct
xc_sr_context *ctx)
       return rc;
   }

+static int update_dirty_bitmap(uint8_t *(*get_dirty_pfn)(void *),
void *data,
+                               unsigned long p2m_size, unsigned
long *bitmap)

This function should take a ctx rather than having the caller expand 3
parameters.  Also, "update_dirty_bitmap" is a little misleading, as it
isn't querying the hypervisor for the dirty bitmap.

ok.

(Merging the other thread)

how about merge_secondary_dirty_bitmap()?

Much better!



+{
+    uint64_t *pfn_list;
+    uint64_t count, i;
+    uint64_t pfn;
+
+    pfn_list = (uint64_t *)get_dirty_pfn(data);

This looks like a recipe for width-errors.  The get_dirty_pfn() call
should take a pointer to a struct for it to fill.

but the size is unknown for the caller.pfn_list[0] is the count of
pfn.


+    assert(pfn_list);

This should turn into an error rather than an abort().

Even if there are no dirty pages on secondary, pfn_list shouldn't be
NULL, it's just that pfn_list[0] will be 0. if pfn_list is NULL,
there might be unexpected error happened.

get_dirty_pfn() should be declared alongside a

struct pfn_data
{
     uint64_t count;
     uint64_t *pfns;
};

and this function here should create one of these on the stack and pass
it by pointer to get_dirty_pfn().  I might also be tempted to rename
this to get_remote_logdirty() or similar, to indicate that it is a
source of logdirty data from something other than the current hypervisor.

This is a callback, I can't find a way to pass pointer from libxc to libxl,
libxl can not access the pointer data...The struct can be used for represent
the data however.

I like with the rename part, sounds much better.




+
+    count = pfn_list[0];
+    for (i = 0; i < count; i++) {

style

+        pfn = pfn_list[i + 1];
+        if (pfn > p2m_size) {
+            errno = EINVAL;
+            return -1;
+        }
+
+        set_bit(pfn, bitmap);
+    }
+
+    free(pfn_list);
+    return 0;
+}
+
   /*
    * Suspend the domain and send dirty memory.
    * This is the last iteration of the live migration and the
@@ -555,6 +580,19 @@ static int suspend_and_send_dirty(struct
xc_sr_context *ctx)

       bitmap_or(dirty_bitmap, ctx->save.deferred_pages,
ctx->save.p2m_size);

+    if ( !ctx->save.live && ctx->save.callbacks->get_dirty_pfn )
+    {

Shouldn't get_dirty_pfn be mandatory for COLO streams (even if it is a
noop to start with) ?

It should be mandatory, it shouldn't be noop under COLO. perhaps we
should
add sanity check at the beginning. But problem is save side do not
have a param
passed from libxl to indicate the stream type(like checkpointed_stream in
restore side). So we may need to add another XCFLAGS? Currently there is
XCFLAGS_CHECKPOINTED which represents Remus, we might need to change
this to
XCFLAGS_STREAM_REMUS
XCFLAGS_STREAM_COLO
so that we can know what kind of stream we are handling?

checkpointed_stream started out as a bugfix for a legacy stream
migration breakage.  Really, this information should have been passed
right from the start.

Did I miss the bugfix? is it not in upstream?


It would probably be best to take the enum{} suggested elsewhere and
make it a top level ctx item, and have it present for both save and
restore, with sutable parameters passed in from the top.  (When I am
finally able to take out the legacy code, there is going to be a severe
pruning/consolidation of the parameters.)

This is what I thought when I saw the enum{} suggested.


~Andrew



~Andrew

+        rc = update_dirty_bitmap(ctx->save.callbacks->get_dirty_pfn,
+                                 ctx->save.callbacks->data,
+                                 ctx->save.p2m_size,
+                                 dirty_bitmap);
+        if ( rc )
+        {
+            PERROR("Failed to get secondary vm's dirty pages");
+            goto out;
+        }
+    }
+
       rc = send_dirty_pages(ctx, stats.dirty_count +
ctx->save.nr_deferred_pages);
       if ( rc )
           goto out;
@@ -784,7 +822,16 @@ static int save(struct xc_sr_context *ctx,
uint16_t guest_type)
               if ( rc )
                   goto err;

-            ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+            rc =
ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+            if ( !rc ) {
+                if ( !errno )
+                {
+                    /* Postcopy request failed (without errno,
using EINVAL) */
+                    errno = EINVAL;
+                }
+                rc = -1;
+                goto err;
+            }

               rc =
ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
               if ( rc <= 0 )

.



.


--
Thanks,
Yang.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.