
Re: [Xen-devel] [PATCH 3/3] libxc/migrationv2: Split {start, end}_of_stream() to make checkpoint variants





On 05/11/2015 05:02 PM, Andrew Cooper wrote:
On 11/05/15 03:31, Hongyang Yang wrote:
On 05/08/2015 09:30 PM, Ian Campbell wrote:
On Fri, 2015-05-08 at 13:54 +0100, Andrew Cooper wrote:
This is in preparation for supporting checkpointed streams in
migration v2.
   - For PV guests, the VCPU context is moved to end_of_checkpoint().
   - For HVM guests, the HVM context and params are moved to
     end_of_checkpoint().

[...]
+    /**
+     * Send records which need to be at the end of the checkpoint.  This is
+     * called once, or once per checkpoint in a checkpointed stream, and is
+     * after the memory data.
+     */
+    int (*end_of_checkpoint)(struct xc_sr_context *ctx);
+
+    /**
+     * Send records which need to be at the end of the stream.  This is called
+     * once, before the END record is written.
      */
     int (*end_of_stream)(struct xc_sr_context *ctx);
[...]
+static int x86_hvm_end_of_stream(struct xc_sr_context *ctx)
+{
+    int rc;
+
+    rc = write_tsc_info(ctx);
     if ( rc )
         return rc;

-    /* Write HVM_PARAMS record contains applicable HVM params. */
-    rc = write_hvm_params(ctx);
+#ifdef XG_LIBXL_HVM_COMPAT
+    rc = write_toolstack(ctx);

I'm not sure about this end_of_stream thing. In a checkpointing for
fault tolerance scenario (Remus or COLO), failover happens when the
sender has died for some reason, so it won't get the chance to send
any end-of-stream records.

IOW I think everything in end_of_stream actually needs to be in
end_of_checkpoint, unless it is just for informational purposes in a
regular migration or something (which write_toolstack surely isn't).

Yes, all records should be sent at every checkpoint, except those
that only need to be sent once.
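
To make the ordering concern concrete, here is a minimal sketch of the send side. It is not the libxc code: struct sketch_ctx, note(), sketch_save() and the primary_dies flag are all hypothetical names invented for illustration, and the callbacks only log the order in which they run.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct xc_sr_context: it only logs call order. */
struct sketch_ctx {
    char log[64];
};

/* Append a one-letter tag to the log, bounded by the buffer size. */
static void note(struct sketch_ctx *ctx, const char *tag)
{
    strncat(ctx->log, tag, sizeof(ctx->log) - strlen(ctx->log) - 1);
}

/* Illustrative callbacks: M = memory data, C = end_of_checkpoint, S = end_of_stream. */
static int send_memory(struct sketch_ctx *ctx)       { note(ctx, "M"); return 0; }
static int end_of_checkpoint(struct sketch_ctx *ctx) { note(ctx, "C"); return 0; }
static int end_of_stream(struct sketch_ctx *ctx)     { note(ctx, "S"); return 0; }

/*
 * Send nr_checkpoints checkpoints.  If primary_dies is set, the sender
 * disappears after the last complete checkpoint and never reaches
 * end_of_stream(), which is the failover case.
 */
static int sketch_save(struct sketch_ctx *ctx, unsigned int nr_checkpoints,
                       int primary_dies)
{
    unsigned int i;
    int rc;

    for ( i = 0; i < nr_checkpoints; i++ )
    {
        rc = send_memory(ctx);
        if ( rc )
            return rc;

        /* Everything the secondary needs to resume must be sent here. */
        rc = end_of_checkpoint(ctx);
        if ( rc )
            return rc;
    }

    if ( primary_dies )
        return -1;

    return end_of_stream(ctx);
}
```

In a clean run of two checkpoints the log reads "MCMCS". When the primary dies the trailing "S" never appears, so a record the secondary depends on cannot live only in end_of_stream().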

checkpoint:
As the patches show, a Remus migration explicitly has two stages: the
first is a live migration, the second is the checkpointed stream. After
the live migration, primary and secondary are in the same state. The
primary then continues to run until the next checkpoint. At each
checkpoint we sync the secondary's state with the primary's, so that
both sides are in the same state again. So any record that could change
while the guest is running should be sent at every checkpoint.

failover:
Handling of the checkpointed stream on the restore side also has two
stages: first buffer the records, then process them. This is because if
the master died while sending records, the secondary's state would be
inconsistent. So we have to make sure all records of a checkpoint have
been received before processing them. If the master dies, the secondary
can recover from the last checkpoint state.
Currently Remus failover relies on the migration channel. If the channel
breaks, we presume the master is dead and fail over. The "goto err_buf"
is the failover path: we discard the current checkpoint's records
because they are incomplete, then resume the guest from the last
checkpoint state (the last processed records).
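
The buffer-then-process scheme on the restore side can be sketched as follows. This is an illustration under stated assumptions, not the libxc restore code: struct sketch_rec, struct sketch_restore, sketch_restore_stream() and the is_checkpoint_end flag are hypothetical, and guest state is reduced to a single integer.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BUFFERED 8

/* Hypothetical record: either a checkpoint-end marker or a state update. */
struct sketch_rec {
    int is_checkpoint_end;
    int value;
};

struct sketch_restore {
    struct sketch_rec buf[MAX_BUFFERED];
    size_t nr_buffered;
    int guest_state;    /* state as of the last fully processed checkpoint */
};

/*
 * Consume the nr records that arrived before the channel broke and return
 * the state the guest would be resumed from.
 */
static int sketch_restore_stream(struct sketch_restore *r,
                                 const struct sketch_rec *stream, size_t nr)
{
    size_t i, j;

    for ( i = 0; i < nr; i++ )
    {
        if ( !stream[i].is_checkpoint_end )
        {
            /* Stage 1: buffer only, guest state is left untouched. */
            assert(r->nr_buffered < MAX_BUFFERED);
            r->buf[r->nr_buffered++] = stream[i];
            continue;
        }

        /* Stage 2: the checkpoint is complete, so apply everything buffered. */
        for ( j = 0; j < r->nr_buffered; j++ )
            r->guest_state = r->buf[j].value;
        r->nr_buffered = 0;
    }

    /* Channel broke: discard the partial checkpoint (the failover path). */
    r->nr_buffered = 0;

    return r->guest_state;
}
```

Records that arrive after the last checkpoint-end marker are dropped rather than applied, so the guest resumes from a consistent state even if the sender died mid-checkpoint.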

Thank you for the clarification.

It occurs to me that, despite things like 'last_iter', it is actually
the first iteration which is special in Remus.

Yes, the 'last_iter' step actually suspends the guest and sends the
dirty memory pages to the secondary.


Is there a case where the primary decides to explicitly hand over to the
secondary?

Currently there isn't. The secondary only starts on failover.


~Andrew


--
Thanks,
Yang.
