Re: [Xen-devel] [PATCH 3/3] libxc/migrationv2: Split {start, end}_of_stream() to make checkpoint variants
On 11/05/15 03:31, Hongyang Yang wrote:
> On 05/08/2015 09:30 PM, Ian Campbell wrote:
>> On Fri, 2015-05-08 at 13:54 +0100, Andrew Cooper wrote:
>>> This is in preparation for supporting checkpointed streams in
>>> migration v2.
>>>
>>>  - For PV guests, the VCPU context is moved to end_of_checkpoint().
>>>  - For HVM guests, the HVM context and params are moved to
>>>    end_of_checkpoint().
>>
>> [...]
>>> +    /**
>>> +     * Send records which need to be at the end of the checkpoint.  This is
>>> +     * called once, or once per checkpoint in a checkpointed stream, and is
>>> +     * after the memory data.
>>> +     */
>>> +    int (*end_of_checkpoint)(struct xc_sr_context *ctx);
>>> +
>>> +    /**
>>> +     * Send records which need to be at the end of the stream.  This is called
>>> +     * once, before the END record is written.
>>>      */
>>>     int (*end_of_stream)(struct xc_sr_context *ctx);
>> [...]
>>> +static int x86_hvm_end_of_stream(struct xc_sr_context *ctx)
>>> +{
>>> +    int rc;
>>> +
>>> +    rc = write_tsc_info(ctx);
>>>      if ( rc )
>>>          return rc;
>>>
>>> -    /* Write HVM_PARAMS record contains applicable HVM params. */
>>> -    rc = write_hvm_params(ctx);
>>> +#ifdef XG_LIBXL_HVM_COMPAT
>>> +    rc = write_toolstack(ctx);
>>
>> I'm not sure about this end_of_stream thing.  In a checkpointing for
>> fault tolerance scenario (Remus or COLO), failover happens when the
>> sender has died for some reason, and therefore won't get the chance to
>> send any end-of-stream stuff.
>>
>> IOW I think everything in end_of_stream actually needs to be in
>> end_of_checkpoint, unless it is just for informational purposes in a
>> regular migration or something (which write_toolstack surely isn't).
>
> Yes, all records should be sent at every checkpoint, except those that
> only need to be sent once.
>
> checkpoint:
> As you can see from the patches, a Remus migration explicitly consists
> of two stages: the first stage is a live migration, and the second is
> the checkpointed stream.  The live migration part is straightforward;
> after it, the primary and the secondary are in the same state.  The
> primary then continues to run until the next checkpoint, at which point
> we sync the secondary's state with the primary's so that both sides
> match again.  Therefore any record that can change while the guest is
> running should be sent at every checkpoint.
>
> failover:
> Handling a checkpointed stream on the restore side also consists of two
> stages: first buffer the records, then process them.  This is because,
> if the master dies while sending records, the secondary's state would
> be inconsistent, so we have to make sure all records have been received
> before processing them.  If the master dies, the secondary can then
> recover from the last checkpoint state.  Currently Remus failover relies
> on the migration channel: if the channel breaks, we presume the master
> is dead and fail over.  The "goto err_buf" is the failover path; it
> discards the current checkpoint's records because they are incomplete,
> and resumes the guest with the last checkpoint state (the last set of
> records that was processed).

Thank you for the clarification.

It occurs to me that, despite things like 'last_iter', it is actually
the first iteration which is special in Remus.

Is there a case where the primary decides to explicitly hand over to the
secondary?

~Andrew
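
To make the callback split concrete, here is a minimal sketch of how a
checkpointed save loop might drive these hooks.  It is not the actual libxc
implementation: the helpers send_memory(), write_checkpoint_record(),
wait_for_next_checkpoint() and write_end_record() are hypothetical names, and
the ctx->save.ops path is assumed; only the start_of_stream /
end_of_checkpoint / end_of_stream callbacks come from the quoted patch.  The
point it illustrates is the one raised above: end_of_stream only runs on a
clean finish, so anything the secondary needs after a failover has to be
emitted from end_of_checkpoint.

/* Hedged sketch only; the helpers marked "hypothetical" do not exist in libxc. */
static int save_checkpointed_stream(struct xc_sr_context *ctx)
{
    int rc;

    /* Records valid for the whole stream: sent exactly once. */
    rc = ctx->save.ops.start_of_stream(ctx);
    if ( rc )
        return rc;

    for ( ;; )
    {
        /* Guest memory for this checkpoint (live on the first pass). */
        rc = send_memory(ctx);                   /* hypothetical */
        if ( rc )
            return rc;

        /* Anything which can change while the guest runs: vcpu context,
         * HVM context/params, toolstack record, etc. */
        rc = ctx->save.ops.end_of_checkpoint(ctx);
        if ( rc )
            return rc;

        rc = write_checkpoint_record(ctx);       /* hypothetical */
        if ( rc )
            return rc;

        /* Let the guest run for another epoch; stop if the caller asks
         * for a clean teardown. */
        if ( !wait_for_next_checkpoint(ctx) )    /* hypothetical */
            break;
    }

    /* Only reached on an orderly finish.  If the primary dies, nothing
     * from here onwards ever reaches the secondary, which is why
     * failover must not depend on these records. */
    rc = ctx->save.ops.end_of_stream(ctx);
    if ( rc )
        return rc;

    return write_end_record(ctx);                /* hypothetical */
}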
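
The restore side's buffer-then-process behaviour can be sketched in the same
hedged way.  The record_buffer type and the buffer_*() /
resume_from_last_checkpoint() helpers below are illustrative only; what is
taken from the discussion above is the shape of the logic: accumulate a whole
checkpoint, apply it only once it has arrived intact, and on a broken channel
discard the partial buffer and fail over to the last applied checkpoint.

/* Hedged sketch only; the buffer type and helpers are hypothetical. */
static int restore_checkpointed_stream(struct xc_sr_context *ctx)
{
    struct record_buffer buf;                    /* hypothetical type */
    int stream_ended = 0, rc;

    while ( !stream_ended )
    {
        buffer_reset(&buf);                      /* hypothetical */

        /* Stage 1: buffer every record up to the next CHECKPOINT (or END)
         * marker.  A read failure here means the channel broke
         * mid-checkpoint, so the buffered records are incomplete. */
        rc = buffer_one_checkpoint(ctx, &buf, &stream_ended);  /* hypothetical */
        if ( rc )
            goto err_buf;

        /* Stage 2: the checkpoint arrived intact; apply it, making it the
         * new "last good" state the guest can be resumed from. */
        rc = process_buffered_records(ctx, &buf);  /* hypothetical */
        if ( rc )
            return rc;
    }

    return 0;                                    /* clean end of stream */

 err_buf:
    /* Failover: drop the partial checkpoint and resume the guest from the
     * last fully processed one. */
    buffer_discard(&buf);                        /* hypothetical */
    return resume_from_last_checkpoint(ctx);     /* hypothetical */
}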