Re: [Xen-devel] [PATCH 3/3] libxc/migrationv2: Split {start, end}_of_stream() to make checkpoint variants
On 11/05/15 03:31, Hongyang Yang wrote:
> On 05/08/2015 09:30 PM, Ian Campbell wrote:
>> On Fri, 2015-05-08 at 13:54 +0100, Andrew Cooper wrote:
>>> This is in preparation for supporting checkpointed streams in
>>> migration v2.
>>> - For PV guests, the VCPU context is moved to end_of_checkpoint().
>>> - For HVM guests, the HVM context and params are moved to
>>> end_of_checkpoint().
>>
>> [...]
>>> +    /**
>>> +     * Send records which need to be at the end of the checkpoint. This is
>>> +     * called once, or once per checkpoint in a checkpointed stream, and is
>>> +     * after the memory data.
>>> +     */
>>> +    int (*end_of_checkpoint)(struct xc_sr_context *ctx);
>>> +
>>> +    /**
>>> +     * Send records which need to be at the end of the stream. This is called
>>> +     * once, before the END record is written.
>>>       */
>>>      int (*end_of_stream)(struct xc_sr_context *ctx);
>> [...]
>>> +static int x86_hvm_end_of_stream(struct xc_sr_context *ctx)
>>> +{
>>> +    int rc;
>>> +
>>> +    rc = write_tsc_info(ctx);
>>>      if ( rc )
>>>          return rc;
>>>
>>> -    /* Write HVM_PARAMS record contains applicable HVM params. */
>>> -    rc = write_hvm_params(ctx);
>>> +#ifdef XG_LIBXL_HVM_COMPAT
>>> +    rc = write_toolstack(ctx);
>>
>> I'm not sure about this end_of_stream thing.  In a checkpointing for
>> fault tolerance scenario (Remus or COLO), failover happens when the
>> sender has died for some reason, and therefore never gets the chance
>> to send any end-of-stream records.
>>
>> IOW I think everything in end_of_stream actually needs to be in
>> end_of_checkpoint unless it is just for informational purposes in a
>> regular migration or something (which write_toolstack surely isn't)
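
To make the concern concrete, the save path which this split implies looks
roughly like the sketch below: end_of_checkpoint() runs inside the checkpoint
loop, whereas end_of_stream() is only reached once the stream is shut down
cleanly, so a sender which dies mid-stream never sends those records.  This is
only an illustrative sketch; apart from the ops callbacks quoted above, the
helpers (send_dirty_pages(), write_checkpoint_record(), stream_should_finish(),
write_end_record()) and the 'checkpointed' flag are hypothetical stand-ins, not
the real libxc code.

/* Sketch of a checkpointed save flow with the proposed callback split. */
static int save_sketch(struct xc_sr_context *ctx)
{
    int rc;

    rc = ctx->save.ops.start_of_stream(ctx);       /* sent exactly once */
    if ( rc )
        return rc;

    do
    {
        /* Memory for this iteration / checkpoint. */
        rc = send_dirty_pages(ctx);
        if ( rc )
            return rc;

        /* Records which must follow the memory of every checkpoint. */
        rc = ctx->save.ops.end_of_checkpoint(ctx);
        if ( rc )
            return rc;

        if ( !ctx->save.checkpointed )
            break;

        /* CHECKPOINT record, then let the guest run until the next round. */
        rc = write_checkpoint_record(ctx);
        if ( rc )
            return rc;
    } while ( !stream_should_finish(ctx) );        /* e.g. an explicit handover */

    /*
     * Only reached on a clean shutdown of the stream.  A sender which dies
     * mid-loop (the Remus/COLO failover case) never executes this, so any
     * record the receiver needs for failover cannot live here.
     */
    rc = ctx->save.ops.end_of_stream(ctx);
    if ( rc )
        return rc;

    return write_end_record(ctx);
}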
>
> Yes, all records should be sent at every checkpoint, except those
> which only need to be sent once.
>
> checkpoint:
> You can see clearly from the patches that a Remus migration explicitly
> consists of two stages: the first stage is a live migration, the second
> is the checkpointed stream.  The live migration part is obvious; after
> it, both primary and secondary are in the same state.  The primary then
> continues to run until the next checkpoint, at which point we sync the
> secondary's state with the primary's, so that both sides are in the
> same state again.  Therefore any record whose contents can change while
> the guest is running must be sent at every checkpoint.
>
> failover:
> The handling of a checkpointed stream on the restore side also consists
> of two stages: first buffer the records, then process them.  This is
> because, if the master dies while sending records, the secondary's
> state would otherwise be left inconsistent, so we have to make sure all
> records have been received before processing them.  If the master dies,
> the secondary can recover from the last checkpoint state.  Currently
> Remus failover relies on the migration channel: if the channel breaks,
> we presume the master is dead and fail over.  The "goto err_buf" is the
> failover path; with it we discard the current checkpoint's records
> because they are incomplete, and resume the guest with the last
> checkpoint state (the last fully processed set of records).
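
The buffer-then-process scheme described above can be sketched as follows.
This is purely illustrative: the record structure, the record type constants
and all of the helpers are hypothetical stand-ins for the real restore-side
code, with err_buf mirroring the label mentioned above.

struct record
{
    unsigned int type;
    /* length and payload omitted for brevity */
};

/* Sketch of restore-side handling of a checkpointed stream. */
static int restore_checkpointed_sketch(struct xc_sr_context *ctx)
{
    struct record rec;
    int rc;

    for ( ;; )
    {
        /* Stage 1: buffer records until the checkpoint boundary. */
        rc = read_record(ctx, &rec);
        if ( rc )
            goto err_buf;              /* channel broke mid-checkpoint */

        if ( rec.type != REC_TYPE_CHECKPOINT && rec.type != REC_TYPE_END )
        {
            rc = buffer_record(ctx, &rec);
            if ( rc )
                goto err_buf;
            continue;
        }

        /* Stage 2: the checkpoint is complete - apply it as a whole. */
        rc = process_buffered_records(ctx);
        if ( rc )
            return rc;                 /* genuine error, not a failover */

        if ( rec.type == REC_TYPE_END )
            return 0;                  /* clean end of the stream */
    }

 err_buf:
    /*
     * The sender is presumed dead: discard the incomplete checkpoint and
     * resume the guest from the last fully processed checkpoint state.
     */
    discard_buffered_records(ctx);
    return failover_to_last_checkpoint(ctx);
}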
Thank you for the clarification.

It occurs to me that, despite things like 'last_iter', it is actually
the first iteration which is special in Remus.

Is there a case where the primary decides to explicitly hand over to the
secondary?
~Andrew
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel