[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] tools/migrate: Fix regression when migrating from older version of Xen



On 11/07/13 10:55, Ian Campbell wrote:
> On Wed, 2013-07-10 at 20:19 +0100, Andrew Cooper wrote:
>> Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
>>   "libxc: provide notification of final checkpoint to restore end"
>> broke migration from any version of Xen using tools from prior to that commit
>>
>> Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
>> tools xc_domain_restore() to start reading the qemu save record, as
>> ctx->last_checkpoint is 0.
> Can the receive not distinguish the case where it is receiving a
> checkpoint stream from a regular one? Either via some property of the
> stream or the parameters with which it was called?
>
> ctx->last_checkpoint should (be made to) only matter if you are doing
> check pointing and I think we can mandate that checking pointing is only
> supported between like versions of Xen. TBH it's a bit odd to send the
> LAST_CHECKPOINT in anon-checkpoint stream anyway, but what's done is
> done.
>
> Ian.

Sending LAST_CHECKPOINT is the only thing which has allowed normal
migration to work since this regression.

I cant find any way of distinguishing a normal vs checkpointed migrate
from the stream alone.  The save_callback->checkpoint function pointer
may or may not be filled in by a toolstack, but is completely
out-of-band as far as the migration stream is concerned.

~Andrew

>> The failure looks like:
>>   xc: error: Max batch size exceeded (1970103633). Giving up.
>> where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
>>
>> Sadly, the simple fix of just setting ctx->last_checkpoint = 1 will cause an
>> opposite function regresson for anyone using the current behaviour of
>> save_callbacks->checkpoint().
>>
>> The only safe fix is to rely on the toolstack to provide this information.
>>
>> Passing 0 results in unchanged behaviour, while passing nonzero means "the
>> other end of the migration stream does not know about
>> XC_SAVE_ID_LAST_CHECKPOINT but is performing a normal migrate"
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>>  tools/libxc/xc_domain_restore.c |    3 ++-
>>  tools/libxc/xc_nomigrate.c      |    2 +-
>>  tools/libxc/xenguest.h          |    4 +++-
>>  tools/libxl/libxl_save_helper.c |    2 +-
>>  tools/xcutils/xc_restore.c      |    2 +-
>>  5 files changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/libxc/xc_domain_restore.c 
>> b/tools/libxc/xc_domain_restore.c
>> index 63d36cd..40c9282 100644
>> --- a/tools/libxc/xc_domain_restore.c
>> +++ b/tools/libxc/xc_domain_restore.c
>> @@ -1401,7 +1401,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, 
>> uint32_t dom,
>>                        domid_t store_domid, unsigned int console_evtchn,
>>                        unsigned long *console_mfn, domid_t console_domid,
>>                        unsigned int hvm, unsigned int pae, int superpages,
>> -                      int no_incr_generationid,
>> +                      int no_incr_generationid, int last_checkpoint_unaware,
>>                        unsigned long *vm_generationid_addr,
>>                        struct restore_callbacks *callbacks)
>>  {
>> @@ -1472,6 +1472,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, 
>> uint32_t dom,
>>  
>>      ctx->superpages = superpages;
>>      ctx->hvm = hvm;
>> +    ctx->last_checkpoint = !!last_checkpoint_unaware;
>>  
>>      ctxt = xc_hypercall_buffer_alloc(xch, ctxt, sizeof(*ctxt));
>>  
>> diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
>> index 73e7566..1d3aec5 100644
>> --- a/tools/libxc/xc_nomigrate.c
>> +++ b/tools/libxc/xc_nomigrate.c
>> @@ -35,7 +35,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, 
>> uint32_t dom,
>>                        domid_t store_domid, unsigned int console_evtchn,
>>                        unsigned long *console_mfn, domid_t console_domid,
>>                        unsigned int hvm, unsigned int pae, int superpages,
>> -                      int no_incr_generationid,
>> +                      int no_incr_generationid, int last_checkpoint_unaware,
>>                        unsigned long *vm_generationid_addr,
>>                        struct restore_callbacks *callbacks)
>>  {
>> diff --git a/tools/libxc/xenguest.h b/tools/libxc/xenguest.h
>> index 4714bd2..977dde3 100644
>> --- a/tools/libxc/xenguest.h
>> +++ b/tools/libxc/xenguest.h
>> @@ -114,6 +114,8 @@ struct restore_callbacks {
>>   * @parm pae non-zero if this HVM domain has PAE support enabled
>>   * @parm superpages non-zero to allocate guest memory with superpages
>>   * @parm no_incr_generationid non-zero if generation id is NOT to be 
>> incremented
>> + * @parm last_checkpoint_unaware non-zero if far end of the migration is 
>> unaware
>> + *       of the XC_SAVE_ID_LAST_CHECKPOINT hunk in the migration stream.
>>   * @parm vm_generationid_addr returned with the address of the generation 
>> id buffer
>>   * @parm callbacks non-NULL to receive a callback to restore toolstack
>>   *       specific data
>> @@ -124,7 +126,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, 
>> uint32_t dom,
>>                        domid_t store_domid, unsigned int console_evtchn,
>>                        unsigned long *console_mfn, domid_t console_domid,
>>                        unsigned int hvm, unsigned int pae, int superpages,
>> -                      int no_incr_generationid,
>> +                      int no_incr_generationid, int last_checkpoint_unaware,
>>                        unsigned long *vm_generationid_addr,
>>                        struct restore_callbacks *callbacks);
>>  /**
>> diff --git a/tools/libxl/libxl_save_helper.c 
>> b/tools/libxl/libxl_save_helper.c
>> index 772251a..8f14b8b 100644
>> --- a/tools/libxl/libxl_save_helper.c
>> +++ b/tools/libxl/libxl_save_helper.c
>> @@ -264,7 +264,7 @@ int main(int argc, char **argv)
>>          r = xc_domain_restore(xch, io_fd, dom, store_evtchn, &store_mfn,
>>                                store_domid, console_evtchn, &console_mfn,
>>                                console_domid, hvm, pae, superpages,
>> -                              no_incr_genidad, &genidad,
>> +                              no_incr_genidad, 0, &genidad,
>>                                &helper_restore_callbacks);
>>          helper_stub_restore_results(store_mfn,console_mfn,genidad,0);
>>          complete(r);
>> diff --git a/tools/xcutils/xc_restore.c b/tools/xcutils/xc_restore.c
>> index 35d725c..e657c7f 100644
>> --- a/tools/xcutils/xc_restore.c
>> +++ b/tools/xcutils/xc_restore.c
>> @@ -52,7 +52,7 @@ main(int argc, char **argv)
>>  
>>      ret = xc_domain_restore(xch, io_fd, domid, store_evtchn, &store_mfn, 0,
>>                              console_evtchn, &console_mfn, 0, hvm, pae, 
>> superpages,
>> -                            0, NULL, NULL);
>> +                            0, 0, NULL, NULL);
>>  
>>      if ( ret == 0 )
>>      {
>


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.