
Re: [Xen-devel] [RFC PATCH 2/3] remus: implement remus checkpoint in v2 save



On 10/07/14 04:25, Hongyang Yang wrote:
>
>
> On 07/09/2014 06:53 PM, Andrew Cooper wrote:
>> On 09/07/14 08:47, Yang Hongyang wrote:
>>> implement remus checkpoint in v2 save
>>>
>>> Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
>>> ---
>>>   tools/libxc/saverestore/common.h |  1 +
>>>   tools/libxc/saverestore/save.c   | 88 ++++++++++++++++++++++++----------------
>>>   2 files changed, 55 insertions(+), 34 deletions(-)
>>>
>>> diff --git a/tools/libxc/saverestore/common.h b/tools/libxc/saverestore/common.h
>>> index 24ba95b..1dd9f51 100644
>>> --- a/tools/libxc/saverestore/common.h
>>> +++ b/tools/libxc/saverestore/common.h
>>> @@ -153,6 +153,7 @@ struct xc_sr_context
>>>
>>>       xc_dominfo_t dominfo;
>>>       bool checkpointed;
>>> +    bool firsttime;
>>
>> This is also only used on the save side.
>
> Yes, the restore side does not use this for now, but I'm not sure whether
> it will be used later. Maybe it can be moved into the .save union now, and
> if we need it on the restore side later, we can move it back out.

I would prefer things like this to move into the most specific place
they can live.  It helps spot issues.  e.g. it is obvious that anything
using ctx->save.$FOO on the restore path is wrong.
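
As a rough sketch of what I mean (the members here are abbreviated from the
quoted patch, so treat the exact layout as illustrative rather than as the
real header):

    struct xc_sr_context
    {
        /* ... common fields, abbreviated ... */
        xc_dominfo_t dominfo;
        bool checkpointed;

        union
        {
            struct /* Save-only state. */
            {
                /* ... existing save-only members ... */
                bool firsttime;   /* First checkpoint iteration? */
            } save;

            struct /* Restore-only state. */
            {
                /* ... existing restore-only members ... */
            } restore;
        };
    };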

>
> In v2, the checkpointed_stream parameter to xc_domain_restore() seems
> unnecessary. Can we remove it? Removing it would break the API, though...

libxc is free to change.  I plan to drop as many arguments as possible
when the legacy migration code is removed.

>
>>
>>>
>>>       union
>>>       {
>>> diff --git a/tools/libxc/saverestore/save.c b/tools/libxc/saverestore/save.c
>>> index d2fa8a6..98a5c2f 100644
>>> --- a/tools/libxc/saverestore/save.c
>>> +++ b/tools/libxc/saverestore/save.c
>>> @@ -375,6 +375,8 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
>>>           goto out;
>>>       }
>>>
>>> +    if ( ctx->checkpointed && !ctx->firsttime )
>>> +        goto lastiter;
>>>       /* This juggling is required if logdirty is already on, e.g. VRAM tracking */
>>>       if ( xc_shadow_control(xch, ctx->domid,
>>>                              XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY,
>>> @@ -436,6 +438,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
>>>               break;
>>>       }
>>>
>>> +lastiter:
>>>       rc = suspend_domain(ctx);
>>>       if ( rc )
>>>           goto out;
>>> @@ -570,44 +573,60 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>>>       if ( rc )
>>>           goto err;
>>>
>>> -    rc = ctx->save.ops.start_of_stream(ctx);
>>> -    if ( rc )
>>> -        goto err;
>>> +    do {
>>> +        rc = ctx->save.ops.start_of_stream(ctx);
>>> +        if ( rc )
>>> +            goto err;
>>
>> I am not sure start_of_stream() wants to be inside the loop.  For PV
>> guests, it sends the X86_PV_INFO which is only expected to be sent
>> once.  The X86_PV_P2M_FRAMES record is deliberately safe to send
>> multiple times (in the hope that someone might eventually fix the
>> ballooning issues), but is a waste of time to send like this, as its
>> content won't be changing.
>
> If you make sure that all the records sent in start_of_stream() won't be
> changing, then we can surely move it out of the loop.

The X86_PV_INFO record must only be sent once.  Sending it repeatedly
with the same contents is not disastrous, but sending it with different
contents certainly is.

There is nothing the restorer can do other than bail if it sees that the
sender has switched between describing a 32bit and a 64bit VM.
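
i.e. the restore side can do little more than something like this (the
handler, record and field names here are illustrative guesses rather than
the actual code):

    static int handle_x86_pv_info(struct xc_sr_context *ctx,
                                  struct xc_sr_record *rec)
    {
        xc_interface *xch = ctx->xch;
        struct xc_sr_rec_x86_pv_info *info = rec->data;

        /* A second X86_PV_INFO with a different guest width means the
         * sender has changed its mind about the VM mid-stream.  Bail. */
        if ( ctx->restore.guest_width &&
             ctx->restore.guest_width != info->guest_width )
        {
            ERROR("X86_PV_INFO guest width changed mid-stream");
            return -1;
        }

        ctx->restore.guest_width = info->guest_width;
        return 0;
    }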


If you need some non-memory records resent at the start of each
checkpoint iteration then feel free to add a new function to save_ops,
but currently I don't believe it is needed.
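
If it does turn out to be necessary, I would expect it to look something
like this, keeping start_of_stream() outside the loop (the
start_of_checkpoint() callback is hypothetical; error paths and the other
records are omitted):

    struct xc_sr_save_ops
    {
        /* ... existing callbacks ... */
        int (*start_of_stream)(struct xc_sr_context *ctx);     /* Once per stream. */
        int (*start_of_checkpoint)(struct xc_sr_context *ctx); /* Once per checkpoint. */
    };

    /* In save(): */
    rc = ctx->save.ops.start_of_stream(ctx);         /* X86_PV_INFO etc, exactly once. */
    if ( rc )
        goto err;

    do {
        rc = ctx->save.ops.start_of_checkpoint(ctx); /* Records to resend each time. */
        if ( rc )
            goto err;

        rc = send_domain_memory_live(ctx);
        if ( rc )
            goto err;

        /* ... end-of-checkpoint records, resume the guest, etc ... */
    } while ( ctx->checkpointed );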

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

