Re: [Xen-devel] [PATCH 09/29] [HACK] tools/libxc: save/restore v2 framework

On Mon, Sep 15, 2014 at 04:09:51PM +0100, Andrew Cooper wrote:
> On 14/09/2014 11:23, Shriram Rajagopalan wrote:
> >On Sep 11, 2014 4:08 AM, "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx
> ><mailto:andrew.cooper3@xxxxxxxxxx>> wrote:
> >>
> >> On 11/09/14 12:01, Ian Campbell wrote:
> >> > On Thu, 2014-09-11 at 11:37 +0100, Andrew Cooper wrote:
> >> >> On 11/09/14 11:34, Ian Campbell wrote:
> >> >>> On Wed, 2014-09-10 at 18:10 +0100, Andrew Cooper wrote:
> >> >>>> For testing purposes, the environmental variable
> >"XG_MIGRATION_V2" allows the
> >> >>>> two save/restore codepaths to coexist, and have a runtime switch.
> >> >>>>
> >> >>>> It is intended that once this series is less RFC, the v2
> >framework will
> >> >>>> completely replace v1.
> >> >>> I think we are now at the point where this hack needs to be
> >dropped from
> >> >>> the series.
> >> >> One problem is remus.  My plan when dropping this patch was to

The other is 'tmem'. But 'tmem' has not yet been declared 'baked' so
not making it work from a release perspective is OK.

With the 'tmem' maintainer hat on, however, I would like it to work without
having to do anything :-) Which reminds me I need to follow up
on double-checking that the migration hasn't bitrotten!

> >drop all
> >> >> of xc_domain_{save/restore}.c as well, but without remus migration-v2
> >> >> support available, this will break existing set-ups.

And by 'set-ups' you mean Xen 4.5 using the v1 migration tools and then
out-of-tree patches on top of that. In other words, users of the libxc
"API" (which we do not guarantee between releases - it is an internal

> >> > Hrm, how is that going wrt 4.5 freeze?
> >>
> >> I haven't heard seen anything since v5 of this series (for which I did
> >> some quick bugfixes and released v6).
> >>
> >
> >FYI, that's not entirely true. Yang did post a set of RFC patches for
> >remus
> >support in migration v2, based on your V6 series (back in July)
> >http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg01163.html
> My apologies - it was v6 to v6.1
> >
> >
> >It would actually be helpful if you could cc me on the patches
> >relevant to Remus,
> >or if there is anything specific to Remus that needs to be done. There
> >are 100s of
> >posts on Xen devel every day and it's hard to keep track of everything
> >posted to
> >Xen devel.

I've found that putting filters for the right keywords helps with that.
That is how I can subscribe to lkml without drinking the

> >
> >
> >And looking at your patch sets in
> >http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/saverestore2-v6.3
> >
> >I see that there is no support for Remus currently. Nor can I
> >differentiate which parts of the
> >code fix to these "quick bug fixes" that you mentioned above. From the
> >discussion over the remus rfc
> >patches, I only recall a bug related to vcpu context caching. But I
> >cannot delineate that specific part from
> >the patches in the repo. So, if these bug fixes you are referring to
> >are something else, please explain.
> The bugfixes were referring to the vcpu context caching, but far more
> bits needed caching than the remus series fixed.  The fixes were
> necessary even in the non-remus case and there were also improvements to
> receive side state machine to avoid vm corruption caused by an incorrect
> send order.
> I did not integrate the remus specific patches as there were outstanding
> review concerns/comments.

<nods> My recollection as well.
> >
> >
> >> I don't know, which probably means not good.
> >>
> >
> >> >
> >> >> One option might be to have legacy and v2 sitting properly
> >side-by-side
> >> >> in libxc for the transition period.
> >> > How long do you mean? Until 4.6?
> >>
> >
> >fwiw, I don't plan to work on remus migration v2 support until the
> >remus netbuffer patches get in.
> >I have been at this for almost two release cycles. It's frustrating to
> >iterate on feedback for patch 4/11
> >of a series for two months and then get a bunch of first-pass review
> >for patch 6/10 at the eleventh hour
> >before a feature freeze, while the rest of the series has still not
> >been reviewed at all for the past 3 months.

What is the dependency on "full remus" support? Is there a list of
all the different patchsets that need to be reviewed?

> I can appreciate your frustration on this point, and do not envy your
> position.
> The concern I have is that XenServer 6.5 is shipping with migrationv2 as
> we absolutely need it, given the 32->64bit upgrade.  We were hoping to
> get the new format committed in 4.5 to guarantee stability, but that is
> looking increasingly unlikely to happen.  As a result, it will probably
> have to go in early in 4.6, with extra care taken to ensure that no
> incompatible changes are made as a result of further review.

Could you tell me what are the benefits of having a v1 to v2 runtime
switch for developers/users besides the obvious (faster migration,
easier to understand code)?

To me it sounded like this would allow the community to also
test it and report bugs - which would be invaluable. And better
yet there is an env flag to swap between a baseline and new
code to ease the testing.
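
To make the testing story concrete, here is roughly how I picture such a
switch. This is a minimal sketch, not the code from the patch: only the
variable name XG_MIGRATION_V2 comes from the patch description. The
functions save_v1(), save_v2() and domain_save() are hypothetical
stand-ins, and I am assuming that the mere presence of the variable
(whatever its value) selects v2.

```c
#include <stdlib.h>

/* Which save path ran; used only so the caller can tell them apart. */
enum save_path { SAVE_PATH_V1 = 1, SAVE_PATH_V2 = 2 };

/* Hypothetical stand-in for the legacy (v1) save code. */
static enum save_path save_v1(int domid)
{
    (void)domid;
    return SAVE_PATH_V1;
}

/* Hypothetical stand-in for the new (v2) save code. */
static enum save_path save_v2(int domid)
{
    (void)domid;
    return SAVE_PATH_V2;
}

/* Assumption: v2 is selected whenever the variable is set at all. */
static int use_migration_v2(void)
{
    return getenv("XG_MIGRATION_V2") != NULL;
}

/* Single entry point: v1 stays the default, v2 is opt-in at runtime. */
static enum save_path domain_save(int domid)
{
    if (use_migration_v2())
        return save_v2(domid);
    return save_v1(domid);
}
```

With something of this shape a tester can run the same migration twice,
once with the variable unset and once with it set, and compare the
baseline against the new code without rebuilding anything.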

The risks seem quite contained - if something goes awry, folks can
use the v1 version - which should have the same number of bugs
that it had in previous releases. And since v1 is on by default,
only dedicated users would turn v2 on.

From a maintenance perspective, it does add more code, but once
feature freeze hits we do not pay attention to features anymore,
but rather to bug-fixes.

Hm, Ians - what is your take on it?

