Re: [Xen-devel] [PATCH] xl: Always use "fast" migration resume protocol
On Mon, 2014-01-13 at 18:15 +0000, Ian Jackson wrote:
> As Ian Campbell writes:

"...in http://bugs.xenproject.org/xen/bug/30" would be useful here (can
add on commit, no need to resend just for this IMHO)

> There are two mechanisms by which a suspend can be aborted and the
> original domain resumed.
>
> The older method is that the toolstack resets a bunch of state (see
> tools/python/xen/xend/XendDomainInfo.py resumeDomain) and then
> restarts the domain. The domain will see HYPERVISOR_suspend return 0
> and will continue without any realisation that it is actually
> running in the original domain and not in a new one. This method is
> supposed to be implemented by libxl_domain_resume(suspend_cancel=0)
> but it is not.
>
> The other method is newer and in this case the toolstack arranges
> that HYPERVISOR_suspend returns SUSPEND_CANCEL and restarts it. The
> domain will observe this and realise that it has been restarted in
> the same domain and will behave accordingly. This method is
> implemented, correctly AFAIK, by
> libxl_domain_resume(suspend_cancel=1).
>
> Attempting to use the old method without doing all of the work simply
> causes the guest to crash. Implementing the work required for old
> method, or for checking that domains actually support the new method,
> is not feasible at this stage of the 4.4 release.
>
> So, always use the new method, without regard to the declarations of
> support by the guest. This is a strict improvement: guests which do
> in fact support the new method will work, whereas ones which don't are
> no worse off.

I agree with this rationale.

> There are two call sites of libxl_domain_resume that need fixing, both
> in the migration error path.
>
> With this change I observe a correct and successful resumption of a
> Debian wheezy guest with a Linux 3.4.70 kernel after a migration
> attempt which I arranged to fail by nobbling the block hotplug script.
>
> Signed-off-by: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>

Acked-by: Ian Campbell <Ian.Campbell@xxxxxxxxxx>

I think you have at least a partial patch ready for 4.5?

> CC: konrad.wilk@xxxxxxxxxx
> CC: David Vrabel <david.vrabel@xxxxxxxxxx>
> CC: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
> ---
>  tools/libxl/xl_cmdimpl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index c30f495..d93e01b 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -3734,7 +3734,7 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
>          if (common_domname) {
>              libxl_domain_rename(ctx, domid, away_domname, common_domname);
>          }
> -        rc = libxl_domain_resume(ctx, domid, 0, 0);
> +        rc = libxl_domain_resume(ctx, domid, 1, 0);
>          if (!rc) fprintf(stderr, "migration sender: Resumed OK.\n");
>
>          fprintf(stderr, "Migration failed due to problems at target.\n");
> @@ -3756,7 +3756,7 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
>      close(send_fd);
>      migration_child_report(recv_fd);
>      fprintf(stderr, "Migration failed, resuming at sender.\n");
> -    libxl_domain_resume(ctx, domid, 0, 0);
> +    libxl_domain_resume(ctx, domid, 1, 0);
>      exit(-ERROR_FAIL);
>
>   failed_badly:
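
For readers following the rationale, here is a rough, self-contained model of
the two resume mechanisms described in the commit message. It is only a sketch
and not libxl or guest kernel code: the numeric value of SUSPEND_CANCEL, the
enum, and all three helper functions are hypothetical names made up for
illustration, and it assumes a guest that understands the newer protocol.

```c
#include <assert.h>

/* Assumed value: the thread names SUSPEND_CANCEL but does not give a number. */
#define SUSPEND_CANCEL 1

/* What the guest does once HYPERVISOR_suspend returns (illustrative names). */
enum guest_action {
    GUEST_REINIT_AS_NEW_DOMAIN,  /* saw 0: behaves as if restored in a new domain */
    GUEST_CONTINUE_IN_PLACE      /* saw SUSPEND_CANCEL: knows it is the same domain */
};

/* Toolstack side: the value the guest observes for a given suspend_cancel flag. */
static int suspend_retval_for(int suspend_cancel)
{
    return suspend_cancel ? SUSPEND_CANCEL : 0;
}

/* Guest side: pick the resume path from the hypercall return value. */
static enum guest_action guest_after_suspend(int suspend_ret)
{
    return suspend_ret == SUSPEND_CANCEL ? GUEST_CONTINUE_IN_PLACE
                                         : GUEST_REINIT_AS_NEW_DOMAIN;
}

/*
 * The old method (suspend_cancel=0) only works if the toolstack first reset
 * the per-domain state; the new method needs no such reset.
 */
static int resume_is_safe(int suspend_cancel, int toolstack_reset_state)
{
    int ret = suspend_retval_for(suspend_cancel);

    if (guest_after_suspend(ret) == GUEST_CONTINUE_IN_PLACE)
        return 1;
    return toolstack_reset_state;
}

/* Small self-check of the three cases discussed in the commit message. */
static void check_model(void)
{
    assert(resume_is_safe(1, 0));   /* new method, no reset needed */
    assert(!resume_is_safe(0, 0));  /* old method without the reset: guest crashes */
    assert(resume_is_safe(0, 1));   /* old method with the full reset */
}
```

In this model the patch corresponds to moving both call sites from the
resume_is_safe(0, 0) case to the resume_is_safe(1, 0) case. A guest that does
not support the new method is outside what the sketch models; per the commit
message, such a guest is no worse off than before.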