
[Xen-devel] [PATCH] xl: Always use "fast" migration resume protocol



As Ian Campbell writes:

  There are two mechanisms by which a suspend can be aborted and the
  original domain resumed.

  The older method is that the toolstack resets a bunch of state (see
  tools/python/xen/xend/XendDomainInfo.py resumeDomain) and then
  restarts the domain. The domain will see HYPERVISOR_suspend return 0
  and will continue without any realisation that it is actually
  running in the original domain and not in a new one. This method is
  supposed to be implemented by libxl_domain_resume(suspend_cancel=0)
  but it is not.

  The other method is newer and in this case the toolstack arranges
  that HYPERVISOR_suspend returns SUSPEND_CANCEL and restarts it. The
  domain will observe this and realise that it has been restarted in
  the same domain and will behave accordingly. This method is
  implemented, correctly AFAIK, by
  libxl_domain_resume(suspend_cancel=1).

Attempting to use the old method without doing all of the work simply
causes the guest to crash.  Implementing the work required for the old
method, or checking that domains actually support the new method, is
not feasible at this stage of the 4.4 release.

So, always use the new method, without regard to the declarations of
support by the guest.  This is a strict improvement: guests which do
in fact support the new method will work, whereas ones which don't are
no worse off.
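
To make the distinction between the two methods concrete, here is a
minimal sketch of the guest-side decision, not code from any real
kernel.  The value given to SUSPEND_CANCEL, the enum, and the helper
guest_resume_action() are all assumptions introduced for illustration;
only the name SUSPEND_CANCEL and the "returns 0" behaviour come from
the description above.

```c
#include <stdio.h>

/* Assumed value, for illustration only: the description above names
 * SUSPEND_CANCEL but does not state its numeric value. */
#define SUSPEND_CANCEL 1

/* Hypothetical names for the two ways a guest can come back. */
enum resume_action {
    RESUME_AS_NEW_DOMAIN,   /* old method: guest believes it was restored */
    RESUME_IN_PLACE         /* new method: guest knows suspend was cancelled */
};

/* Hypothetical helper: map the value returned by the suspend hypercall
 * to the action the guest should take. */
enum resume_action guest_resume_action(int suspend_ret)
{
    if (suspend_ret == SUSPEND_CANCEL) {
        /* Same domain: existing state is still valid. */
        return RESUME_IN_PLACE;
    }
    /* A return of 0 looks like a fresh domain, so the guest redoes its
     * setup.  If the toolstack has not reset the state the guest
     * expects, this is the path on which the guest crashes. */
    return RESUME_AS_NEW_DOMAIN;
}
```

With the patch applied, a guest that supports the new method takes
the RESUME_IN_PLACE branch on both migration error paths.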

There are two call sites of libxl_domain_resume that need fixing, both
in the migration error path.

With this change I observe a correct and successful resumption of a
Debian wheezy guest with a Linux 3.4.70 kernel after a migration
attempt which I arranged to fail by nobbling the block hotplug script.

Signed-off-by: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
CC: konrad.wilk@xxxxxxxxxx
CC: David Vrabel <david.vrabel@xxxxxxxxxx>
CC: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
---
 tools/libxl/xl_cmdimpl.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c30f495..d93e01b 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -3734,7 +3734,7 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
         if (common_domname) {
             libxl_domain_rename(ctx, domid, away_domname, common_domname);
         }
-        rc = libxl_domain_resume(ctx, domid, 0, 0);
+        rc = libxl_domain_resume(ctx, domid, 1, 0);
         if (!rc) fprintf(stderr, "migration sender: Resumed OK.\n");
 
         fprintf(stderr, "Migration failed due to problems at target.\n");
@@ -3756,7 +3756,7 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
     close(send_fd);
     migration_child_report(recv_fd);
     fprintf(stderr, "Migration failed, resuming at sender.\n");
-    libxl_domain_resume(ctx, domid, 0, 0);
+    libxl_domain_resume(ctx, domid, 1, 0);
     exit(-ERROR_FAIL);
 
  failed_badly:
-- 
1.7.10.4


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

