|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 05/19] libxl: domain save/restore: run in a separate process
On Fri, 2012-06-08 at 18:34 +0100, Ian Jackson wrote:
> libxenctrl expects to be able to simply run the save or restore
> operation synchronously. This won't work well in a process which is
> trying to handle multiple domains.
>
> The options are:
>
> - Block such a whole process (eg, the whole of libvirt) while
> migration completes (or until it fails).
>
> - Create a thread to run xc_domain_save and xc_domain_restore on.
> This is quite unpalatable. Multithreaded programming is error
> prone enough without generating threads in libraries, particularly
> if the thread does some very complex operation.
>
> - Fork and run the operation in the child without execing. This is
> no good because we would need to negotiate with the caller about
> fds we would inherit (and we might be a very large process).
>
> - Fork and exec a helper.
>
> Of these options the latter is the most palatable.
>
> Consequently:
>
> * A new helper program libxl-save-helper (which does both save and
> restore). It will be installed in /usr/lib/xen/bin. It does not
> link against libxl, only libxc, and its error handling does not
> need to be very advanced. It does contain a plumbing through of
> the logging interface into the callback stream.
>
> * A small ad-hoc protocol between the helper and libxl which allows
> log messages and the libxc callbacks to be passed up and down.
> Protocol doc comment is in libxl_save_helper.c.
>
> * To avoid a lot of tedium the marshalling boilerplate (stubs for the
> helper and the callback decoder for libxl) is generated with a
> small perl script.
>
> * Implement new functionality to spawn the helper, monitor its
> output, provide responses, and check on its exit status.
>
> * The functions libxl__xc_domain_restore_done and
> libxl__xc_domain_save_done now turn out to want be called in the
> same place. So make their state argument a void* so that the two
> functions are type compatible.
>
> The domain save path still writes the qemu savefile synchronously.
> This will need to be fixed in a subsequent patch.
>
> Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
>
> Changes in v3:
> * Suppress errno value in debug message when helper reports successful
> completion.
> * Significant consequential changes to cope with changes to
> earlier patches in the series.
>
> Changes in v2:
> * Helper path can be overridden by an environment variable for testing.
> * Add a couple of debug logging messages re toolstack data.
> * Fixes from testing.
> * Helper protocol message lengths (and numbers) are 16-bit which
> more clearly avoids piling lots of junk on the stack.
> * Merged with remus changes.
> * Callback implementations in libxl now called via pointers
> so remus can have its own callbacks.
> * Better namespace prefixes on autogenerated names etc.
> * Autogenerator can generate debugging printfs too.
> ---
[...]
> diff --git a/.hgignore b/.hgignore
> index 27d8f79..05304ea 100644
> --- a/.hgignore
> +++ b/.hgignore
> @@ -180,9 +180,11 @@
> ^tools/libxl/_.*\.c$
> ^tools/libxl/libxlu_cfg_y\.output$
> ^tools/libxl/xl$
> +^tools/libxl/libxl-save-helper$
> ^tools/libxl/testidl$
> ^tools/libxl/testidl\.c$
> ^tools/libxl/tmp\..*$
> +^tools/libxl/.*\.new$
Our naming scheme for these sorts of temporary files is rather in
consistent (including an existing use of .new), oh well...
[...]
> +SAVE_HELPER_OBJS = libxl_save_helper.o _libxl_save_msgs_helper.o
> +$(SAVE_HELPER_OBJS): CFLAGS += $(CFLAGS_libxenctrl) $(CFLAGS_libxenlight)
> +
> testidl.o: CFLAGS += $(CFLAGS_libxenctrl) $(CFLAGS_libxenlight)
> testidl.c: libxl_types.idl gentest.py libxl.h $(AUTOINCS)
> $(PYTHON) gentest.py libxl_types.idl testidl.c.new
> @@ -117,6 +122,12 @@ _libxl_list.h:
> $(XEN_INCLUDE)/xen-external/bsd-sys-queue-h-seddery $(XEN_INCLUDE
> perl $^ --prefix=libxl >$@.new
> $(call move-if-changed,$@.new,$@)
>
> +_libxl_save_msgs_helper.c _libxl_save_msgs_callout.c \
> +_libxl_save_msgs_helper.h _libxl_save_msgs_callout.h: \
> + libxl_save_msgs_gen.pl
> + $(PERL) -w $< $@ >$@.new
> + $(call move-if-changed,$@.new,$@)
> +
> libxl.h: _libxl_types.h
> libxl_json.h: _libxl_types_json.h
> libxl_internal.h: _libxl_types_internal.h _paths.h
> @@ -159,6 +170,9 @@ libxlutil.a: $(LIBXLU_OBJS)
> xl: $(XL_OBJS) libxlutil.so libxenlight.so
> $(CC) $(LDFLAGS) -o $@ $(XL_OBJS) libxlutil.so $(LDLIBS_libxenlight)
> $(LDLIBS_libxenctrl) -lyajl $(APPEND_LDFLAGS)
>
> +libxl-save-helper: $(SAVE_HELPER_OBJS) libxenlight.so
> + $(CC) $(LDFLAGS) -o $@ $(SAVE_HELPER_OBJS) $(LDLIBS_libxenctrl)
> $(LDLIBS_libxenguest) $(APPEND_LDFLAGS)
The changelog says that libxl-save-helper doesn't link libxenlight but
it is declared as a dependency here and CFLAGS_libxenlight is included
in SAVE_HELPER_OBJS' CFLAGS above.
> +
> testidl: testidl.o libxlutil.so libxenlight.so
> $(CC) $(LDFLAGS) -o $@ testidl.o libxlutil.so $(LDLIBS_libxenlight)
> $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
[...]
> @@ -170,6 +184,7 @@ install: all
> $(INSTALL_DIR) $(DESTDIR)$(BASH_COMPLETION_DIR)
> $(INSTALL_DIR) $(DESTDIR)$(XEN_RUN_DIR)
> $(INSTALL_PROG) xl $(DESTDIR)$(SBINDIR)
> + $(INSTALL_PROG) libxl-save-helper $(DESTDIR)$(PRIVATE_BINDIR)
This needs an INSTALL_DIR for $(DESTDIR)$(PRIVATE_BINDIR) to support
"make -C tools/libxl DESTDIR=xxx install" else:
install: cannot create regular file `/tmp/tmplKa0Y7/usr/lib/xen/bin':
No such file or directory
> $(INSTALL_PROG) libxenlight.so.$(MAJOR).$(MINOR) $(DESTDIR)$(LIBDIR)
> ln -sf libxenlight.so.$(MAJOR).$(MINOR)
> $(DESTDIR)$(LIBDIR)/libxenlight.so.$(MAJOR)
> ln -sf libxenlight.so.$(MAJOR) $(DESTDIR)$(LIBDIR)/libxenlight.so
>
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index b48452c..fea0c94 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -465,16 +465,20 @@ static inline char *restore_helper(libxl__gc *gc,
> uint32_t domid,
> }
>
> int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
> - uint32_t size, void *data)
> + uint32_t size, void *user)
> {
> - libxl__gc *gc = data;
> - libxl_ctx *ctx = gc->owner;
> + libxl__save_helper_state *shs = user;
> + libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
> + STATE_AO_GC(dcs->ao);
> + libxl_ctx *ctx = CTX;
Any reason you didn't s/ctx/CTX/ in one of your preparatory patches?
> int i, ret;
> const uint8_t *ptr = buf;
> uint32_t count = 0, version = 0;
> struct libxl__physmap_info* pi;
> char *xs_path;
[...]
> diff --git a/tools/libxl/libxl_save_callout.c
> b/tools/libxl/libxl_save_callout.c
> index 1b481ab..a39f434 100644
> --- a/tools/libxl/libxl_save_callout.c
> +++ b/tools/libxl/libxl_save_callout.c
> @@ -16,6 +16,19 @@
>
> #include "libxl_internal.h"
>
> +static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> + const char *mode_arg, int data_fd,
> + const unsigned long *argnums, int num_argnums);
> +
> +static void helper_failed(libxl__egc*, libxl__save_helper_state *shs, int
> rc);
> +static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
> + int fd, short events, short revents);
> +static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
> + pid_t pid, int status);
> +static void helper_done(libxl__egc *egc, libxl__save_helper_state *shs);
> +
> +/*----- entrypoints -----*/
> +
> void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state
> *dcs,
> int hvm, int pae, int superpages,
> int no_incr_generationid)
> @@ -27,22 +40,319 @@ void libxl__xc_domain_restore(libxl__egc *egc,
> libxl__domain_create_state *dcs,
> const int restore_fd = dcs->restore_fd;
> libxl__domain_build_state *const state = &dcs->build_state;
>
> - int r = xc_domain_restore(CTX->xch, restore_fd, domid,
> - state->store_port, &state->store_mfn,
> - state->store_domid, state->console_port,
> - &state->console_mfn, state->console_domid,
> - hvm, pae, superpages, no_incr_generationid,
> - &state->vm_generationid_addr, &dcs->callbacks);
> - libxl__xc_domain_restore_done(egc, dcs, 0, r, errno);
> + unsigned cbflags = libxl__srm_callout_enumcallbacks_restore
> + (&dcs->shs.callbacks.restore.a);
> +
> + const unsigned long argnums[] = {
> + restore_fd, domid,
> + state->store_port,
> + state->store_domid, state->console_port,
> + state->console_domid,
> + hvm, pae, superpages, no_incr_generationid,
> + cbflags,
> + };
> +
> + dcs->shs.ao = ao;
> + dcs->shs.domid = domid;
> + dcs->shs.recv_callback = libxl__srm_callout_received_restore;
> + dcs->shs.completion_callback = libxl__xc_domain_restore_done;
> + dcs->shs.caller_state = dcs;
> + dcs->shs.need_results = 1;
> + dcs->shs.toolstack_data_file = 0;
> +
> + run_helper(egc, &dcs->shs, "--restore-domain", restore_fd,
> + argnums, ARRAY_SIZE(argnums));
> }
>
> void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss,
> unsigned long vm_generationid_addr)
> {
> STATE_AO_GC(dss->ao);
> - int r;
> + int r, rc;
> + uint32_t toolstack_data_len;
> +
> + /* Resources we need to free */
> + uint8_t *toolstack_data_buf = 0;
> +
> + unsigned cbflags = libxl__srm_callout_enumcallbacks_save
> + (&dss->shs.callbacks.save.a);
> +
> + r = libxl__toolstack_save(dss->domid, &toolstack_data_buf,
> + &toolstack_data_len, dss);
This seems to be called twice? I'm looking at the source after the full
series is applied and I see
dss->shs.callbacks.save.toolstack_save = libxl__toolstack_save;
in libxl__domain_suspend as well as this call here.
The callback one seems to be otherwise unused.
> +static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> + const char *mode_arg, int data_fd,
> + const unsigned long *argnums, int num_argnums)
> +{
> + STATE_AO_GC(shs->ao);
> + const char *args[3 + num_argnums];
> + const char **arg = args;
> + int i, rc;
> +
> + /* Resources we must free */
> + libxl__carefd *childs_pipes[2] = { 0,0 };
> +
> + /* Convenience aliases */
> + const uint32_t domid = shs->domid;
> +
> + shs->rc = 0;
> + shs->completed = 0;
> + shs->pipes[0] = shs->pipes[1] = 0;
> + libxl__ev_fd_init(&shs->readable);
> + libxl__ev_child_init(&shs->child);
> +
> + shs->stdin_what = GCSPRINTF("domain %"PRIu32" save/restore helper"
> + " stdin pipe", domid);
> + shs->stdout_what = GCSPRINTF("domain %"PRIu32" save/restore helper"
> + " stdout pipe", domid);
> +
> + *arg++ = getenv("LIBXL_SAVE_HELPER") ?: LIBEXEC "/" "libxl-save-helper";
> + *arg++ = mode_arg;
> + for (i=0; i<num_argnums; i++)
> + *arg++ = GCSPRINTF("%lu", argnums[i]);
> + *arg++ = 0;
> + assert(arg == args + ARRAY_SIZE(args));
> +
> + libxl__carefd_begin();
> + for (i=0; i<2; i++) {
> + int fds[2];
> + if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
> + childs_pipes[i] = libxl__carefd_record(CTX, fds[i]);
> + shs->pipes[i] = libxl__carefd_record(CTX, fds[!i]);
Using !i here is clever, but I had to deploy a pen to be sure the
assignments were all correct.
Open coding this simple loop would also give you the opportunity the
have comments like "/* This is the helper processes's stdout */" etc in
the appropriate place to aid in comprehension when checking the correct
end of each pipe goes in the right place.
> + }
> + libxl__carefd_unlock();
> +
> + pid_t pid = libxl__ev_child_fork(gc, &shs->child, helper_exited);
> + if (!pid) {
> + libxl_fd_set_cloexec(CTX, data_fd, 0);
> + libxl__exec(gc,
> + libxl__carefd_fd(childs_pipes[0]),
> + libxl__carefd_fd(childs_pipes[1]),
> + -1,
> + args[0], (char**)args, 0);
> + }
> +
> + libxl__carefd_close(childs_pipes[0]);
> + libxl__carefd_close(childs_pipes[1]);
> +
> + rc = libxl__ev_fd_register(gc, &shs->readable, helper_stdout_readable,
> + libxl__carefd_fd(shs->pipes[1]),
> POLLIN|POLLPRI);
> + if (rc) goto out;
> + return;
> +
> + out:
> + libxl__carefd_close(childs_pipes[0]);
> + libxl__carefd_close(childs_pipes[1]);
> + helper_failed(egc, shs, rc);;
> +}
> +
> +static void helper_failed(libxl__egc *egc, libxl__save_helper_state *shs,
> + int rc)
> +{
> + STATE_AO_GC(shs->ao);
> +
> + if (!shs->rc)
> + shs->rc = rc;
> +
> + libxl__ev_fd_deregister(gc, &shs->readable);
> +
> + if (!libxl__ev_child_inuse(&shs->child)) {
> + helper_done(egc, shs);
> + return;
> + }
> +
> + int r = kill(shs->child.pid, SIGKILL);
> + if (r) LOGE(WARN, "failed to kill save/restore helper [%lu]",
> + (unsigned long)shs->child.pid);
> +}
> +
> +static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
> + int fd, short events, short revents)
> +{
> + libxl__save_helper_state *shs = CONTAINER_OF(ev, *shs, readable);
> + STATE_AO_GC(shs->ao);
> + int rc, errnoval;
> +
> + if (revents & POLLPRI) {
> + LOG(ERROR, "%s signaled POLLERR", shs->stdout_what);
signalled ?
(it's not impossible that my spell checker is wrong, google seem to
think both are ok)
The if says POLLPRI but the log says POLLERR?
> + rc = ERROR_FAIL;
> + out:
> + /* this is here because otherwise we bypass the decl of msg[] */
I don't get this comment, why can't "out:" be in the normal place after
the non-error return?
> + helper_failed(egc, shs, rc);
> + return;
> + }
> +
> + uint16_t msglen;
> + errnoval = libxl_read_exactly(CTX, fd, &msglen, sizeof(msglen),
> + shs->stdout_what, "ipc msg header");
> + if (errnoval) { rc = ERROR_FAIL; goto out; }
> +
> + unsigned char msg[msglen];
> + errnoval = libxl_read_exactly(CTX, fd, msg, msglen,
> + shs->stdout_what, "ipc msg body");
> + if (errnoval) { rc = ERROR_FAIL; goto out; }
> +
> + shs->egc = egc;
> + shs->recv_callback(msg, msglen, shs);
> + shs->egc = 0;
> + return;
> +}
> +
> +static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
> + pid_t pid, int status)
> +{
> + libxl__save_helper_state *shs = CONTAINER_OF(ch, *shs, child);
> + STATE_AO_GC(shs->ao);
> +
> + /* Convenience aliases */
> + const uint32_t domid = shs->domid;
> +
> + const char *what =
> + GCSPRINTF("domain %"PRIu32" save/restore helper", domid);
> +
> + if (status) {
> + libxl_report_child_exitstatus(CTX, XTL_ERROR, what, pid, status);
> + shs->rc = ERROR_FAIL;
> + }
> +
> + if (shs->need_results) {
> + if (!shs->rc)
> + LOG(ERROR,"%s exited without providing results",what);
> + shs->rc = ERROR_FAIL;
> + }
> +
> + if (!shs->completed) {
> + if (!shs->rc)
> + LOG(ERROR,"%s exited without signaling completion",what);
signalling again, same caveat
> + shs->rc = ERROR_FAIL;
> + }
> +
> + helper_done(egc, shs);
> + return;
> +}
[...]
> +/*----- generic helpers for the autogenerated code -----*/
[...]
> diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
> new file mode 100644
> index 0000000..d29f1f7
> --- /dev/null
> +++ b/tools/libxl/libxl_save_helper.c
> @@ -0,0 +1,279 @@
> +/*
> + * Copyright (C) 2012 Citrix Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
LGPL, or perhaps GPL since this is a helper?
(I suppose the same argument could be made for xl itself)
[...]
> +/*----- helper functions called by autogenerated stubs -----*/
> +
> +unsigned char * helper_allocbuf(int len, void *user)
> +{
> + return xmalloc(len);
> +}
> +
> +static void transmit(const unsigned char *msg, int len, void *user)
> +{
> + while (len) {
> + int r = write(1, msg, len);
> + if (r<0) { perror("write"); exit(-1); }
> + assert(r >= 0);
> + assert(r <= len);
> + len -= r;
> + msg += r;
> + }
> +}
> +
> +void helper_transmitmsg(unsigned char *msg_freed, int len_in, void *user)
> +{
> + assert(len_in < 64*1024);
> + uint16_t len = len_in;
> + transmit((const void*)&len, sizeof(len), user);
> + transmit(msg_freed, len, user);
> + free(msg_freed);
> +}
> +
> +int helper_getreply(void *user)
> +{
> + int v;
> + int r = read_exactly(0, &v, sizeof(v));
> + if (r<=0) exit(-2);
You use -1 a lot but here you use -2, are we supposed to be able to
infer something from the specific error code?
> + return v;
> +}
> +
> +/*----- other callbacks -----*/
> +
> +static int toolstack_save_fd;
> +static uint32_t toolstack_save_len;
[...]
> diff --git a/tools/libxl/libxl_save_msgs_gen.pl
> b/tools/libxl/libxl_save_msgs_gen.pl
> new file mode 100755
> index 0000000..cd0837e
> --- /dev/null
> +++ b/tools/libxl/libxl_save_msgs_gen.pl
> @@ -0,0 +1,393 @@
> +#!/usr/bin/perl -w
> +
> +use warnings;
> +use strict;
> +use POSIX;
> +
> +our $debug = 0; # produce copius debugging output at run-time?
copious
(which is probably the most useful feedback you are going to get from me
on a Perl script ;-))
> +
> +our @msgs = (
> + # flags:
> + # s - applicable to save
> + # r - applicable to restore
> + # c - function pointer in callbacks struct rather than fixed function
> + # x - function pointer is in struct {save,restore}_callbacks
> + # and its null-ness needs to be passed through to the helper's xc
> + # W - needs a return value; callback is synchronous
> + [ 1, 'sr', "log", [qw(uint32_t level
> + uint32_t errnoval
> + STRING context
> + STRING formatted)] ],
> + [ 2, 'sr', "progress", [qw(STRING context
> + STRING doing_what),
> + 'unsigned long', 'done',
> + 'unsigned long', 'total'] ],
> + [ 3, 'scxW', "suspend", [] ],
> + [ 4, 'scxW', "postcopy", [] ],
> + [ 5, 'scxW', "checkpoint", [] ],
> + [ 6, 'scxW', "switch_qemu_logdirty", [qw(int domid
> + unsigned enable)] ],
> + # toolstack_save done entirely `by hand'
> + [ 7, 'rcxW', "toolstack_restore", [qw(uint32_t domid
> + BLOCK tsdata)] ],
> + [ 8, 'r', "restore_results", ['unsigned long', 'store_mfn',
> + 'unsigned long', 'console_mfn',
> + 'unsigned long', 'genidad'] ],
> + [ 9, 'srW', "complete", [qw(int retval
> + int errnoval)] ],
> +);
> +
> +#----------------------------------------
> +
> +our %cbs;
> +our %func;
> +our %func_ah;
> +our @outfuncs;
> +our %out_decls;
> +our %out_body;
> +our %msgnum_used;
> +
> +die unless @ARGV==1;
> +die if $ARGV[0] =~ m/^-/;
> +
> +our ($intendedout) = @ARGV;
> +
> +$intendedout =~ m/([a-z]+)\.([ch])$/ or die;
> +my ($want_ah, $ch) = ($1, $2);
> +
> +my $declprefix = '';
> +
> +foreach my $ah (qw(callout helper)) {
> + $out_body{$ah} .= <<END.($ah eq 'callout' ? <<END : <<END);
[...]
> +END
[...]
> +END
[...]
> +END
I think I can infer what this does, but wow ;-)
> +}
[...]
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |