|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 05/19] libxl: domain save/restore: run in a separate process
Ian Campbell writes ("Re: [Xen-devel] [PATCH 05/19] libxl: domain save/restore:
run in a separate process"):
> On Fri, 2012-06-08 at 18:34 +0100, Ian Jackson wrote:
> > libxenctrl expects to be able to simply run the save or restore
> > operation synchronously. This won't work well in a process which is
> > trying to handle multiple domains.
...
> > diff --git a/.hgignore b/.hgignore
> > index 27d8f79..05304ea 100644
> > ^tools/libxl/tmp\..*$
> > +^tools/libxl/.*\.new$
>
> Our naming scheme for these sorts of temporary files is rather in
> consistent (including an existing use of .new), oh well...
Hmm, yes. I took my cue from libxl_list.h, which is immediately before
_libxl_save_msgs_helper.[ch] in the Makefile.
> > +libxl-save-helper: $(SAVE_HELPER_OBJS) libxenlight.so
> > + $(CC) $(LDFLAGS) -o $@ $(SAVE_HELPER_OBJS) $(LDLIBS_libxenctrl)
> > $(LDLIBS_libxenguest) $(APPEND_LDFLAGS)
>
> The changelog says that libxl-save-helper doesn't link libxenlight but
> it is declared as a dependency here and CFLAGS_libxenlight is included
> in SAVE_HELPER_OBJS' CFLAGS above.
This is a mistake in the Makefile, which I have fixed.
> > testidl: testidl.o libxlutil.so libxenlight.so
> > $(CC) $(LDFLAGS) -o $@ testidl.o libxlutil.so $(LDLIBS_libxenlight)
> > $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
> [...]
> > @@ -170,6 +184,7 @@ install: all
> > $(INSTALL_DIR) $(DESTDIR)$(BASH_COMPLETION_DIR)
> > $(INSTALL_DIR) $(DESTDIR)$(XEN_RUN_DIR)
> > $(INSTALL_PROG) xl $(DESTDIR)$(SBINDIR)
> > + $(INSTALL_PROG) libxl-save-helper $(DESTDIR)$(PRIVATE_BINDIR)
>
> This needs an INSTALL_DIR for $(DESTDIR)$(PRIVATE_BINDIR) to support
> "make -C tools/libxl DESTDIR=xxx install" else:
> install: cannot create regular file `/tmp/tmplKa0Y7/usr/lib/xen/bin':
> No such file or directory
OK. I guess I never do that and it's a bit of a mystery why one
would without having done a more general install before, but it's
clearly a correct change.
> > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> > index b48452c..fea0c94 100644
...
> > int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
> > - uint32_t size, void *data)
...
> > + libxl_ctx *ctx = CTX;
>
> Any reason you didn't s/ctx/CTX/ in one of your preparatory patches?
I didn't want to do that unless it was necessary. There was already
enough else going on. I don't think having a ctx rather than using
CTX is wrong, although it is slightly less in the modern idiom.
> > void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state
> > *dss,
> > unsigned long vm_generationid_addr)
> > {
> > STATE_AO_GC(dss->ao);
> > - int r;
> > + int r, rc;
> > + uint32_t toolstack_data_len;
> > +
> > + /* Resources we need to free */
> > + uint8_t *toolstack_data_buf = 0;
> > +
> > + unsigned cbflags = libxl__srm_callout_enumcallbacks_save
> > + (&dss->shs.callbacks.save.a);
> > +
> > + r = libxl__toolstack_save(dss->domid, &toolstack_data_buf,
> > + &toolstack_data_len, dss);
>
> This seems to be called twice? I'm looking at the source after the full
> series is applied and I see
> dss->shs.callbacks.save.toolstack_save = libxl__toolstack_save;
> in libxl__domain_suspend as well as this call here.
But in fact dss->shs.callbacks.save.toolstack_save is never used
anywhere in that version. The bug is that libxl__xc_domain_save
should call the callback function (if it is non-null), not call
libxl__toolstack_save directly.
> The callback one seems to be otherwise unused.
Exactly.
I have fixed this:
* libxl__xc_domain_save uses supplied callback function pointer,
rather than calling libxl__toolstack_save directly;
toolstack data save callback is only supplied to libxc if
in-libxl caller supplied a callback.
> > +static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> > + const char *mode_arg, int data_fd,
> > + const unsigned long *argnums, int num_argnums)
> > +{
...
> > + libxl__carefd_begin();
> > + for (i=0; i<2; i++) {
> > + int fds[2];
> > + if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
> > + childs_pipes[i] = libxl__carefd_record(CTX, fds[i]);
> > + shs->pipes[i] = libxl__carefd_record(CTX, fds[!i]);
>
> Using !i here is clever, but I had to deploy a pen to be sure the
> assignments were all correct.
>
> Open coding this simple loop would also give you the opportunity the
> have comments like "/* This is the helper processes's stdout */" etc in
> the appropriate place to aid in comprehension when checking the correct
> end of each pipe goes in the right place.
Perhaps the right answer is to use a better variable name than `i'.
I wrote it out longhand and it looked like this:
libxl__carefd_begin();
int fds[2];
/* child's stdin */
if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
childs_pipes[0] = libxl__carefd_record(CTX, fds[0]);
shs->pipes[0] = libxl__carefd_record(CTX, fds[1]);
/* child's stdout */
if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
childs_pipes[1] = libxl__carefd_record(CTX, fds[1]);
shs->pipes[1] = libxl__carefd_record(CTX, fds[0]);
libxl__carefd_unlock();
which is a great way to obscure the subtle differences between the two
stanzas. How about:
libxl__carefd_begin();
int childfd;
for (childfd=0; childfd<2; childfd++) {
/* Setting up the pipe for the child's fd childfd */
int fds[2];
if (libxl_pipe(CTX,fds)) { rc = ERROR_FAIL; goto out; }
int childs_end = childfd==0 ? 0 /*read*/ : 1 /*write*/;
int our_end = childfd==0 ? 1 /*write*/ : 0 /*read*/;
childs_pipes[childfd] = libxl__carefd_record(CTX, fds[childs_end]);
shs->pipes[childfd] = libxl__carefd_record(CTX, fds[our_end]);
}
libxl__carefd_unlock();
?
> > +static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
> > + int fd, short events, short revents)
> > +{
> > + libxl__save_helper_state *shs = CONTAINER_OF(ev, *shs, readable);
> > + STATE_AO_GC(shs->ao);
> > + int rc, errnoval;
> > +
> > + if (revents & POLLPRI) {
> > + LOG(ERROR, "%s signaled POLLERR", shs->stdout_what);
>
> signalled ?
>
> (it's not impossible that my spell checker is wrong, google seem to
> think both are ok)
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/kill.html
uses "signaled".
> The if says POLLPRI but the log says POLLERR?
This should check and log both.
> > + rc = ERROR_FAIL;
> > + out:
> > + /* this is here because otherwise we bypass the decl of msg[] */
>
> I don't get this comment, why can't "out:" be in the normal place after
> the non-error return?
Because it is not legal to "goto" within the scope of a variable whose
size is not a constant. The alternative would be to introduce a block
scope starting before `unsigned char msg[msglen]' and ending after
`return'.
Arguably msg[msglen] is asking too much (up to 64K) of the caller's
stack. Should I change it ?
> > + if (!shs->completed) {
> > + if (!shs->rc)
> > + LOG(ERROR,"%s exited without signaling completion",what);
>
> signalling again, same caveat
I'll go with what SuS says.
> > --- /dev/null
> > +++ b/tools/libxl/libxl_save_helper.c
> > @@ -0,0 +1,279 @@
> > +/*
> > + * Copyright (C) 2012 Citrix Ltd.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU Lesser General Public License as published
> > + * by the Free Software Foundation; version 2.1 only. with the special
> > + * exception on linking described in file LICENSE.
>
> LGPL, or perhaps GPL since this is a helper?
>
> (I suppose the same argument could be made for xl itself)
I can see no reason to deviate from the licence of the rest of libxl.
If nothing else, code may move between the helper and libxl proper.
> > +int helper_getreply(void *user)
> > +{
> > + int v;
> > + int r = read_exactly(0, &v, sizeof(v));
> > + if (r<=0) exit(-2);
>
> You use -1 a lot but here you use -2, are we supposed to be able to
> infer something from the specific error code?
Not really. I used -1 for general system call failures. This
particular situation might include a protocol error by the parent,
though, so I thought I'd distinguish it. That way if you get a
message saying the helper exited with a nonzero exit status foobar you
get slightly more information.
I could make this more formal and be more consistent with these exit
statuses, or alternatively I could abolish it. I don't think it's
critical - nothing turns on this exit status.
> > diff --git a/tools/libxl/libxl_save_msgs_gen.pl
> > b/tools/libxl/libxl_save_msgs_gen.pl
> > new file mode 100755
> > index 0000000..cd0837e
> > --- /dev/null
> > +++ b/tools/libxl/libxl_save_msgs_gen.pl
> > @@ -0,0 +1,393 @@
> > +#!/usr/bin/perl -w
> > +
> > +use warnings;
> > +use strict;
> > +use POSIX;
> > +
> > +our $debug = 0; # produce copius debugging output at run-time?
>
> copious
>
> (which is probably the most useful feedback you are going to get from me
> on a Perl script ;-))
Fixed.
> > +foreach my $ah (qw(callout helper)) {
> > + $out_body{$ah} .= <<END.($ah eq 'callout' ? <<END : <<END);
> [...]
> > +END
> [...]
> > +END
> [...]
> > +END
>
> I think I can infer what this does, but wow ;-)
Following a suggestion from Mark Wooding I have changed this to:
<<END_BOTH.($ah eq 'callout' ? <<END_CALLOUT : <<END_HELPER);
#include "libxl_osdeps.h"
#include <assert.h>
#include <string.h>
#include <stdint.h>
#include <limits.h>
END_BOTH
#include "libxl_internal.h"
END_CALLOUT
#include "_libxl_save_msgs_${ah}.h"
#include <xenctrl.h>
#include <xenguest.h>
END_HELPER
which I think is much clearer.
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |