[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 08/10] libxl: Kill QEMU by uid when possible



George Dunlap writes ("Re: [PATCH v2 08/10] libxl: Kill QEMU by uid when 
possible"):
> On 12/12/18 4:17 PM, Ian Jackson wrote:
> > I am tempted to suggest replacing each call
> >   PROPAGATE_RC;
> > with
> >   ACCUMULATE_RC(ddms);
> > and put the definition in libxl_internal.h for use elsewhere.
> 
> I like that better.  What about `ACCUMULATE_RC(ddms->rc)` instead?  Then
> the same macro could be used for a local variable.

Yes, SGTM.

> >> +    /*
> >> +     * If there was an error here, accumulate the error and fall back
> >> +     * to killing by pid.
> >> +     */
> >> +    if (rc) {
> >> +        PROPAGATE_RC;
> >> +        LOGD(ERROR, domid, "Reading dm UID path failed for %s", path);
> >> +    }
> > 
> > From the comment for libxl__xs_read_checked:
> >  | * On error, *result_out is undefined.
> > Arguably this is a bear trap.  Maybe you would like to fix it there
> > rather than by setting dm_uid_str to 0 here.
> 
> Saying it's "undefined" is probably a bear trap.  But you don't like the
> way it's actually written -- i.e., that the pointer is only modified if
> the value is successfully read?

I think that's a weirdly complicated interface with strange
implications for normal callers.  I dislike threading the logic
through like this.  By "fix it there" I meant change
libxl__xs_read_checked to always set it to 0 on error, and to promise
to do that.  Sorry not to be explicit.

IOW I don't really think libxl__xs_read_checked ought to expect to be
used in an accumulate-y way, especially since it only leaves you with
the previous value on errors you probably weren't expecting, and not
on ENOENT.

> >> +        if (!reaper_pid) {  /* child */
> >> +            rc = kill_device_model_uid_child(ddms, dm_uid_str);
> >> +            _exit(rc);
> >> +        }
> > 
> > You cannot _exit(rc).  See my comments below...
> 
> Is the 'not' attached to "rc" (i.e., the value must be positive), or
> '_exit()' (i.e., you must call exit() rather than _exit())?

To rc.

> >> +static void kill_device_model_uid_cb(libxl__egc *egc,
> >> +                                    libxl__ev_child *destroyer,
> >> +                                    pid_t pid, int status)
> >> +{
> >> +    libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, 
> >> *ddms, destroyer);
> >> +    STATE_AO_GC(ddms->ao);
> >> +
> >> +    if (status) {
> >> +        int rc = ERROR_FAIL;;
> >> +
> >> +        if (WIFEXITED(status))
> >> +            rc = WEXITSTATUS(status) - 128;
> > 
> > This can't be right.  Where does this 128 come from ?
> 
> Looks like I was trying to figure out how WEXITSTATUS worked from
> looking at the child of devices_destroy_cb() put in _exit() and  how
> domain_destroy_domid_cb() interpreted it: Namely, I inferred that
> _exit(-1) would get you WEXITSTATUS(status) == 127, and extrapolated
> that -2 would get you 126, and so on.

But _exit(-1) won't get you WEXITSTATUS(status) == 127.  Also, there
is a manual.  But it will be easier if I state what happens:

 * Only the bottom byte (2's complement) fo the argument to exit()
   or _exit() is used.  This value is called the `exit status'.
   It is a common idiom to write exit(255) as exit(-1).

 * The resulting wait status (what you get from waitpid) is
   exit_status << 8.

 * If your died due to an uncaught signal, the wait status is the
   signal number.  Plus 128 if the program dumped core.

 * HOWEVER a parent may experience other wait statuses:

 * The dynamic linker or other parts of the runtime may sometimes
   cause your program to exit nonconsensually, perhaps before
   it even ran any of your code.  By convention it will (if it
   can) print a message to stderr, and use the exit status
   255 or 127.

 * If a shell is involved, the rules are weirdly broken.  When a
   process run by a shell terminates, the shell squashes either the
   top or bottom byte of the wait status into $?.  That is, $? is now
   either the exit status, or the termination signal, or the
   termination signal plus 128.  Typically if a shell script exits due
   to failure of a program it ran, it will call exit on $?.  So signal
   numbers can end up showing up as exit statuses.

 * Kernels sometimes send processes SIGKILL due to OOM.  And of course
   there are a lot of other signals that might arise for various
   reasons only very loosely within the program's control.

 * I am shy of exit status 126 due to what I think is a well-founded
   superstition that someone may steal it for something.

 * This stuff with the wait status is the same everywhere that's
   actually a dialect of Unix but macros have been provided
      WIFEXITED      if true, you may use
         WEXITSTATUS
      WIFSIGNALED    if true, you may use
         WTERMSIG
         WCOREDUMP   <- missing on some ancient proprietary unices
   and as I am a careful programmer I usually handle the remaining
   case (by printing something like `unexpected wait status %x') too.
   (There are other macros WIFSTOPPED etc. which I am not discussing
   here, which arise if you ask wait* to tell you when your child
   stops not just when it ends.)

> > I suggest one of the following strategies:
> > 
> >   - Give up on the idea of distinguishing these error codes at all and
> >     simply _exit(!!rc).  (After all the real error is logged and the
> >     function only ever returns FAIL.)
> > 
> >   - Say that the child function may only return one of a limited
> >     subset of libxl error codes (since only FAIL is currently needed),
> >     and assert that -125 <= rc <= -1, and _exit(-rc).  Then the exit
> >     status can be recovered with -WEXITSTATUS(status).
> 
> For some reason, I really don't want to drop the exit code.  If you'd
> rather I do #1 I will, but given the choice I'd go with #2.

OK.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.