[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 08/10] libxl: Kill QEMU by uid when possible
On 12/12/18 4:17 PM, Ian Jackson wrote: > George Dunlap writes ("[PATCH v2 08/10] libxl: Kill QEMU by uid when > possible"): >> The privcmd fd that a dm_restrict'ed QEMU has gives it permission to >> one specific domain ID. This domain ID will probably eventually be >> used again. It is therefore necessary to make absolutely sure that a >> rogue QEMU process cannot hang around after its domain has exited. > > Thanks. Detailed comments mostly on error handling follow... > >> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h >> index f9e0bf6578..cd3208f4b8 100644 >> --- a/tools/libxl/libxl_internal.h >> +++ b/tools/libxl/libxl_internal.h >> @@ -1135,7 +1135,7 @@ typedef struct { >> const char *shim_cmdline; >> const char *pv_cmdline; >> >> - char *dm_runas; >> + char *dm_runas, *dm_uid; > > I think `dm_uid' is misnamed. It is only set if the dm has a > *dedicated* uid. If the dm is run as uid 0 or as a shared qemu uid, > it is set to NULL. dm_kill_uid? If not some suggestions would be helpful. I'll add a comment here describing these as well. >> @@ -148,8 +150,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc >> *gc, >> LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s", >> user); >> rc = ERROR_INVAL; >> - } else >> + } else { >> intended_uid = user_base->pw_uid; >> + kill_by_uid = true; >> + } >> >> goto out; >> } > > I think your changes to the out block newly imply that all `goto out' > with `rc=0' must also set kill_by_uid. Yes; I didn't initialize kill_by_uid in the hopes (perhaps overly optimistic) that the compiler would notice if there were paths where it wasn't explicitly set and complain. >> + /* >> + * If we're starting the dm with a non-root UID, save the UID so >> + * that we can reliably kill it and any subprocesses >> + */ >> + if (state->dm_uid) >> + libxl__xs_printf(gc, XBT_NULL, >> + GCSPRINTF("%s/image/device-model-uid", dom_path), > > My comment about the misnamed libxl variable applies to > device-model-uid too I think. Ack >> +#define PROPAGATE_RC if(!ddms->rc) ddms->rc = rc > ^ > > I like this macro. But (i) it needs proper macro hygiene - either a > do{ }while(0) block, or refactoring into an expression (and then > putting in parens). And (ii) you missed out a space. [snip] >> +#undef PROPAGATE_RC > > I am tempted to suggest replacing each call > PROPAGATE_RC; > with > ACCUMULATE_RC(ddms); > and put the definition in libxl_internal.h for use elsewhere. I like that better. What about `ACCUMULATE_RC(ddms->rc)` instead? Then the same macro could be used for a local variable. [snip] >> + /* >> + * See if we should try to kill by uid >> + */ >> + path = GCSPRINTF("/local/domain/%d/image/device-model-uid", domid); >> + rc = libxl__xs_read_checked(gc, XBT_NULL, path, &dm_uid_str); >> + >> + /* >> + * If there was an error here, accumulate the error and fall back >> + * to killing by pid. >> + */ >> + if (rc) { >> + PROPAGATE_RC; >> + LOGD(ERROR, domid, "Reading dm UID path failed for %s", path); >> + } > > From the comment for libxl__xs_read_checked: > | * On error, *result_out is undefined. > Arguably this is a bear trap. Maybe you would like to fix it there > rather than by setting dm_uid_str to 0 here. Saying it's "undefined" is probably a bear trap. But you don't like the way it's actually written -- i.e., that the pointer is only modified if the value is successfully read? >> + reaper_pid = libxl__ev_child_fork(gc, &ddms->destroyer, >> + kill_device_model_uid_cb); >> + if (reaper_pid < 0) { >> + rc = ERROR_FAIL; >> + PROPAGATE_RC; >> + /* >> + * Note that if this fails, we still don't kill by pid, to >> + * make sure that an untrusted DM has not "maliciously" >> + * exited (potentially causing us to kill an unrelated >> + * process which happened to get the same pid). >> + */ >> + goto out; >> + } >> + >> + if (!reaper_pid) { /* child */ >> + rc = kill_device_model_uid_child(ddms, dm_uid_str); >> + _exit(rc); >> + } > > You cannot _exit(rc). See my comments below... Is the 'not' attached to "rc" (i.e., the value must be positive), or '_exit()' (i.e., you must call exit() rather than _exit())? > > Personally I like to put the child process block right after the fork, > especially when (like here) it is very short because the meat has been > lifted elsewhere. Just a suggestion; you can leave it here if you > prefer. OK -- I think this is long enough that I prefer it separate. > >> - /* We should try to destroy the device model anyway. */ >> - rc = kill_device_model(gc, >> - GCSPRINTF("/local/domain/%d/image/device-model-pid", domid)); >> + /* >> + * No uid to kill; attept to kill by pid. >> + */ >> + LOGD(DEBUG, domid, "Didn't find dm UID; destroying by pid"); >> + >> + path = GCSPRINTF("/local/domain/%d/image/device-model-pid", domid); >> + rc = kill_device_model(gc, path); >> + >> + if (rc) { >> + PROPAGATE_RC; >> + LOGD(ERROR, domid, "Killing device model pid from path %s", path); >> + } >> + >> +out: > > I would prefer the apparently-redundant: > > + rc = 0; >> +out: OK -- if I don't set rc initially it won't be redundant. :-) >> + /* >> + * NB that we always return '0' here for the "status of exited >> + * process"; since there is no process, it always "succeeds". > > I had to read this three times to figure out what this meant. Perhaps > I'm being dense but would you mind writing > > + * NB that we always pass '0' here for the "status of exited > ^^^^ Ack >> +static void kill_device_model_uid_cb(libxl__egc *egc, >> + libxl__ev_child *destroyer, >> + pid_t pid, int status) >> +{ >> + libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, *ddms, >> destroyer); >> + STATE_AO_GC(ddms->ao); >> + >> + if (status) { >> + int rc = ERROR_FAIL;; >> + >> + if (WIFEXITED(status)) >> + rc = WEXITSTATUS(status) - 128; > > This can't be right. Where does this 128 come from ? Looks like I was trying to figure out how WEXITSTATUS worked from looking at the child of devices_destroy_cb() put in _exit() and how domain_destroy_domid_cb() interpreted it: Namely, I inferred that _exit(-1) would get you WEXITSTATUS(status) == 127, and extrapolated that -2 would get you 126, and so on. > I don't see a comment anywhere about your encoding of the libxl error > value in the exit status. > > A libxl error code does not necessarily fit in an exit status since an > exit status is just one byte. Your protocol reserves the exit status > 0 for success. The C implementation typically reserves -1 (255) and > sometimes also 127. By convention the exit status is normally > regarded as positive and libxl error codes are negative. > > I suggest one of the following strategies: > > - Give up on the idea of distinguishing these error codes at all and > simply _exit(!!rc). (After all the real error is logged and the > function only ever returns FAIL.) > > - Say that the child function may only return one of a limited > subset of libxl error codes (since only FAIL is currently needed), > and assert that -125 <= rc <= -1, and _exit(-rc). Then the exit > status can be recovered with -WEXITSTATUS(status). For some reason, I really don't want to drop the exit code. If you'd rather I do #1 I will, but given the choice I'd go with #2. Thanks, -George _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |