Xen project Mailing List

Re: [Xen-devel] [PATCH v2 08/10] libxl: Kill QEMU by uid when possible

To: Ian Jackson <ian.jackson@xxxxxxxxxx>

From: George Dunlap <george.dunlap@xxxxxxxxxx>

Date: Mon, 17 Dec 2018 18:09:55 +0000

Autocrypt: addr=george.dunlap@xxxxxxxxxx; prefer-encrypt=mutual; keydata= mQINBFPqG+MBEACwPYTQpHepyshcufo0dVmqxDo917iWPslB8lauFxVf4WZtGvQSsKStHJSj 92Qkxp4CH2DwudI8qpVbnWCXsZxodDWac9c3PordLwz5/XL41LevEoM3NWRm5TNgJ3ckPA+J K5OfSK04QtmwSHFP3G/SXDJpGs+oDJgASta2AOl9vPV+t3xG6xyfa2NMGn9wmEvvVMD44Z7R W3RhZPn/NEZ5gaJhIUMgTChGwwWDOX0YPY19vcy5fT4bTIxvoZsLOkLSGoZb/jHIzkAAznug Q7PPeZJ1kXpbW9EHHaUHiCD9C87dMyty0N3TmWfp0VvBCaw32yFtM9jUgB7UVneoZUMUKeHA fgIXhJ7I7JFmw3J0PjGLxCLHf2Q5JOD8jeEXpdxugqF7B/fWYYmyIgwKutiGZeoPhl9c/7RE Bf6f9Qv4AtQoJwtLw6+5pDXsTD5q/GwhPjt7ohF7aQZTMMHhZuS52/izKhDzIufl6uiqUBge 0lqG+/ViLKwCkxHDREuSUTtfjRc9/AoAt2V2HOfgKORSCjFC1eI0+8UMxlfdq2z1AAchinU0 eSkRpX2An3CPEjgGFmu2Je4a/R/Kd6nGU8AFaE8ta0oq5BSFDRYdcKchw4TSxetkG6iUtqOO ZFS7VAdF00eqFJNQpi6IUQryhnrOByw+zSobqlOPUO7XC5fjnwARAQABtCRHZW9yZ2UgVy4g RHVubGFwIDxkdW5sYXBnQHVtaWNoLmVkdT6JAkAEEwEKACoCGwMFCwkIBwMFFQoJCAsFFgID AQACHgECF4ACGQEFAlpk2IEFCQo9I54ACgkQpjY8MQWQtG1A1BAAnc0oX3+M/jyv4j/ESJTO U2JhuWUWV6NFuzU10pUmMqpgQtiVEVU2QbCvTcZS1U/S6bqAUoiWQreDMSSgGH3a3BmRNi8n HKtarJqyK81aERM2HrjYkC1ZlRYG+jS8oWzzQrCQiTwn3eFLJrHjqowTbwahoiMw/nJ+OrZO /VXLfNeaxA5GF6emwgbpshwaUtESQ/MC5hFAFmUBZKAxp9CXG2ZhTP6ROV4fwhpnHaz8z+BT NQz8YwA4gkmFJbDUA9I0Cm9D/EZscrCGMeaVvcyldbMhWS+aH8nbqv6brhgbJEQS22eKCZDD J/ng5ea25QnS0fqu3bMrH39tDqeh7rVnt8Yu/YgOwc3XmgzmAhIDyzSinYEWJ1FkOVpIbGl9 uR6seRsfJmUK84KCScjkBhMKTOixWgNEQ/zTcLUsfTh6KQdLTn083Q5aFxWOIal2hiy9UyqR VQydowXy4Xx58rqvZjuYzdGDdAUlZ+D2O3Jp28ez5SikA/ZaaoGI9S1VWvQsQdzNfD2D+xfL qfd9yv7gko9eTJzv5zFr2MedtRb/nCrMTnvLkwNX4abB5+19JGneeRU4jy7yDYAhUXcI/waS /hHioT9MOjMh+DoLCgeZJYaOcgQdORY/IclLiLq4yFnG+4Ocft8igp79dbYYHkAkmC9te/2x Kq9nEd0Hg288EO+5AQ0EVFpnOgEIAM6XPDYOTqW64Yma5+vV6947NvKfm+GvtATrwuPDX6za L2cOHhXiiM5iP7ehJCZEqgSMaG1kaQZMBsHhDbKp3dKooJrA8ODeyfV8dIfQEQ6olsV+I6+7 vcWriPgkSdawTTt1Vd9EHQAsEOC6oUf1gPiI3YcjB8I9xCRhOtTXT/4dM32i2AG7xIOO/0z0 4RbJuJvEXem1+0ZK6zoAWy/wDp2DjBIr8n2WSl9b74hHpgLy33ZNpWbe1Zul/32ym1fLT1Lm RC8zXnSb00wUt/5dRVc/TlHCw3loRhHZcalx9LGFoRPfj10wH8+ScSh/izHrcBDPA27jqAyK ZiBmSq2ftn0AEQEAAYkDRAQYAQoADwIbAgUCWmTW+QUJB+ujPwEpwF0gBBkBCgAGBQJUWmc6 AAoJELIVx6fHhBvtxesIALSpB4RaYtr2gQA9r7lTrC8bW3+aLbaBk3q7NBcfV9og6gN6Gvs8 8RITq25H+8gJNOdpKt3hQM816o6pUXTth7FYPUsNxAbo+dGoLkMhfVEYTcFpJoyXakUk/zL5 yF7CzXXI/wYMFvFoixNwdkjWJUgL1cuGh56BaLzi9hzwXjOIANV+jBuZu9xXDXWATy2YAsLB N4F5lW15eOHQ4QsfCtzX/iPjK8Q2MhdE75AsiCTjeQHntSmvi0/YwRyzSh2A8z5D6gRM4nTT HMuCROcs+KYLUUhbZs5l1OP5Srp7NFLYsqw2Zb49FG83IDmiMRsD99rGYCMxm0t1JJJ4UrzL hKgJEKY2PDEFkLRtji8P/RTPQdWZmdN29QhJ92ws/IuYmEOrwlAmvQGZWxADe+9VIoQeQaSA e/i8yuC9nbPJhl5DyrbmOv9A3EnAXvxyt1c1jpznWg3m0xuB214G7iN5l5g71tOajy9ZhId8 HKRwnmefRcT153tE0Kfw1ILgpslhUasrGuuICsMUAeNPCgdT3siIXDTD5kY/M0m7sHYdM+Ik DzK4vYhB89lZY4k87SrNEAs2YRu8nub27iRB+mb+qjSRWCVlQ1OWQ8gq2BmSoNch1zF3ukB0 KHIclPZ9EI8JpQ6qVbP6RkNPf7AdtIZrI+5eIjsVNvqhCXfaXxfB4fwHmMcbMT5f3s6CFH3M TVm/j7CpXCt8PQOZIWlDrdRhW9ywFPcKWwfUI37WAbHxJI4tzZAUytHi0TlpcQpPHXbbw10s ME4mbMuOlW/Rt01sc2d5SuZkG2/rw7E4TBq6VA3ZbSztvA6ZW6IZX/oX9dFyhw28gHG7+yRw WSNLkCgnO2rXhPJTNfOAn4bdBcQ8Adb9QbWdtqt0xpe6/NjAWGJMBmvXMiiDAKcyS3o8EXK2 CKtRdNjWisu3q/6KPQup7UxP1fMQ0dN9qGz6Cuw1tBKaTDRLS80c8i0WEHcHDSkEIx63sny1 GhyT0XIEmJfhdw99RvEh5S3CkxYnUpHay6KaHJgNKL5L2+oxzpIWA1S6uQENBFRaur0BCADt onSLWlBKZRHpldkPZgQPGJrYHJHS5mhNLs3Q1i/U6NTy/qnTXu7QVyjn5CiO799n3tJweGnn EZUCTmTFkEUNPii8l3Sch5KvdttbB83MbHXBrO193Ne3qfcwEqvsCGKgHWb6+6TfWt51R2eF u283s7jQwL5+BKTn/6NEbFjcg5U+ihArNQ7sznUag6DjCX2JrcfYTM6gaE3a+lNtPyoJwv3Z llnCQFGV2gBaftzWEQpJO5Pd/VWlKaGOdfQni68pnVXZHuuigolgUFzJILTBrxpOYC0C8uB9 yl76V6A62CoMrMu43jnHMSPKMKIjnbW3zPE0w8lj0WII82/SwKQPABEBAAGJAiUEGAEKAA8C GwwFAlpk1zMFCQfrT/YACgkQpjY8MQWQtG2/tg//YY59ZOVnER5btfVhrh+qtCoJtS0U+z55 0s/dOIoBzRJTAeWu8EY8OZHTcFN7EZtp55h3jiR/JGI9h59UIF+UqkLMrFkx1jhLHhnqF8nc fc2WZLd6ECTPvTVdVYytGzl8KoYkMhFFs+f/ZeOuxUv5OBSeQhzUbpr4S2tJdhxBLuacauOt x0GRw7eGBP/WO+Hlzp2AgeJ62MUA/xklxGb1q8hFq3g6Ghas6tUyrcx4RYEBu8hVBHqcS0VF LWLBKU+kZLNpeCwqht4VQ9FERSIk8rsScd1Qtk2uCx94cULYmiKbl6qtg+M+t4erwsdsMX2X P1kRxm6+DQJQfNZd+UP1B8jKHFbmC49JZRdK8FOAI4imealjUhHbxKS+N3072WMUIQwo0Eym 29/KJruT+JDn9R0+7PpJkCkbYiwZah8ytew+Cv9fNAA8O2t4J5q+UbpnGT9zRkkmQOoz+bza kKTbuIKqzxVjUCkHFvBwYmBYKukqC0EFm0cSQx700WCdprO6AnvO9IIeA9cBRaky3sl4lao3 XRDRjWj/GZQg8OhFPNjfAZ+S1yo0dRlqNlCtwo65B6U7d2GGb64UtjDthGBHFo8ruiwCxf5U us+iynkGfrfQHUFHCC5a8fSMal7+hrwKASyWNY4xgavv5ET61l6aGkJ+xV1hnzKlPjZGPXp8 q5c=

Cc: Anthony Perard <anthony.perard@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Wei Liu <wei.liu2@xxxxxxxxxx>

Delivery-date: Mon, 17 Dec 2018 18:10:24 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Openpgp: preference=signencrypt

On 12/12/18 4:17 PM, Ian Jackson wrote: > George Dunlap writes ("[PATCH v2 08/10] libxl: Kill QEMU by uid when > possible"): >> The privcmd fd that a dm_restrict'ed QEMU has gives it permission to >> one specific domain ID. This domain ID will probably eventually be >> used again. It is therefore necessary to make absolutely sure that a >> rogue QEMU process cannot hang around after its domain has exited. > > Thanks. Detailed comments mostly on error handling follow... > >> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h >> index f9e0bf6578..cd3208f4b8 100644 >> --- a/tools/libxl/libxl_internal.h >> +++ b/tools/libxl/libxl_internal.h >> @@ -1135,7 +1135,7 @@ typedef struct { >> const char *shim_cmdline; >> const char *pv_cmdline; >> >> - char *dm_runas; >> + char *dm_runas, *dm_uid; > > I think `dm_uid' is misnamed. It is only set if the dm has a > *dedicated* uid. If the dm is run as uid 0 or as a shared qemu uid, > it is set to NULL. dm_kill_uid? If not some suggestions would be helpful. I'll add a comment here describing these as well. >> @@ -148,8 +150,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc >> *gc, >> LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s", >> user); >> rc = ERROR_INVAL; >> - } else >> + } else { >> intended_uid = user_base->pw_uid; >> + kill_by_uid = true; >> + } >> >> goto out; >> } > > I think your changes to the out block newly imply that all `goto out' > with `rc=0' must also set kill_by_uid. Yes; I didn't initialize kill_by_uid in the hopes (perhaps overly optimistic) that the compiler would notice if there were paths where it wasn't explicitly set and complain. >> + /* >> + * If we're starting the dm with a non-root UID, save the UID so >> + * that we can reliably kill it and any subprocesses >> + */ >> + if (state->dm_uid) >> + libxl__xs_printf(gc, XBT_NULL, >> + GCSPRINTF("%s/image/device-model-uid", dom_path), > > My comment about the misnamed libxl variable applies to > device-model-uid too I think. Ack >> +#define PROPAGATE_RC if(!ddms->rc) ddms->rc = rc > ^ > > I like this macro. But (i) it needs proper macro hygiene - either a > do{ }while(0) block, or refactoring into an expression (and then > putting in parens). And (ii) you missed out a space. [snip] >> +#undef PROPAGATE_RC > > I am tempted to suggest replacing each call > PROPAGATE_RC; > with > ACCUMULATE_RC(ddms); > and put the definition in libxl_internal.h for use elsewhere. I like that better. What about `ACCUMULATE_RC(ddms->rc)` instead? Then the same macro could be used for a local variable. [snip] >> + /* >> + * See if we should try to kill by uid >> + */ >> + path = GCSPRINTF("/local/domain/%d/image/device-model-uid", domid); >> + rc = libxl__xs_read_checked(gc, XBT_NULL, path, &dm_uid_str); >> + >> + /* >> + * If there was an error here, accumulate the error and fall back >> + * to killing by pid. >> + */ >> + if (rc) { >> + PROPAGATE_RC; >> + LOGD(ERROR, domid, "Reading dm UID path failed for %s", path); >> + } > > From the comment for libxl__xs_read_checked: > | * On error, *result_out is undefined. > Arguably this is a bear trap. Maybe you would like to fix it there > rather than by setting dm_uid_str to 0 here. Saying it's "undefined" is probably a bear trap. But you don't like the way it's actually written -- i.e., that the pointer is only modified if the value is successfully read? >> + reaper_pid = libxl__ev_child_fork(gc, &ddms->destroyer, >> + kill_device_model_uid_cb); >> + if (reaper_pid < 0) { >> + rc = ERROR_FAIL; >> + PROPAGATE_RC; >> + /* >> + * Note that if this fails, we still don't kill by pid, to >> + * make sure that an untrusted DM has not "maliciously" >> + * exited (potentially causing us to kill an unrelated >> + * process which happened to get the same pid). >> + */ >> + goto out; >> + } >> + >> + if (!reaper_pid) { /* child */ >> + rc = kill_device_model_uid_child(ddms, dm_uid_str); >> + _exit(rc); >> + } > > You cannot _exit(rc). See my comments below... Is the 'not' attached to "rc" (i.e., the value must be positive), or '_exit()' (i.e., you must call exit() rather than _exit())? > > Personally I like to put the child process block right after the fork, > especially when (like here) it is very short because the meat has been > lifted elsewhere. Just a suggestion; you can leave it here if you > prefer. OK -- I think this is long enough that I prefer it separate. > >> - /* We should try to destroy the device model anyway. */ >> - rc = kill_device_model(gc, >> - GCSPRINTF("/local/domain/%d/image/device-model-pid", domid)); >> + /* >> + * No uid to kill; attept to kill by pid. >> + */ >> + LOGD(DEBUG, domid, "Didn't find dm UID; destroying by pid"); >> + >> + path = GCSPRINTF("/local/domain/%d/image/device-model-pid", domid); >> + rc = kill_device_model(gc, path); >> + >> + if (rc) { >> + PROPAGATE_RC; >> + LOGD(ERROR, domid, "Killing device model pid from path %s", path); >> + } >> + >> +out: > > I would prefer the apparently-redundant: > > + rc = 0; >> +out: OK -- if I don't set rc initially it won't be redundant. :-) >> + /* >> + * NB that we always return '0' here for the "status of exited >> + * process"; since there is no process, it always "succeeds". > > I had to read this three times to figure out what this meant. Perhaps > I'm being dense but would you mind writing > > + * NB that we always pass '0' here for the "status of exited > ^^^^ Ack >> +static void kill_device_model_uid_cb(libxl__egc *egc, >> + libxl__ev_child *destroyer, >> + pid_t pid, int status) >> +{ >> + libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, *ddms, >> destroyer); >> + STATE_AO_GC(ddms->ao); >> + >> + if (status) { >> + int rc = ERROR_FAIL;; >> + >> + if (WIFEXITED(status)) >> + rc = WEXITSTATUS(status) - 128; > > This can't be right. Where does this 128 come from ? Looks like I was trying to figure out how WEXITSTATUS worked from looking at the child of devices_destroy_cb() put in _exit() and how domain_destroy_domid_cb() interpreted it: Namely, I inferred that _exit(-1) would get you WEXITSTATUS(status) == 127, and extrapolated that -2 would get you 126, and so on. > I don't see a comment anywhere about your encoding of the libxl error > value in the exit status. > > A libxl error code does not necessarily fit in an exit status since an > exit status is just one byte. Your protocol reserves the exit status > 0 for success. The C implementation typically reserves -1 (255) and > sometimes also 127. By convention the exit status is normally > regarded as positive and libxl error codes are negative. > > I suggest one of the following strategies: > > - Give up on the idea of distinguishing these error codes at all and > simply _exit(!!rc). (After all the real error is logged and the > function only ever returns FAIL.) > > - Say that the child function may only return one of a limited > subset of libxl error codes (since only FAIL is currently needed), > and assert that -125 <= rc <= -1, and _exit(-rc). Then the exit > status can be recovered with -WEXITSTATUS(status). For some reason, I really don't want to drop the exit code. If you'd rather I do #1 I will, but given the choice I'd go with #2. Thanks, -George _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.