[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 08/10] libxl: Kill QEMU by uid when possible


  • To: Ian Jackson <ian.jackson@xxxxxxxxxx>
  • From: George Dunlap <george.dunlap@xxxxxxxxxx>
  • Date: Mon, 17 Dec 2018 18:09:55 +0000
  • Autocrypt: addr=george.dunlap@xxxxxxxxxx; prefer-encrypt=mutual; keydata= mQINBFPqG+MBEACwPYTQpHepyshcufo0dVmqxDo917iWPslB8lauFxVf4WZtGvQSsKStHJSj 92Qkxp4CH2DwudI8qpVbnWCXsZxodDWac9c3PordLwz5/XL41LevEoM3NWRm5TNgJ3ckPA+J K5OfSK04QtmwSHFP3G/SXDJpGs+oDJgASta2AOl9vPV+t3xG6xyfa2NMGn9wmEvvVMD44Z7R W3RhZPn/NEZ5gaJhIUMgTChGwwWDOX0YPY19vcy5fT4bTIxvoZsLOkLSGoZb/jHIzkAAznug Q7PPeZJ1kXpbW9EHHaUHiCD9C87dMyty0N3TmWfp0VvBCaw32yFtM9jUgB7UVneoZUMUKeHA fgIXhJ7I7JFmw3J0PjGLxCLHf2Q5JOD8jeEXpdxugqF7B/fWYYmyIgwKutiGZeoPhl9c/7RE Bf6f9Qv4AtQoJwtLw6+5pDXsTD5q/GwhPjt7ohF7aQZTMMHhZuS52/izKhDzIufl6uiqUBge 0lqG+/ViLKwCkxHDREuSUTtfjRc9/AoAt2V2HOfgKORSCjFC1eI0+8UMxlfdq2z1AAchinU0 eSkRpX2An3CPEjgGFmu2Je4a/R/Kd6nGU8AFaE8ta0oq5BSFDRYdcKchw4TSxetkG6iUtqOO ZFS7VAdF00eqFJNQpi6IUQryhnrOByw+zSobqlOPUO7XC5fjnwARAQABtCRHZW9yZ2UgVy4g RHVubGFwIDxkdW5sYXBnQHVtaWNoLmVkdT6JAkAEEwEKACoCGwMFCwkIBwMFFQoJCAsFFgID AQACHgECF4ACGQEFAlpk2IEFCQo9I54ACgkQpjY8MQWQtG1A1BAAnc0oX3+M/jyv4j/ESJTO U2JhuWUWV6NFuzU10pUmMqpgQtiVEVU2QbCvTcZS1U/S6bqAUoiWQreDMSSgGH3a3BmRNi8n HKtarJqyK81aERM2HrjYkC1ZlRYG+jS8oWzzQrCQiTwn3eFLJrHjqowTbwahoiMw/nJ+OrZO /VXLfNeaxA5GF6emwgbpshwaUtESQ/MC5hFAFmUBZKAxp9CXG2ZhTP6ROV4fwhpnHaz8z+BT NQz8YwA4gkmFJbDUA9I0Cm9D/EZscrCGMeaVvcyldbMhWS+aH8nbqv6brhgbJEQS22eKCZDD J/ng5ea25QnS0fqu3bMrH39tDqeh7rVnt8Yu/YgOwc3XmgzmAhIDyzSinYEWJ1FkOVpIbGl9 uR6seRsfJmUK84KCScjkBhMKTOixWgNEQ/zTcLUsfTh6KQdLTn083Q5aFxWOIal2hiy9UyqR VQydowXy4Xx58rqvZjuYzdGDdAUlZ+D2O3Jp28ez5SikA/ZaaoGI9S1VWvQsQdzNfD2D+xfL qfd9yv7gko9eTJzv5zFr2MedtRb/nCrMTnvLkwNX4abB5+19JGneeRU4jy7yDYAhUXcI/waS /hHioT9MOjMh+DoLCgeZJYaOcgQdORY/IclLiLq4yFnG+4Ocft8igp79dbYYHkAkmC9te/2x Kq9nEd0Hg288EO+5AQ0EVFpnOgEIAM6XPDYOTqW64Yma5+vV6947NvKfm+GvtATrwuPDX6za L2cOHhXiiM5iP7ehJCZEqgSMaG1kaQZMBsHhDbKp3dKooJrA8ODeyfV8dIfQEQ6olsV+I6+7 vcWriPgkSdawTTt1Vd9EHQAsEOC6oUf1gPiI3YcjB8I9xCRhOtTXT/4dM32i2AG7xIOO/0z0 4RbJuJvEXem1+0ZK6zoAWy/wDp2DjBIr8n2WSl9b74hHpgLy33ZNpWbe1Zul/32ym1fLT1Lm RC8zXnSb00wUt/5dRVc/TlHCw3loRhHZcalx9LGFoRPfj10wH8+ScSh/izHrcBDPA27jqAyK ZiBmSq2ftn0AEQEAAYkDRAQYAQoADwIbAgUCWmTW+QUJB+ujPwEpwF0gBBkBCgAGBQJUWmc6 AAoJELIVx6fHhBvtxesIALSpB4RaYtr2gQA9r7lTrC8bW3+aLbaBk3q7NBcfV9og6gN6Gvs8 8RITq25H+8gJNOdpKt3hQM816o6pUXTth7FYPUsNxAbo+dGoLkMhfVEYTcFpJoyXakUk/zL5 yF7CzXXI/wYMFvFoixNwdkjWJUgL1cuGh56BaLzi9hzwXjOIANV+jBuZu9xXDXWATy2YAsLB N4F5lW15eOHQ4QsfCtzX/iPjK8Q2MhdE75AsiCTjeQHntSmvi0/YwRyzSh2A8z5D6gRM4nTT HMuCROcs+KYLUUhbZs5l1OP5Srp7NFLYsqw2Zb49FG83IDmiMRsD99rGYCMxm0t1JJJ4UrzL hKgJEKY2PDEFkLRtji8P/RTPQdWZmdN29QhJ92ws/IuYmEOrwlAmvQGZWxADe+9VIoQeQaSA e/i8yuC9nbPJhl5DyrbmOv9A3EnAXvxyt1c1jpznWg3m0xuB214G7iN5l5g71tOajy9ZhId8 HKRwnmefRcT153tE0Kfw1ILgpslhUasrGuuICsMUAeNPCgdT3siIXDTD5kY/M0m7sHYdM+Ik DzK4vYhB89lZY4k87SrNEAs2YRu8nub27iRB+mb+qjSRWCVlQ1OWQ8gq2BmSoNch1zF3ukB0 KHIclPZ9EI8JpQ6qVbP6RkNPf7AdtIZrI+5eIjsVNvqhCXfaXxfB4fwHmMcbMT5f3s6CFH3M TVm/j7CpXCt8PQOZIWlDrdRhW9ywFPcKWwfUI37WAbHxJI4tzZAUytHi0TlpcQpPHXbbw10s ME4mbMuOlW/Rt01sc2d5SuZkG2/rw7E4TBq6VA3ZbSztvA6ZW6IZX/oX9dFyhw28gHG7+yRw WSNLkCgnO2rXhPJTNfOAn4bdBcQ8Adb9QbWdtqt0xpe6/NjAWGJMBmvXMiiDAKcyS3o8EXK2 CKtRdNjWisu3q/6KPQup7UxP1fMQ0dN9qGz6Cuw1tBKaTDRLS80c8i0WEHcHDSkEIx63sny1 GhyT0XIEmJfhdw99RvEh5S3CkxYnUpHay6KaHJgNKL5L2+oxzpIWA1S6uQENBFRaur0BCADt onSLWlBKZRHpldkPZgQPGJrYHJHS5mhNLs3Q1i/U6NTy/qnTXu7QVyjn5CiO799n3tJweGnn EZUCTmTFkEUNPii8l3Sch5KvdttbB83MbHXBrO193Ne3qfcwEqvsCGKgHWb6+6TfWt51R2eF u283s7jQwL5+BKTn/6NEbFjcg5U+ihArNQ7sznUag6DjCX2JrcfYTM6gaE3a+lNtPyoJwv3Z llnCQFGV2gBaftzWEQpJO5Pd/VWlKaGOdfQni68pnVXZHuuigolgUFzJILTBrxpOYC0C8uB9 yl76V6A62CoMrMu43jnHMSPKMKIjnbW3zPE0w8lj0WII82/SwKQPABEBAAGJAiUEGAEKAA8C GwwFAlpk1zMFCQfrT/YACgkQpjY8MQWQtG2/tg//YY59ZOVnER5btfVhrh+qtCoJtS0U+z55 0s/dOIoBzRJTAeWu8EY8OZHTcFN7EZtp55h3jiR/JGI9h59UIF+UqkLMrFkx1jhLHhnqF8nc fc2WZLd6ECTPvTVdVYytGzl8KoYkMhFFs+f/ZeOuxUv5OBSeQhzUbpr4S2tJdhxBLuacauOt x0GRw7eGBP/WO+Hlzp2AgeJ62MUA/xklxGb1q8hFq3g6Ghas6tUyrcx4RYEBu8hVBHqcS0VF LWLBKU+kZLNpeCwqht4VQ9FERSIk8rsScd1Qtk2uCx94cULYmiKbl6qtg+M+t4erwsdsMX2X P1kRxm6+DQJQfNZd+UP1B8jKHFbmC49JZRdK8FOAI4imealjUhHbxKS+N3072WMUIQwo0Eym 29/KJruT+JDn9R0+7PpJkCkbYiwZah8ytew+Cv9fNAA8O2t4J5q+UbpnGT9zRkkmQOoz+bza kKTbuIKqzxVjUCkHFvBwYmBYKukqC0EFm0cSQx700WCdprO6AnvO9IIeA9cBRaky3sl4lao3 XRDRjWj/GZQg8OhFPNjfAZ+S1yo0dRlqNlCtwo65B6U7d2GGb64UtjDthGBHFo8ruiwCxf5U us+iynkGfrfQHUFHCC5a8fSMal7+hrwKASyWNY4xgavv5ET61l6aGkJ+xV1hnzKlPjZGPXp8 q5c=
  • Cc: Anthony Perard <anthony.perard@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Wei Liu <wei.liu2@xxxxxxxxxx>
  • Delivery-date: Mon, 17 Dec 2018 18:10:24 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 12/12/18 4:17 PM, Ian Jackson wrote:
> George Dunlap writes ("[PATCH v2 08/10] libxl: Kill QEMU by uid when 
> possible"):
>> The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
>> one specific domain ID.  This domain ID will probably eventually be
>> used again.  It is therefore necessary to make absolutely sure that a
>> rogue QEMU process cannot hang around after its domain has exited.
> 
> Thanks.  Detailed comments mostly on error handling follow...
> 
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index f9e0bf6578..cd3208f4b8 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
>> @@ -1135,7 +1135,7 @@ typedef struct {
>>      const char *shim_cmdline;
>>      const char *pv_cmdline;
>>  
>> -    char *dm_runas;
>> +    char *dm_runas, *dm_uid;
> 
> I think `dm_uid' is misnamed.  It is only set if the dm has a
> *dedicated* uid.  If the dm is run as uid 0 or as a shared qemu uid,
> it is set to NULL.
dm_kill_uid?  If not some suggestions would be helpful.

I'll add a comment here describing these as well.

>> @@ -148,8 +150,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
>> *gc,
>>              LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
>>                   user);
>>              rc = ERROR_INVAL;
>> -        } else
>> +        } else {
>>              intended_uid = user_base->pw_uid;
>> +            kill_by_uid = true;
>> +        }
>>  
>>          goto out;
>>      }
> 
> I think your changes to the out block newly imply that all `goto out'
> with `rc=0' must also set kill_by_uid.

Yes; I didn't initialize kill_by_uid in the hopes (perhaps overly
optimistic) that the compiler would notice if there were paths where it
wasn't explicitly set and complain.

>> +    /*
>> +     * If we're starting the dm with a non-root UID, save the UID so
>> +     * that we can reliably kill it and any subprocesses
>> +     */
>> +    if (state->dm_uid)
>> +        libxl__xs_printf(gc, XBT_NULL,
>> +                         GCSPRINTF("%s/image/device-model-uid", dom_path),
> 
> My comment about the misnamed libxl variable applies to
> device-model-uid too I think.

Ack

>> +#define PROPAGATE_RC if(!ddms->rc) ddms->rc = rc
>                           ^
> 
> I like this macro.  But (i) it needs proper macro hygiene - either a
> do{ }while(0) block, or refactoring into an expression (and then
> putting in parens).  And (ii) you missed out a space.
[snip]
>> +#undef PROPAGATE_RC
>
> I am tempted to suggest replacing each call
>   PROPAGATE_RC;
> with
>   ACCUMULATE_RC(ddms);
> and put the definition in libxl_internal.h for use elsewhere.

I like that better.  What about `ACCUMULATE_RC(ddms->rc)` instead?  Then
the same macro could be used for a local variable.

[snip]

>> +    /*
>> +     * See if we should try to kill by uid
>> +     */
>> +    path = GCSPRINTF("/local/domain/%d/image/device-model-uid", domid);
>> +    rc = libxl__xs_read_checked(gc, XBT_NULL, path, &dm_uid_str);
>> +
>> +    /*
>> +     * If there was an error here, accumulate the error and fall back
>> +     * to killing by pid.
>> +     */
>> +    if (rc) {
>> +        PROPAGATE_RC;
>> +        LOGD(ERROR, domid, "Reading dm UID path failed for %s", path);
>> +    }
> 
> From the comment for libxl__xs_read_checked:
>  | * On error, *result_out is undefined.
> Arguably this is a bear trap.  Maybe you would like to fix it there
> rather than by setting dm_uid_str to 0 here.

Saying it's "undefined" is probably a bear trap.  But you don't like the
way it's actually written -- i.e., that the pointer is only modified if
the value is successfully read?

>> +        reaper_pid = libxl__ev_child_fork(gc, &ddms->destroyer,
>> +                                          kill_device_model_uid_cb);
>> +        if (reaper_pid < 0) {
>> +            rc = ERROR_FAIL;
>> +            PROPAGATE_RC;
>> +            /*
>> +             * Note that if this fails, we still don't kill by pid, to
>> +             * make sure that an untrusted DM has not "maliciously"
>> +             * exited (potentially causing us to kill an unrelated
>> +             * process which happened to get the same pid).
>> +             */
>> +            goto out;
>> +        }
>> +
>> +        if (!reaper_pid) {  /* child */
>> +            rc = kill_device_model_uid_child(ddms, dm_uid_str);
>> +            _exit(rc);
>> +        }
> 
> You cannot _exit(rc).  See my comments below...

Is the 'not' attached to "rc" (i.e., the value must be positive), or
'_exit()' (i.e., you must call exit() rather than _exit())?

> 
> Personally I like to put the child process block right after the fork,
> especially when (like here) it is very short because the meat has been
> lifted elsewhere.  Just a suggestion; you can leave it here if you
> prefer.

OK -- I think this is long enough that I prefer it separate.

> 
>> -    /* We should try to destroy the device model anyway. */
>> -    rc = kill_device_model(gc,
>> -              GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
>> +    /*
>> +     * No uid to kill; attept to kill by pid.
>> +     */
>> +    LOGD(DEBUG, domid, "Didn't find dm UID; destroying by pid");
>> +
>> +    path = GCSPRINTF("/local/domain/%d/image/device-model-pid", domid);
>> +    rc = kill_device_model(gc, path);
>> +
>> +    if (rc) {
>> +        PROPAGATE_RC;
>> +        LOGD(ERROR, domid, "Killing device model pid from path %s", path);
>> +    }
>> +
>> +out:
> 
> I would prefer the apparently-redundant:
> 
>   +    rc = 0;
>> +out:

OK -- if I don't set rc initially it won't be redundant. :-)


>> +    /*
>> +     * NB that we always return '0' here for the "status of exited
>> +     * process"; since there is no process, it always "succeeds".
> 
> I had to read this three times to figure out what this meant.  Perhaps
> I'm being dense but would you mind writing
> 
>   +     * NB that we always pass '0' here for the "status of exited
>                             ^^^^

Ack

>> +static void kill_device_model_uid_cb(libxl__egc *egc,
>> +                                    libxl__ev_child *destroyer,
>> +                                    pid_t pid, int status)
>> +{
>> +    libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, *ddms, 
>> destroyer);
>> +    STATE_AO_GC(ddms->ao);
>> +
>> +    if (status) {
>> +        int rc = ERROR_FAIL;;
>> +
>> +        if (WIFEXITED(status))
>> +            rc = WEXITSTATUS(status) - 128;
> 
> This can't be right.  Where does this 128 come from ?

Looks like I was trying to figure out how WEXITSTATUS worked from
looking at the child of devices_destroy_cb() put in _exit() and  how
domain_destroy_domid_cb() interpreted it: Namely, I inferred that
_exit(-1) would get you WEXITSTATUS(status) == 127, and extrapolated
that -2 would get you 126, and so on.

> I don't see a comment anywhere about your encoding of the libxl error
> value in the exit status.
> 
> A libxl error code does not necessarily fit in an exit status since an
> exit status is just one byte.  Your protocol reserves the exit status
> 0 for success.  The C implementation typically reserves -1 (255) and
> sometimes also 127.  By convention the exit status is normally
> regarded as positive and libxl error codes are negative.
> 
> I suggest one of the following strategies:
> 
>   - Give up on the idea of distinguishing these error codes at all and
>     simply _exit(!!rc).  (After all the real error is logged and the
>     function only ever returns FAIL.)
> 
>   - Say that the child function may only return one of a limited
>     subset of libxl error codes (since only FAIL is currently needed),
>     and assert that -125 <= rc <= -1, and _exit(-rc).  Then the exit
>     status can be recovered with -WEXITSTATUS(status).

For some reason, I really don't want to drop the exit code.  If you'd
rather I do #1 I will, but given the choice I'd go with #2.

Thanks,
 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.