[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.17?] x86/paging: return -EINVAL for paging domctls for dying domains


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Edwin Torok <edvin.torok@xxxxxxxxxx>
  • Date: Wed, 9 Nov 2022 09:45:35 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=uxuk2aMjE1vhe8d9094QAw9rM5DpkWAwyl3vhl/gFVc=; b=ZbnEsfke8gf3xhwFwOeMJ73wgO2P1YOA+EKiRTC7KA2k0m7MG1c2iqmhgxgpjFHcikvdA38rwsMq5emXxgU+4BVqRCgIrJlcBBMqIK1UNdmgaW5o7w/kZ53pR/XX0V7FmcAo/EYbXUf/+lASdFZ6Z2XQbtDoBQ/rko9FyJd82jusU9yQ02PnLL1klElCPKN+n1eQzMpGFYplqSkwlABBJINTtYefftEPJf9EaAG1FXNIs55iRjyvAMZJU+xRcAJRXNpgg7JthX5UYYg2q/MJQFJ0M409k2qyYTOitO3rBfzmSrg/NGzEQ6NR7kHPpNIh8acUOsTWmcfW6NXewhOK8w==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=kcsEP4MXi9bJOk1owLMpefX9LXOHNXh83HQsVxAt3JZDurJ2ISAgrVMFBQdHuJE2FoIj9r05qayW2zMoEBt/ukpxXnBy1AJ/euqgMMU+gl4LEmBEQdF5MtjPcuFx3vsC0PgsT2ekbfNdWdDCK5BGGE2E3AlxchDNbokAtf/HfRCWvUWGzuDpsAtwXbAyX7sUtdd9w4YbcJG24Cy9RbvN0+p9POZ5c4D6310S0G8EcQdi8vWdPPrSiw6kIq6mXQp+bR05h63qKaQslOSstnxb361KZ3ohKb8zq9aEHoHiUJ1/7bIOp20sjs/HXZu2HS/3Vu3tVw+gRHyLgVPz7slnlQ==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, "Henry.Wang@xxxxxxx" <Henry.Wang@xxxxxxx>, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 09 Nov 2022 09:45:51 +0000
  • Ironport-data: A9a23:0qv/KqvqWp1g/Wh/5vkgyOoia+fnVGtfMUV32f8akzHdYApBsoF/q tZmKW7UO6uLNDCjKdp/PN/g/U8BupDTmNJjGlNs+ys9RCsU+JbJXdiXEBz9bniYRiHhoOCLz O1FM4Wdc5pkJpP4jk3wWlQ0hSAkjclkfpKlVKiffHg0HVU/IMsYoUoLs/YjhYJ1isSODQqIu Nfjy+XSI1bg0DNvWo4uw/vrRChH4bKj5lv0gnRkPaoR5QaGxiFPZH4iDfrZw0XQE9E88tGSH 44v/JnhlkvF8hEkDM+Sk7qTWiXmlZaLYGBiIlIPM0STqkAqSh4ai87XB9JFAatjsB2bnsgZ9 Tl4ncfYpTHFnEH7sL91vxFwS0mSNEDdkVPNCSDXXce7lyUqf5ZwqhnH4Y5f0YAwo45K7W9yG fMwcg83fCrZp/mNzrP4Q8lPt8k7dOblM9ZK0p1g5Wmx4fcOZ7nmGvyPzvgBmTA6i4ZJAOrUY NcfZXx3dhPcbhZTO1ARTpUjgOOvgXq5eDpdwL6XjfNvvy6Pk0osjv60b4G9lt+iHK25mm6Vq nzH+SLlBQsdN/SUyCaf82LqjejK9c/+cNJOTODkqKYz6LGV7mstVwUVTGeWncaaulTvUthRL 2Ix0BN7+MDe82TuFLERRSaQonSJoxodUNp4CPAh5UeGza+8yyaUAHIVCAFIbtMOvdUzAzct0 zehgNfBFTFp9rqPRhq15rqS6D+/JyURBWsDfjMfCxsI5cH5p4M+hQ6JScxseJNZlfXwEDD0h juN9S43guxKidZRjvrgu1fanziru57FCBYv4RnaVX6k6QU/Y5O5Y4uv6h7Q6vMowJulc2Rtd UMsw6C2hN3ix7nX/MBRaI3hxI2U2ss=
  • Ironport-hdrordr: A9a23:rLqHV6tB73rNEauzEn0BJNCO7skC1YMji2hC6mlwRA09TyXGra 2TdaUgvyMc1gx7ZJh5o6H6BEGBKUmslqKceeEqTPqftXrdyRGVxeZZnMffKlzbamfDH4tmuZ uIHJIOb+EYYWIasS++2njBLz9C+qjJzEnLv5a5854Fd2gDBM9dBkVCe3+m+yZNNWt77O8CZf 6hD7181l+dkBosDviTNz0gZazuttfLnJXpbVotHBg88jSDijuu9frTDwWY9g12aUIP/Z4StU z+1yDp7KSqtP+2jjXG0XXI0phQkNz9jvNeGc23jNQPIDmEsHfpWG0hYczAgNkGmpDr1L8Yqq iJn/7mBbU115rlRBD2nfIq4Xin7N9h0Q669bbSuwqfnSWwfkNHNyMGv/MWTvKR0TtfgPhslK 1MxG6XrJxREFfJmzn8/cHBU1VwmlOzumdKq59bs5TOObFuF4O5gLZvi3+9Kq1wah7S+cQiCq 1jHcvc7PFZfReTaG3YpHBmxJipUm4oFhmLT0AesojNugIm10xR3g8d3ogSj30A/JUyR91N4P nFKL1hkPVLQtUNZaxwCe8dSY+8C3DLQxjLLGWOSG6XXJ0vKjbIsdr68b817OaldNgBy4Yzgo 3IVBdCuWs7ayvVeLmzNV1wg2XwqUmGLEfQI5tllulEU5XHNcrWGDzGTkwymM29pPhaCtHHWp +ISeBrP8M=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHY82a4odYEEDJCoUOUarTvbCvSoq41M5sAgAAIFoCAAAWrAIAAA0GAgADz+gCAACCXgA==
  • Thread-topic: [PATCH for-4.17?] x86/paging: return -EINVAL for paging domctls for dying domains


> On 9 Nov 2022, at 07:48, Jan Beulich <jbeulich@xxxxxxxx> wrote:
> 
> On 08.11.2022 18:15, Roger Pau Monné wrote:
>> On Tue, Nov 08, 2022 at 06:03:54PM +0100, Jan Beulich wrote:
>>> On 08.11.2022 17:43, Roger Pau Monné wrote:
>>>> On Tue, Nov 08, 2022 at 05:14:40PM +0100, Jan Beulich wrote:
>>>>> On 08.11.2022 12:38, Roger Pau Monne wrote:
>>>>>> Like on the Arm side, return -EINVAL when attempting to do a p2m
>>>>>> operation on dying domains.
>>>>>> 
>>>>>> The current logic returns 0 and leaves the domctl parameter
>>>>>> uninitialized for any parameter fetching operations (like the
>>>>>> GET_ALLOCATION operation), which is not helpful from a toolstack point
>>>>>> of view, because there's no indication that the data hasn't been
>>>>>> fetched.
>>>>> 
>>>>> While I can see how the present behavior is problematic when it comes
>>>>> to consuming supposedly returned data, ...
>>>>> 
>>>>>> --- a/xen/arch/x86/mm/paging.c
>>>>>> +++ b/xen/arch/x86/mm/paging.c
>>>>>> @@ -694,9 +694,10 @@ int paging_domctl(struct domain *d, struct 
>>>>>> xen_domctl_shadow_op *sc,
>>>>>> 
>>>>>>     if ( unlikely(d->is_dying) )
>>>>>>     {
>>>>>> -        gdprintk(XENLOG_INFO, "Ignoring paging op on dying domain %u\n",
>>>>>> +        gdprintk(XENLOG_INFO,
>>>>>> +                 "Tried to do a paging domctl op on dying domain %u\n",
>>>>>>                  d->domain_id);
>>>>>> -        return 0;
>>>>>> +        return -EINVAL;
>>>>>>     }
>>>>> 
>>>>> ... going from "success" to "failure" here has a meaningful risk of
>>>>> regressing callers. It is my understanding that it was deliberate to
>>>>> mimic success in this case (without meaning to assign "good" or "bad"
>>>>> to that decision).
>>>> 
>>>> I would assume that was the original intention, yes, albeit the commit
>>>> message doesn't go into details about why mimicking success is
>>>> required, it's very well possible the code relying on this was xend.
>>> 
>>> Quite possible, but you never know who else has cloned code from there.
>>> 
>>>>> Can you instead fill the data to be returned in
>>>>> some simple enough way? I assume a mere memset() isn't going to be
>>>>> good enough, though (albeit public/domctl.h doesn't explicitly name
>>>>> any input-only fields, so it may not be necessary to preserve
>>>>> anything). Maybe zeroing ->mb and ->stats would do?
>>>> 
>>>> Hm, it still feels kind of wrong.  We do return errors elsewhere for
>>>> operations attempted against dying domains, and that seems all fine,
>>>> not sure why paging operations need to be different in this regard.
>>>> Arm does also return -EINVAL in that case.
>>>> 
>>>> So what about postponing this change to 4.18 in order to avoid
>>>> surprises, but then taking it in its current form at the start of the
>>>> development window, as to have time to detect any issues?
>>> 
>>> Maybe, but to be honest I'm not convinced. Arm can't really be taken
>>> for comparison, since the op is pretty new there iirc.
>> 
>> Indeed, but the tools code paths are likely shared between x86 and
>> Arm, as the hypercalls are the same.
> 
> On x86 we have both xc_shadow_control() and (functional)
> xc_logdirty_control(); on Arm only the former is used, while the latter
> would also be impacted by your change. Plus you're not accounting for
> external tool stacks (like xend would be if anyone had cared to forward
> port it, when - as you said earlier - the suspicion is that the original
> change was made to "please" xend).

I don't see how returning random uninitialised data (current behaviour) or 
wrong data (all zeroes) is better than returning an explicit error code.

Toolstacks (e.g. XAPI) only seemingly "work" with the current behaviour, in 
that it is trying to detect invalid data being returned,
and the chances of the invalid data being >1000 is more likely than it being 
<1000, so "most of the time" the toolstack happens to work.

There was a comment in the code that the bug might be in the binding, but the 
binding itself seems fine (well that one at least, there might be memory 
corruption in other bindings
which is probably what the comment was hinting at...), which is probably why 
the original investigation of this bug stopped and worked around the problem 
toolstack
side instead of digging into the hypervisor to find the real bug, it probably 
trusted the hypervisor being correct too much.

Except when the random data is between 1 and 1000, in which case it accepts 
that value as plausible and things break, 
but not in a way in which you can reproduce (you need to rerun many times until 
the value on the stack happens to be in this range).

And for the latter the toolstack needs to be changed anyway to detect 0 as a 
bad value (i.e. same as an exception), and it already has code
to detect exceptions from these domctls.

Even if such a change was made to "please xend" it is very likely just hiding 
the bad/uninitialized behaviour the same way XAPI does,
it doesn't necessarily mean it worked reliably.

And even this change happens to break some code, it'd do so with a clear error 
code and in a reproducible way, and then such breakage can be easily fixed in 
the affected piece of code
(e.g. a toolstack other than XAPI which was not looking for errors from this 
domctl call)
Using an error code other than EINVAL as you suggest might even make finding 
those problems more obvious.

AFAICT XEN_DOMCTL_SHADOW_OP_GET_ALLOCATION is only used in the OCaml C stubs 
(i.e. used XAPI/xenopsd) which always checks return value
and raises OCaml exception and in python lowlevel libs, which checks return 
value and raises python exception.

> 
>> This is a domctl interface, so we are fine to do such changes.
> 
> We're fine to make changes to domctl which are either binary compatible
> with earlier versions or which are associated with a bump of the
> interface version. The latter wouldn't help in this case, while the
> former is simply not true here. For Andrew's proposed new paging pool
> interface the behavior suggested here would of course be fully
> appropriate, demanding that tool stack either don't issue such requests
> against dying domains or that they be prepared to get back errors.
> 
> Thinking about it again I'm also not convinced EINVAL is an appropriate
> error code to use here. The operation isn't necessarily invalid; we
> only prefer to not carry out any such anymore. EOPNOTSUPP, EPERM, or
> EACCES would all seem more appropriate. Or, for ease of recognition, a
> rarely used one, e.g. ENODATA, EILSEQ, or EROFS.

EROFS on a read-only query (OP_GET_ALLOCATION) would seem wrong.
Similarly EPERM/EACCES when you're Dom0 would make someone trying to dig in the 
wrong place.

Other places in the code also may return EINVAL,ESRCH, EBUSY, but not sure how 
many of those may reach userspace,
ESRCH would seem appropriate.
EOPNOTSUP,ENODATA might be ok too.

> 
> Finally I'm not convinced of the usefulness of this dying check in the
> first place: is_dying may become set immediately after the check was
> done.


Indeed, there is an inherent race condition here (which is why the toolstack 
can't make this check prior to calling the domctl)
There are other places in the code where there is a BUG_ON on !is_dying, are 
they protected by locks,
or do they have a race condition where a dying domain can trigger it?

Best regards,
--Edwin

> 
> Jan
> 
>> I
>> understand that we want to avoid such interface changes as much as
>> possible, but I think we need to fix the hypercall to return error
>> codes rather than implementing workarounds to try to cope with a wrong
>> interface behavior in the first place.  Or else we could be
>> accumulation workarounds here in order to fool caller into thinking
>> the hypercall has somehow succeed, and provide kind of suitable
>> looking data for the output parameters.
> 


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.