[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] xen: only clobber multicall elements without error

  • To: Julien Grall <julien.grall@xxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>
  • From: Juergen Gross <jgross@xxxxxxxx>
  • Date: Tue, 27 Nov 2018 08:34:22 +0100
  • Autocrypt: addr=jgross@xxxxxxxx; prefer-encrypt=mutual
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 27 Nov 2018 07:34:48 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 26/11/2018 17:51, Julien Grall wrote:
> On 26/11/2018 16:17, Juergen Gross wrote:
>> On 26/11/2018 17:01, Julien Grall wrote:
>>> On 26/11/2018 15:29, Juergen Gross wrote:
>>>> On 26/11/2018 15:58, Jan Beulich wrote:
>>>>>>>> On 26.11.18 at 15:23, <jgross@xxxxxxxx> wrote:
>>>>>> On 26/11/2018 15:01, Jan Beulich wrote:
>>>>>>>>>> On 26.11.18 at 14:52, <jgross@xxxxxxxx> wrote:
>>>>>>>> I don't think the hypervisor should explicitly try to make it as
>>>>>>>> hard as possible for the guest to find problems in the code.
>>>>>>> That's indeed not the hypervisor's goal. Instead it tries to make
>>>>>>> it as hard as possible for the guest (developer) to make wrong
>>>>>>> assumptions.
>>>>>> Let's look at the current example of why I wrote this patch:
>>>>>> The Linux kernel's use of multicalls should never trigger any
>>>>>> single call to return an error (return value < 0). A kernel
>>>>>> compiled for productive use will catch such errors, but has no
>>>>>> knowledge of which single call has failed, as it doesn't keep
>>>>>> track of the single entries (non-productive kernels have an
>>>>>> option available in the respective source to copy the entries
>>>>>> before doing the multicall, in order to have some diagnostic data
>>>>>> available in case of such an error). Catching an error from a
>>>>>> multicall right now means a WARN() with a stack backtrace (for
>>>>>> the multicall itself, not for the entry causing the error).
>>>>>> I have a customer report for a case where such a backtrace was
>>>>>> produced and a kernel crash occurred some seconds later,
>>>>>> obviously due to illegally unmapped memory pages resulting from
>>>>>> the failed multicall. Unfortunately there are multiple
>>>>>> possibilities for what might have gone wrong and I don't know
>>>>>> which one was the culprit. The problem can't be a very common
>>>>>> one, because there is only one such report right now, which might
>>>>>> depend on a special driver.
>>>>>> Finding this bug without a known reproducer and the current
>>>>>> amount of diagnostic data is next to impossible. So I'd like to
>>>>>> have more data available without having to hurt performance for
>>>>>> the 99.999999% of the cases where nothing bad happens.
>>>>>> In case you have an idea how to solve this problem in another way
>>>>>> I'd be happy to follow that route. I'd really like to have a
>>>>>> better clue in case such an error occurs in the future.
>>>>> Since you have a production kernel, I assume you also have a
>>>>> production hypervisor. This hypervisor doesn't clobber the
>>>>> arguments if I'm not mistaken. Therefore
>>>>> - in the debugging scenario you (can) have all data available by
>>>>>     virtue of the information getting copied in the kernel,
>>>>> - in the release scenario you have all data available since it's
>>>>>     left un-clobbered.
>>>>> Am I missing anything (I don't view mixed debug/release setups
>>>>> of kernel and hypervisor as overly important here)?
>>>> No, you are missing nothing here. OTOH a debug hypervisor destroying
>>>> debug data is kind of weird, so I posted this patch.
>>> This is quite a common approach if you want to force the other
>>> entity not to rely on some fields. This also follows what we do for
>>> hypercalls (at least on Arm).
>> I don't question that general mechanism.
>> What I question is doing the clobbering in this case, where the
>> caller would need to take special measures to copy the needed debug
>> data, which will hurt performance. Not doing the clobbering would
>> only matter in the very rare error case, not in the common case where
>> the guest should really have no need to see the preserved data.
>> I really fail to see why it is so bad to not clobber data in a case
>> which normally should never occur. The only outcome of clobbering the
>> data in the error case is making diagnosis of that error much harder.
>> It's not as if secret data were suddenly made available to the guest.
>> It's just avoiding the need for the guest to copy the data for each
>> multicall for the very unlikely chance that an error might occur. And
>> we are not speaking of a hypercall issued now and then, but of the
>> path hit for nearly every memory-management action and context switch
>> of PV guests. So always doing the copy would really be visible.
> For a start, the behavior will now not be the same as when handling a
> single hypercall. I would much prefer if the behavior were kept the
> same everywhere.

Clobbering the multicall parameters in memory and clobbering the
registers of a hypercall have been different from the beginning.
Register clobbering is very valuable, as it is easy to mess up register
usage on the guest side (I just had this issue in grub2, where a re-used
register value would survive only without diagnostic prints enabled,
due to inlining). I fail to see how multicall arguments could be re-used
other than by explicitly programming it that way, which _is_ wrong, of
course.

> Secondly, you are introducing an ABI change without explicitly telling
> it. This should be clarified in the public interface and probably the
> multicall code to avoid re-introducing the clobbering in the future.

No, I don't think I'm introducing an ABI change. There is no guarantee
that Xen will clobber all multicall arguments; in fact, an NDEBUG
hypervisor never does. The guest must not rely on non-clobbered values,
of course.

> But I am concerned that the conditional clobbering (i.e depending on the
> return) will be overlooked by the guest and will probably introduce some
> more interesting^weird bug.

Maybe I'm looking at it too much from the Linux kernel side.

OTOH I still believe the pros of my patch (better diagnostics in case of
rare, but actually observed, failures) outweigh the cons (theoretical
bugs which are not known today and could show up with NDEBUG
hypervisors).

I can live with my patch being rejected due to the NDEBUG behavior being
not to clobber the data, but I still think the reasons for rejecting the
patch use the wrong priorities (purely theoretical assumptions over
actual problems).


Xen-devel mailing list


