
Re: Revert of the 4.17 hypercall handler changes Re: [PATCH-for-4.17] xen: fix generated code for calling hypercall handlers


  • To: George Dunlap <George.Dunlap@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 10 Nov 2022 09:09:17 +0100
  • Cc: Juergen Gross <jgross@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Henry Wang <Henry.Wang@xxxxxxx>, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Delivery-date: Thu, 10 Nov 2022 08:09:30 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 09.11.2022 21:16, George Dunlap wrote:
>> On 4 Nov 2022, at 05:01, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx> wrote:
>> On 03/11/2022 16:36, Juergen Gross wrote:
>>> The code generated for the call_handlers_*() macros needs to avoid
>>> undefined behavior when multiple handlers share the same priority.
>>> The issue is the hypercall number being fed unverified into the macros
>>> and then used to set a mask via "mask = 1ULL << <hypercall-number>".
>>>
>>> Avoid a shift amount of more than 63 by setting mask to zero in case
>>> the hypercall number is too large.
>>>
>>> Fixes: eca1f00d0227 ("xen: generate hypercall interface related code")
>>> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
>>
>> This is not a suitable fix.  There being a security issue is just the
>> tip of the iceberg.
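
For readers without the generated code to hand, the construct being fixed
is roughly of the following shape (a minimal sketch with made-up names,
not the actual generated identifiers):

    #include <stdint.h>

    /* Illustrative sketch only - not the generated code itself. */
    static uint64_t hypercall_mask(unsigned int num)
    {
        /*
         * "num" arrives unverified from the guest; for num >= 64 the
         * shift would be undefined behaviour, hence the range check
         * which the patch above effectively adds.
         */
        return num < 64 ? 1ULL << num : 0;
    }
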
> 
> At the x86 Maintainer’s meeting on Monday, we (Andrew, Jan, and I) talked 
> about this patch.  Here is my summary of the conversation (with the caveat 
> that I may get some of the details wrong).

Just a couple of remarks, mainly to extend context:

> The proposed benefits of the series are:
> 
> 1. By removing indirect calls, it removes those as a “speculative attack 
> surface”.
> 
> 2. By removing indirect calls, it provides some performance benefit, since 
> indirect calls  require an extra memory fetch.
> 
> 3. It avoids casting function pointers to function pointers of a different 
> type.  Our current practice is *technically* UB, and is incompatible with 
> some hardware safety mechanisms which we may want to take advantage of at 
> some point in the future; the series addresses both.
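
As an aside, the UB referred to in point 3 is the familiar pattern of
calling a handler through a function pointer of a different type; a
minimal, purely hypothetical illustration (not Xen's actual tables or
names):

    /* Hypothetical illustration - names and types are made up. */
    typedef long hypercall_fn_t(unsigned long a1, unsigned long a2);

    static long do_two_args(unsigned long a, unsigned long b) { return (long)(a + b); }
    static long do_one_arg(unsigned long a) { return (long)a; }

    static hypercall_fn_t *const table[] = {
        do_two_args,                   /* matching type: fine */
        (hypercall_fn_t *)do_one_arg,  /* the cast itself is legal, but a
                                        * call through it, e.g.
                                        * table[1](a1, a2), is undefined
                                        * behaviour per the C standard,
                                        * even if it works on the ABIs
                                        * Xen targets. */
    };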
> 
> There were two incidental technical problems pointed out:
> 
> 1. A potential shift of more than 63 bits, which is UB; this has been fixed.
> 
> 2. The prototype for the kexec_op call was changed from unsigned long to 
> unsigned int; this is an ABI change which will cause differing behavior.  Jan 
> will be looking at how he can fix this, now that it’s been noted.

Patch was already sent and is now fully acked. Will go in later this morning.
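
For the record, the behavioural difference is of the following kind
(heavily simplified, with a hypothetical handler and command value - the
real ones live in Xen's kexec code):

    #include <errno.h>

    #define CMD_EXAMPLE 0   /* hypothetical sub-op value */

    long handler_old(unsigned long op)   /* old prototype: 64-bit op */
    {
        return op == CMD_EXAMPLE ? 0 : -ENOSYS;
    }

    long handler_new(unsigned int op)    /* new prototype: op truncated */
    {
        /*
         * A guest passing e.g. 0x100000000UL got -ENOSYS from the old
         * prototype, but the value is silently truncated to 0 here and
         * thus now hits a valid sub-op - an observable ABI change.
         */
        return op == CMD_EXAMPLE ? 0 : -ENOSYS;
    }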

> But the more fundamental costs include:
> 
> 1. The code is much more difficult now to reason about
> 
> 2. The code is much larger
> 
> 3. The long if/else chain could theoretically help hypercalls at the top of 
> the chain, but would definitely begin to hurt hypercalls at the bottom of the 
> chain; and the more hypercalls we add, the more of a theoretical performance 
> penalty this will have

After Andrew's remark on how branch history works I've looked at Intel's
ORM (Optimization Reference Manual), finding that they actually recommend a
hybrid approach: frequently used numbers dealt with separately, infrequently
used ones dealt with by a common indirect call.
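
In code terms such a hybrid dispatcher would look roughly like this (a
sketch only, with invented names and numbers, assuming a handler table is
kept around for the infrequent cases):

    #include <errno.h>

    typedef long hypercall_fn_t(unsigned long a1, unsigned long a2);

    extern hypercall_fn_t *const hypercall_table[64];
    extern long do_frequent_a(unsigned long a1, unsigned long a2);
    extern long do_frequent_b(unsigned long a1, unsigned long a2);

    static long dispatch(unsigned int num, unsigned long a1, unsigned long a2)
    {
        switch ( num )
        {
        /* Hot hypercalls get direct calls, needing no target prediction... */
        case 0: return do_frequent_a(a1, a2);
        case 1: return do_frequent_b(a1, a2);

        default:
            /* ... everything else shares a single indirect call. */
            if ( num < 64 && hypercall_table[num] )
                return hypercall_table[num](a1, a2);
            return -ENOSYS;
        }
    }
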

> 4. By using 64-bit masks, the implementation limits the number of hypercalls 
> to 64; a number we are likely to exceed if we implement ABIv2 to be 
> compatible with AMD SEV.

This very much depends on how we encode the new hypercall numbers. In my
proposal a single bit is used, and handlers remain the same. Therefore in
that model there wouldn't really be an extension of hypercall numbers to
cover here.
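
Purely as an illustration of the "single bit" idea (the actual encoding in
the proposal may of course differ):

    /* Hypothetical encoding - not the proposal's actual constants. */
    #define HC_ABI_V2_BIT  (1u << 31)               /* marker for new ABI */
    #define HC_NUM(n)      ((n) & ~HC_ABI_V2_BIT)   /* plain hypercall number */

    /*
     * A new-ABI invocation of hypercall <n> would be issued as
     * (n | HC_ABI_V2_BIT); the entry path notes the flag and strips it,
     * so the set of handler numbers - and with it a 64-bit mask - need
     * not grow.
     */
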

> Additionally, there is a question about whether some of the alleged benefits 
> actually help:
> 
> 1. On AMD processors, we enable IBRS, which completely removes indirect calls 
> as a speculative attack surface already.  And on Intel processors, this 
> attack surface has already been significantly reduced.  So removing indirect 
> calls is not as important an issue.
> 
> 2. Normal branches are *also* a surface of speculative attacks; so even apart 
> from the above, all this series does is change one potential attack surface 
> for another one.
> 
> 3. When we analyze theoretical performance with deep CPU pipelines and 
> speculation in mind, the theoretical disadvantage of indirect branches goes 
> away; and depending on the hardware, there is a theoretical performance 
> degradation.

I'm inclined to change this to "may go away". As Andrew said on the call, an
important criterion for the performance of indirect calls is how long it takes
to recognize a misprediction, and hence how much work needs to be thrown away
and re-done. Which in turn means the impact here is more significant when the
rate of mispredictions is higher.

> 4. From a practical perspective, the performance tests are very much 
> insufficient to show either that this is an improvement, or that it does not 
> cause a performance regression.  To show that there hasn’t been a performance 
> degradation, a battery of tests needs to be done on hardware from a variety 
> of different vendors and cpu generations, since each of them will have 
> different properties after all speculative mitigations have been applied.
> 
> So the argument is as follows:
> 
> There is no speculative benefit for the series; there is insufficient 
> performance evidence, either to justify a performance benefit or to allay 
> doubts about a performance regression; and the benefit that there is, is 
> insufficient to counterbalance the costs, and so the series should be 
> reverted.
> 
> At the end of the discussion, Jan and I agreed that Andrew had made a good 
> case for the series to be removed at some point.  The discussion needs to be 
> concluded on the list, naturally; and if there is a consensus to remove the 
> series, the next question would be whether we should revert it now, before 
> 4.17.0, or wait until after the release and revert it then (perhaps with a 
> backport to 4.17.1).

As per above a 3rd option may want considering: Only partially going back to
the original model of using indirect calls (e.g. for all hypercalls which
aren't explicitly assigned a priority).

It's not just this which leaves me thinking that we shouldn't revert now,
but instead take our time to decide what is going to be best long term. If
we then still decide on a full revert, doing so for 4.17.x (x > 0) remains
an option.

Jan



 

