Xen project Mailing List

RE: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"

To: Jan Beulich <jbeulich@xxxxxxxx>, "Cooper, Andrew" <andrew.cooper3@xxxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>

From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>

Date: Fri, 23 Oct 2020 06:14:18 +0000

Accept-language: en-US

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=N+4hRcPBg99tlqBnoQ1lN1cY0rIsJir5Oqdog60o+EE=; b=YWJ9rpaQvdTyzBH0/jp8QxRU2EQO4RrYL1qVg8NUWKynWvoQwEjFqQmWz37dg574Jy78VToz+HvTFv1awHgmKix8f7F6SyQd008cuMydmdQiLjOscbG3eD6JlJqiUqmmnqCcVcQiKXClXPG1PnLnp96l4U65WkoSVD8nImunCp7LJNSjwBCMv6xYsUeafS1gX78xqrbZQskG8q9xqoImMgaLWzXx9ntCJRMi2ZDFRsXzyhv936kWyoCFFDhc3MoauYDUlkAxdwZw0BuEOmE2qPQHaaqPW3yfNjArvxQWXJjoiJgTrTzFvhLA1PywmRzTfZ2BKAb4JQLQikDiY/AwSQ==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=WLVWEOAJ86jajKGH0M9WMFacu1xI5OGoviQYsb0Jz+KWmWlJ/kGzoCzWr3q0RMRgaMXGqJ/8aSddW+JjCAidK1Yj/z/sPS2PEFEYVqTvZXQZweqgLsXvCFkOLoAfWTzMa7u5h0PAMzcMnWXtQHddVzYNFJrjK6hDC0Wahx85+ZmKndv/RLFdvWVJ2bKCpbHMCrrvP+PqN5z1rAZMO6mzOjPd5VKN6naUShezf9YWS9U8mY1mbLYC7IEbIpwevXBzbws5tJtSjShF1vNxcy2gYkj82Mpz4mj0hxPkhU7gmlgC/+Ovm2imQ/65ltIoiIBfRjY3+102yD9rZJsAv452Vg==

Authentication-results: suse.com; dkim=none (message not signed) header.d=none;suse.com; dmarc=none action=none header.from=intel.com;

Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>

Delivery-date: Fri, 23 Oct 2020 06:15:04 +0000

Dlp-product: dlpe-windows

Dlp-reaction: no-action

Dlp-version: 11.5.1.3

Ironport-sdr: L/bTAiSO9kjSHnSN2uvOTj7xOAodvbqofjbh50JrVDNJzmySe0sG9FfOI8P1SKOjaXkD6OVF7a nTR9wqHXZwlg==

Ironport-sdr: fNoKPtKDIoEM1Ovyrm8WnfHIz35JN5/rAHF/wTQPGuonrvcAsJGl5QpUrbuR625UyUTUqoG1Zz kswpY+B4iXDA==

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-index: AQHWnk5UOwH5YPFK10i1vYA9BQPqg6mVtt4AgAFwXACAAS7uAIACEe2AgARKRQCAAHZigIABC12AgASUbqA=

Thread-topic: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"

> From: Jan Beulich <jbeulich@xxxxxxxx> > Sent: Tuesday, October 20, 2020 4:10 PM > > On 19.10.2020 18:12, Andrew Cooper wrote: > > On 19/10/2020 10:09, Jan Beulich wrote: > >> On 16.10.2020 17:38, Andrew Cooper wrote: > >>> On 15/10/2020 09:01, Jan Beulich wrote: > >>>> On 14.10.2020 15:57, Andrew Cooper wrote: > >>>>> Running with corrupt state is every bit an XSA as hitting a VMEntry > >>>>> failure if it can be triggered by userspace, but the latter safer and > >>>>> much more obvious. > >>>> I disagree. For CPL > 0 we don't "corrupt" guest state any more > >>>> than reporting a #GP fault when one is going to be reported > >>>> anyway (as long as the VM entry doesn't fail, and hence the > >>>> guest won't get crashed). IOW this raising of #GP actually is a > >>>> precautionary measure to _avoid_ XSAs. > >>> It does not remove any XSAs. It merely hides them. > >> How that? If we convert the ability of guest user mode to crash > >> the guest into deliver of #GP(0), how is there a hidden XSA then? > > > > Because userspace being able to triggering this fixup is still an XSA. > > How do you know without a specific case at hand? It may well be > that all that's impacted is guest user space, in which case I > don't see why there would need to be an XSA. It's still a bug > then, sure. > > >>>>> It was the appropriate security fix (give or take the functional bug in > >>>>> it) at the time, given the complexity of retrofitting zero length > >>>>> instruction fetches to the emulator. > >>>>> > >>>>> However, it is one of a very long list of guest-state-induced VMEntry > >>>>> failures, with non-trivial logic which we assert will pass, on a > >>>>> fastpath, where hardware also performs the same checks and we > already > >>>>> have a runtime safe way of dealing with errors. (Hence not actually > >>>>> using ASSERT_UNREACHABLE() here.) > >>>> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe > >>>> for the guest at all, as vmx_failed_vmentry() results in an > >>>> unconditional domain_crash(). > >>> Any VMEntry failure is a bug in Xen. If userspace can trigger it, it is > >>> an XSA, *irrespective* of whether we crash the domain then and there, > or > >>> whether we let it try and limp on with corrupted state. > >> Allowing the guest to continue with corrupted state is not a > >> useful thing to do, I agree. However, what falls under > >> "corrupted" seems to be different for you and me. I'd not call > >> delivery of #GP "corruption" in any way. > > > > I can only repeat my previous statement: > > > >> There are legal states where RIP is 0x0000800000000000 and #GP is the > >> wrong thing to do. > > > > Blindly raising #GP in is not always the right thing to do. > > Again - we're in agreement about "blindly". Let's be less blind. > > >> The primary goal ought > >> to be that we don't corrupt the guest kernel view of the world. > >> It may then have the opportunity to kill the offending user > >> mode process. > > > > By the time we have hit this case, all bets are off, because Xen *is* > > malfunctioning. We have no idea if kernel context is still intact. You > > don't even know that current user context is the correct offending > > context to clobber, and might be creating a user=>user DoS vulnerability. > > > > We definitely have an XSA to find and fix, and we can either make it > > very obvious and likely to be reported, or hidden and liable to go > > unnoticed for a long period of time. > > Why would it go unnoticed when we log the incident? I very much > hope people inspect their logs at least every once in a while ... > > And as per above - I disagree with your use of "definitely" here. > We have a bug to analyze and fix, yes. Whether it's an XSA-worthy > one isn't known ahead of time, unless we crash the guest in such > a case. > > In any event I think it's about time that the VMX maintainers > voice their views here, as they're the ones to approve of > whichever change we end up with. Kevin, Jun? > Honestly speaking both of your options make some sense. and I don't think there is a perfect answer here. Personally I'm more aligned with Jan's point on preventing guest user from crashing the whole domain. But let's also hear from others' opinions as I believe this dilemma might be seen in other places too thus may need a general consensus in Xen community. Thanks Kevin

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.