[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"


  • To: Jan Beulich <jbeulich@xxxxxxxx>, "Cooper, Andrew" <andrew.cooper3@xxxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>
  • From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
  • Date: Fri, 23 Oct 2020 06:14:18 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=N+4hRcPBg99tlqBnoQ1lN1cY0rIsJir5Oqdog60o+EE=; b=YWJ9rpaQvdTyzBH0/jp8QxRU2EQO4RrYL1qVg8NUWKynWvoQwEjFqQmWz37dg574Jy78VToz+HvTFv1awHgmKix8f7F6SyQd008cuMydmdQiLjOscbG3eD6JlJqiUqmmnqCcVcQiKXClXPG1PnLnp96l4U65WkoSVD8nImunCp7LJNSjwBCMv6xYsUeafS1gX78xqrbZQskG8q9xqoImMgaLWzXx9ntCJRMi2ZDFRsXzyhv936kWyoCFFDhc3MoauYDUlkAxdwZw0BuEOmE2qPQHaaqPW3yfNjArvxQWXJjoiJgTrTzFvhLA1PywmRzTfZ2BKAb4JQLQikDiY/AwSQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=WLVWEOAJ86jajKGH0M9WMFacu1xI5OGoviQYsb0Jz+KWmWlJ/kGzoCzWr3q0RMRgaMXGqJ/8aSddW+JjCAidK1Yj/z/sPS2PEFEYVqTvZXQZweqgLsXvCFkOLoAfWTzMa7u5h0PAMzcMnWXtQHddVzYNFJrjK6hDC0Wahx85+ZmKndv/RLFdvWVJ2bKCpbHMCrrvP+PqN5z1rAZMO6mzOjPd5VKN6naUShezf9YWS9U8mY1mbLYC7IEbIpwevXBzbws5tJtSjShF1vNxcy2gYkj82Mpz4mj0hxPkhU7gmlgC/+Ovm2imQ/65ltIoiIBfRjY3+102yD9rZJsAv452Vg==
  • Authentication-results: suse.com; dkim=none (message not signed) header.d=none;suse.com; dmarc=none action=none header.from=intel.com;
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Fri, 23 Oct 2020 06:15:04 +0000
  • Dlp-product: dlpe-windows
  • Dlp-reaction: no-action
  • Dlp-version: 11.5.1.3
  • Ironport-sdr: L/bTAiSO9kjSHnSN2uvOTj7xOAodvbqofjbh50JrVDNJzmySe0sG9FfOI8P1SKOjaXkD6OVF7a nTR9wqHXZwlg==
  • Ironport-sdr: fNoKPtKDIoEM1Ovyrm8WnfHIz35JN5/rAHF/wTQPGuonrvcAsJGl5QpUrbuR625UyUTUqoG1Zz kswpY+B4iXDA==
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHWnk5UOwH5YPFK10i1vYA9BQPqg6mVtt4AgAFwXACAAS7uAIACEe2AgARKRQCAAHZigIABC12AgASUbqA=
  • Thread-topic: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"

> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Tuesday, October 20, 2020 4:10 PM
> 
> On 19.10.2020 18:12, Andrew Cooper wrote:
> > On 19/10/2020 10:09, Jan Beulich wrote:
> >> On 16.10.2020 17:38, Andrew Cooper wrote:
> >>> On 15/10/2020 09:01, Jan Beulich wrote:
> >>>> On 14.10.2020 15:57, Andrew Cooper wrote:
> >>>>> Running with corrupt state is every bit an XSA as hitting a VMEntry
> >>>>> failure if it can be triggered by userspace, but the latter safer and
> >>>>> much more obvious.
> >>>> I disagree. For CPL > 0 we don't "corrupt" guest state any more
> >>>> than reporting a #GP fault when one is going to be reported
> >>>> anyway (as long as the VM entry doesn't fail, and hence the
> >>>> guest won't get crashed). IOW this raising of #GP actually is a
> >>>> precautionary measure to _avoid_ XSAs.
> >>> It does not remove any XSAs.  It merely hides them.
> >> How that? If we convert the ability of guest user mode to crash
> >> the guest into deliver of #GP(0), how is there a hidden XSA then?
> >
> > Because userspace being able to triggering this fixup is still an XSA.
> 
> How do you know without a specific case at hand? It may well be
> that all that's impacted is guest user space, in which case I
> don't see why there would need to be an XSA. It's still a bug
> then, sure.
> 
> >>>>> It was the appropriate security fix (give or take the functional bug in
> >>>>> it) at the time, given the complexity of retrofitting zero length
> >>>>> instruction fetches to the emulator.
> >>>>>
> >>>>> However, it is one of a very long list of guest-state-induced VMEntry
> >>>>> failures, with non-trivial logic which we assert will pass, on a
> >>>>> fastpath, where hardware also performs the same checks and we
> already
> >>>>> have a runtime safe way of dealing with errors.  (Hence not actually
> >>>>> using ASSERT_UNREACHABLE() here.)
> >>>> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe
> >>>> for the guest at all, as vmx_failed_vmentry() results in an
> >>>> unconditional domain_crash().
> >>> Any VMEntry failure is a bug in Xen.  If userspace can trigger it, it is
> >>> an XSA, *irrespective* of whether we crash the domain then and there,
> or
> >>> whether we let it try and limp on with corrupted state.
> >> Allowing the guest to continue with corrupted state is not a
> >> useful thing to do, I agree. However, what falls under
> >> "corrupted" seems to be different for you and me. I'd not call
> >> delivery of #GP "corruption" in any way.
> >
> > I can only repeat my previous statement:
> >
> >> There are legal states where RIP is 0x0000800000000000 and #GP is the
> >> wrong thing to do.
> >
> > Blindly raising #GP in is not always the right thing to do.
> 
> Again - we're in agreement about "blindly". Let's be less blind.
> 
> >>  The primary goal ought
> >> to be that we don't corrupt the guest kernel view of the world.
> >> It may then have the opportunity to kill the offending user
> >> mode process.
> >
> > By the time we have hit this case, all bets are off, because Xen *is*
> > malfunctioning.  We have no idea if kernel context is still intact.  You
> > don't even know that current user context is the correct offending
> > context to clobber, and might be creating a user=>user DoS vulnerability.
> >
> > We definitely have an XSA to find and fix, and we can either make it
> > very obvious and likely to be reported, or hidden and liable to go
> > unnoticed for a long period of time.
> 
> Why would it go unnoticed when we log the incident? I very much
> hope people inspect their logs at least every once in a while ...
> 
> And as per above - I disagree with your use of "definitely" here.
> We have a bug to analyze and fix, yes. Whether it's an XSA-worthy
> one isn't known ahead of time, unless we crash the guest in such
> a case.
> 
> In any event I think it's about time that the VMX maintainers
> voice their views here, as they're the ones to approve of
> whichever change we end up with. Kevin, Jun?
> 

Honestly speaking both of your options make some sense. and
I don't think there is a perfect answer here. Personally I'm more
aligned with Jan's point on preventing guest user from crashing
the whole domain. But let's also hear from others' opinions as 
I believe this dilemma might be seen in other places too thus 
may need a general consensus in Xen community.

Thanks
Kevin

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.