[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"

  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 16 Oct 2020 16:38:09 +0100
  • Authentication-results: esa6.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>
  • Delivery-date: Fri, 16 Oct 2020 15:38:32 +0000
  • Ironport-sdr: JDYWFBkFk317Y2XP/xaIkbXtnoKGnoHX5RPpuHXdnj+9AzQtlTcIsFRaGtDiPOHstWt6sdob6m m1u/d0ZwroF3Lk9ROJJipbnv3/PO+eVcUKaH9TkPHlMBYI0gQBg2gP2NxvGrXwpIOgHnEe+7hy DrRrXFpdHnswi3n3kiWCPEg7XveeMV8Je2zXQ+U9ONTT+wHOCx/zetofLEKV1+760yIK0Fy1Hk JsudpGnsjMr1uQ0yM1oa9ygNYT7wxKK1S/T7wqhpN2gIREf63INsHE1iJsC3Rw/WLpXksE2mQw 1F4=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 15/10/2020 09:01, Jan Beulich wrote:
> On 14.10.2020 15:57, Andrew Cooper wrote:
>> On 13/10/2020 16:58, Jan Beulich wrote:
>>> On 09.10.2020 17:09, Andrew Cooper wrote:
>>>> At the time of XSA-170, the x86 instruction emulator really was broken, and
>>>> would allow arbitrary non-canonical values to be loaded into %rip.  This 
>>>> was
>>>> fixed after the embargo by c/s 81d3a0b26c1 "x86emul: limit-check branch
>>>> targets".
>>>> However, in a demonstration that off-by-one errors really are one of the
>>>> hardest programming issues we face, everyone involved with XSA-170, myself
>>>> included, mistook the statement in the SDM which says:
>>>>   If the processor supports N < 64 linear-address bits, bits 63:N must be 
>>>> identical
>>>> to mean "must be canonical".  A real canonical check is bits 63:N-1.
>>>> VMEntries really do tolerate a not-quite-canonical %rip, specifically to 
>>>> cater
>>>> to the boundary condition at 0x0000800000000000.
>>>> Now that the emulator has been fixed, revert the XSA-170 change to fix
>>>> architectural behaviour at the boundary case.  The XTF test case for 
>>>> XSA-170
>>>> exercises this corner case, and still passes.
>>>> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest")
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>> But why revert the change rather than fix ...
>>>> @@ -4280,38 +4280,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>>  out:
>>>>      if ( nestedhvm_vcpu_in_guestmode(v) )
>>>>          nvmx_idtv_handling();
>>>> -
>>>> -    /*
>>>> -     * VM entry will fail (causing the guest to get crashed) if rIP (and
>>>> -     * rFLAGS, but we don't have an issue there) doesn't meet certain
>>>> -     * criteria. As we must not allow less than fully privileged mode to 
>>>> have
>>>> -     * such an effect on the domain, we correct rIP in that case 
>>>> (accepting
>>>> -     * this not being architecturally correct behavior, as the injected 
>>>> #GP
>>>> -     * fault will then not see the correct [invalid] return address).
>>>> -     * And since we know the guest will crash, we crash it right away if 
>>>> it
>>>> -     * already is in most privileged mode.
>>>> -     */
>>>> -    mode = vmx_guest_x86_mode(v);
>>>> -    if ( mode == 8 ? !is_canonical_address(regs->rip)
>>> ... the wrong use of is_canonical_address() here? By reverting
>>> you open up avenues for XSAs in case we get things wrong elsewhere,
>>> including ...
>>>> -                   : regs->rip != regs->eip )
>>> ... for 32-bit guests.
>> Because the only appropriate alternative would be ASSERT_UNREACHABLE()
>> and domain crash.
>> This logic corrupts guest state.
>> Running with corrupt state is every bit an XSA as hitting a VMEntry
>> failure if it can be triggered by userspace, but the latter safer and
>> much more obvious.
> I disagree. For CPL > 0 we don't "corrupt" guest state any more
> than reporting a #GP fault when one is going to be reported
> anyway (as long as the VM entry doesn't fail, and hence the
> guest won't get crashed). IOW this raising of #GP actually is a
> precautionary measure to _avoid_ XSAs.

It does not remove any XSAs.  It merely hides them.

There are legal states where RIP is 0x0000800000000000 and #GP is the
wrong thing to do.  Any async VMExit (Processor Trace Prefetch in
particular), or with debug traps pending.

> Nor do I agree with the "much more obvious" aspect:

A domain crash is far more likely to be reported to xen-devel/security
than something which bodges state in an almost-silent way.

> A VM entry
> failure requires quite a bit of analysis to recognize what has
> caused it; whether a non-pseudo-canonical RIP is what catches your
> eye right away is simply unknown. The gprintk() that you delete,
> otoh, says very clearly what we have found to be wrong.

Non-canonical values are easier to spot than most of the other rules, IMO.

>> It was the appropriate security fix (give or take the functional bug in
>> it) at the time, given the complexity of retrofitting zero length
>> instruction fetches to the emulator.
>> However, it is one of a very long list of guest-state-induced VMEntry
>> failures, with non-trivial logic which we assert will pass, on a
>> fastpath, where hardware also performs the same checks and we already
>> have a runtime safe way of dealing with errors.  (Hence not actually
>> using ASSERT_UNREACHABLE() here.)
> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe
> for the guest at all, as vmx_failed_vmentry() results in an
> unconditional domain_crash().

Any VMEntry failure is a bug in Xen.  If userspace can trigger it, it is
an XSA, *irrespective* of whether we crash the domain then and there, or
whether we let it try and limp on with corrupted state.

The different is purely in how obviously the bug manifests.

> I certainly buy the fast path aspect of your comment, and if you were
> moving the guest state adjustment into vmx_failed_vmentry(), I'd be
> fine with the deletion here.
>> It isn't appropriate for this check to exist on its own (i.e. without
>> other guest state checks),
> Well, if we run into cases where we get things wrong, more checks
> and adjustments may want adding. Sadly each one of those has a fair
> chance of needing an XSA.

We've had plenty of VMEntry failure XSAs, and we will no doubt have
plenty more in the future.  A non-canonical RIP is not special as far as
these go.

We absolutely should not be doing any fixup in debug builds.  I don't
see any security benefit for doing it in release builds, and an
important downside in terms of getting the bug noticed, and therefore fixed.

> As an aside, nvmx_n2_vmexit_handler()'s handling of
> VMX_EXIT_REASONS_FAILED_VMENTRY looks pretty bogus - this is a flag,
> not a separate exit reason. I guess I'll make a patch ...

This is far from the only problem.  I'm not intending to fix any issues
I find, until I can start getting some proper nested virt functionality
tests in place.




Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.