[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
On 19.10.2020 18:12, Andrew Cooper wrote: > On 19/10/2020 10:09, Jan Beulich wrote: >> On 16.10.2020 17:38, Andrew Cooper wrote: >>> On 15/10/2020 09:01, Jan Beulich wrote: >>>> On 14.10.2020 15:57, Andrew Cooper wrote: >>>>> Running with corrupt state is every bit an XSA as hitting a VMEntry >>>>> failure if it can be triggered by userspace, but the latter safer and >>>>> much more obvious. >>>> I disagree. For CPL > 0 we don't "corrupt" guest state any more >>>> than reporting a #GP fault when one is going to be reported >>>> anyway (as long as the VM entry doesn't fail, and hence the >>>> guest won't get crashed). IOW this raising of #GP actually is a >>>> precautionary measure to _avoid_ XSAs. >>> It does not remove any XSAs. It merely hides them. >> How that? If we convert the ability of guest user mode to crash >> the guest into deliver of #GP(0), how is there a hidden XSA then? > > Because userspace being able to triggering this fixup is still an XSA. How do you know without a specific case at hand? It may well be that all that's impacted is guest user space, in which case I don't see why there would need to be an XSA. It's still a bug then, sure. >>>>> It was the appropriate security fix (give or take the functional bug in >>>>> it) at the time, given the complexity of retrofitting zero length >>>>> instruction fetches to the emulator. >>>>> >>>>> However, it is one of a very long list of guest-state-induced VMEntry >>>>> failures, with non-trivial logic which we assert will pass, on a >>>>> fastpath, where hardware also performs the same checks and we already >>>>> have a runtime safe way of dealing with errors. (Hence not actually >>>>> using ASSERT_UNREACHABLE() here.) >>>> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe >>>> for the guest at all, as vmx_failed_vmentry() results in an >>>> unconditional domain_crash(). >>> Any VMEntry failure is a bug in Xen. If userspace can trigger it, it is >>> an XSA, *irrespective* of whether we crash the domain then and there, or >>> whether we let it try and limp on with corrupted state. >> Allowing the guest to continue with corrupted state is not a >> useful thing to do, I agree. However, what falls under >> "corrupted" seems to be different for you and me. I'd not call >> delivery of #GP "corruption" in any way. > > I can only repeat my previous statement: > >> There are legal states where RIP is 0x0000800000000000 and #GP is the >> wrong thing to do. > > Blindly raising #GP in is not always the right thing to do. Again - we're in agreement about "blindly". Let's be less blind. >> The primary goal ought >> to be that we don't corrupt the guest kernel view of the world. >> It may then have the opportunity to kill the offending user >> mode process. > > By the time we have hit this case, all bets are off, because Xen *is* > malfunctioning. We have no idea if kernel context is still intact. You > don't even know that current user context is the correct offending > context to clobber, and might be creating a user=>user DoS vulnerability. > > We definitely have an XSA to find and fix, and we can either make it > very obvious and likely to be reported, or hidden and liable to go > unnoticed for a long period of time. Why would it go unnoticed when we log the incident? I very much hope people inspect their logs at least every once in a while ... And as per above - I disagree with your use of "definitely" here. We have a bug to analyze and fix, yes. Whether it's an XSA-worthy one isn't known ahead of time, unless we crash the guest in such a case. In any event I think it's about time that the VMX maintainers voice their views here, as they're the ones to approve of whichever change we end up with. Kevin, Jun? Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |