[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"
> From: Jan Beulich <jbeulich@xxxxxxxx> > Sent: Tuesday, October 20, 2020 4:10 PM > > On 19.10.2020 18:12, Andrew Cooper wrote: > > On 19/10/2020 10:09, Jan Beulich wrote: > >> On 16.10.2020 17:38, Andrew Cooper wrote: > >>> On 15/10/2020 09:01, Jan Beulich wrote: > >>>> On 14.10.2020 15:57, Andrew Cooper wrote: > >>>>> Running with corrupt state is every bit an XSA as hitting a VMEntry > >>>>> failure if it can be triggered by userspace, but the latter safer and > >>>>> much more obvious. > >>>> I disagree. For CPL > 0 we don't "corrupt" guest state any more > >>>> than reporting a #GP fault when one is going to be reported > >>>> anyway (as long as the VM entry doesn't fail, and hence the > >>>> guest won't get crashed). IOW this raising of #GP actually is a > >>>> precautionary measure to _avoid_ XSAs. > >>> It does not remove any XSAs. It merely hides them. > >> How that? If we convert the ability of guest user mode to crash > >> the guest into deliver of #GP(0), how is there a hidden XSA then? > > > > Because userspace being able to triggering this fixup is still an XSA. > > How do you know without a specific case at hand? It may well be > that all that's impacted is guest user space, in which case I > don't see why there would need to be an XSA. It's still a bug > then, sure. > > >>>>> It was the appropriate security fix (give or take the functional bug in > >>>>> it) at the time, given the complexity of retrofitting zero length > >>>>> instruction fetches to the emulator. > >>>>> > >>>>> However, it is one of a very long list of guest-state-induced VMEntry > >>>>> failures, with non-trivial logic which we assert will pass, on a > >>>>> fastpath, where hardware also performs the same checks and we > already > >>>>> have a runtime safe way of dealing with errors. (Hence not actually > >>>>> using ASSERT_UNREACHABLE() here.) > >>>> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe > >>>> for the guest at all, as vmx_failed_vmentry() results in an > >>>> unconditional domain_crash(). > >>> Any VMEntry failure is a bug in Xen. If userspace can trigger it, it is > >>> an XSA, *irrespective* of whether we crash the domain then and there, > or > >>> whether we let it try and limp on with corrupted state. > >> Allowing the guest to continue with corrupted state is not a > >> useful thing to do, I agree. However, what falls under > >> "corrupted" seems to be different for you and me. I'd not call > >> delivery of #GP "corruption" in any way. > > > > I can only repeat my previous statement: > > > >> There are legal states where RIP is 0x0000800000000000 and #GP is the > >> wrong thing to do. > > > > Blindly raising #GP in is not always the right thing to do. > > Again - we're in agreement about "blindly". Let's be less blind. > > >> The primary goal ought > >> to be that we don't corrupt the guest kernel view of the world. > >> It may then have the opportunity to kill the offending user > >> mode process. > > > > By the time we have hit this case, all bets are off, because Xen *is* > > malfunctioning. We have no idea if kernel context is still intact. You > > don't even know that current user context is the correct offending > > context to clobber, and might be creating a user=>user DoS vulnerability. > > > > We definitely have an XSA to find and fix, and we can either make it > > very obvious and likely to be reported, or hidden and liable to go > > unnoticed for a long period of time. > > Why would it go unnoticed when we log the incident? I very much > hope people inspect their logs at least every once in a while ... > > And as per above - I disagree with your use of "definitely" here. > We have a bug to analyze and fix, yes. Whether it's an XSA-worthy > one isn't known ahead of time, unless we crash the guest in such > a case. > > In any event I think it's about time that the VMX maintainers > voice their views here, as they're the ones to approve of > whichever change we end up with. Kevin, Jun? > Honestly speaking both of your options make some sense. and I don't think there is a perfect answer here. Personally I'm more aligned with Jan's point on preventing guest user from crashing the whole domain. But let's also hear from others' opinions as I believe this dilemma might be seen in other places too thus may need a general consensus in Xen community. Thanks Kevin
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |