[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/7] x86/shadow: Try to correctly identify implicit supervisor accesses



On 07/03/17 11:26, George Dunlap wrote:
> On 27/02/17 14:03, Andrew Cooper wrote:
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Tim Deegan <tim@xxxxxxx>
>> CC: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
>> ---
>>  xen/arch/x86/mm/shadow/multi.c | 60 
>> ++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 58 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
>> index 128809d..7c6b017 100644
>> --- a/xen/arch/x86/mm/shadow/multi.c
>> +++ b/xen/arch/x86/mm/shadow/multi.c
>> @@ -2857,7 +2857,7 @@ static int sh_page_fault(struct vcpu *v,
>>      const struct x86_emulate_ops *emul_ops;
>>      int r;
>>      p2m_type_t p2mt;
>> -    uint32_t rc;
>> +    uint32_t rc, error_code;
>>      int version;
>>      const struct npfec access = {
>>           .read_access = 1,
>> @@ -3011,13 +3011,69 @@ static int sh_page_fault(struct vcpu *v,
>>  
>>   rewalk:
>>  
>> +    error_code = regs->error_code;
>> +
>> +    /*
>> +     * When CR4.SMAP is enabled, instructions which have a side effect of
>> +     * accessing the system data structures (e.g. mov to %ds accessing the
>> +     * LDT/GDT, or int $n accessing the IDT) are known as implicit 
>> supervisor
>> +     * accesses.
>> +     *
>> +     * The distinction between implicit and explicit accesses form part of 
>> the
>> +     * determination of access rights, controlling whether the access is
>> +     * successful, or raises a #PF.
>> +     *
>> +     * Unfortunately, the processor throws away the implicit/explicit
>> +     * distinction and does not provide it to the pagefault handler
>> +     * (i.e. here.) in the #PF error code.  Therefore, we must try to
>> +     * reconstruct the lost state so it can be fed back into our pagewalk
>> +     * through the guest tables.
>> +     *
>> +     * User mode accesses are easy to reconstruct:
>> +     *
>> +     *   If we observe a cpl3 data fetch which was a supervisor walk, this
>> +     *   must have been an implicit access to a system table.
>> +     *
>> +     * Supervisor mode accesses are not easy:
>> +     *
>> +     *   In principle, we could decode the instruction under %rip and have 
>> the
>> +     *   instruction emulator tell us if there is an implicit access.
>> +     *   However, this is racy with other vcpus updating the pagetable or
>> +     *   rewriting the instruction stream under our feet.
>> +     *
>> +     *   Therefore, we do nothing.  (If anyone has a sensible suggestion for
>> +     *   how to distinguish these cases, xen-devel@ is all ears...)
>> +     *
>> +     * As a result, one specific corner case will fail.  If a guest OS with
>> +     * SMAP enabled ends up mapping a system table with user mappings, sets
>> +     * EFLAGS.AC to allow explicit accesses to user mappings, and implicitly
>> +     * accesses the user mapping, hardware and the shadow code will disagree
>> +     * on whether a #PF should be raised.
>> +     *
>> +     * Hardware raises #PF because implicit supervisor accesses to user
>> +     * mappings are strictly disallowed.  As we can't reconstruct the 
>> correct
>> +     * input, the pagewalk is performed as if it were an explicit access,
>> +     * which concludes that the access should have succeeded and the shadow
>> +     * pagetables need modifying.  The shadow pagetables are modified (to 
>> the
>> +     * same value), and we re-enter the guest to re-execute the instruction,
>> +     * which causes another #PF, and the vcpu livelocks, unable to make
>> +     * forward progress.
> What you're saying is that if an attacker manages to trigger this
> behavior, then it's probably a mistake on her part; she was trying to do
> a full privilege escalation and accidentally screwed something up and
> turned it into a DoS?

This livelock can only occur if a system table is mapped with user
mappings.  This is not a situation which an OS would ever set up (other
than XTF for testing), as it means userspace can write to the GDT and
trivially root the OS.

It is certainly possible that userspace could use a vulnerability in the
OS to make such a change.  (Chances are that, if it could do that, there
are more interesting things it could do.)

Even if such a situation was set up, the livelock can only be triggered
in supervisor mode, via an implicit access, while userspace accesses are
whitelisted, which isn't a situation which will occur accidentally.

If an attacker had engineered this situation, it would be easy to
sidestep around the livelock.

>
>> +     * In practice, this is tolerable because no OS would deliberately
>> +     * construct such a corner case, as doing so would mean it is trivially
>> +     * root-able by its own userspace.
> Just to point out, the problem with 'deliberately' is that the whole
> point of SMAP is to protect an OS from its own errors.

I don't understand your point here.  The point of 'deliberately' is to
suggest that the only way a kernel would ever get into this situation is
if it had made a mistake.

There is no plausible situation (other than testing) where such a
situation would be deliberately constructed.

>  :-)  But I agree
> that at first blush, the scenario above would be pretty difficult to do
> accidentally.  (And I certainly don't have any better ideas.)
>
> Would this perhaps be better as:
>
> "In practice, this is tolerable because it is difficult to imagine a
> scenario in which an OS would accidentally construct such a corner case;
> and if it did, SMAP would not actually protect it from being trivially
> rooted by userspace unless the attacker made a mistake and accidentally
> triggered the path described here."

The issue here is that there are plenty of ways to accidentally
construct this situation during development.  All it takes is a typo in
the pagetable code for example.

My point is that in a production OS, it isn't going to occur for a
properly configured system.  If it were to occur in a production system,
then there is definitely a security-relevant bug which has been tickled.

>
> But that's just a suggestion.  Either way:
>
> Reviewed-by: George Dunlap <george.dunlap@xxxxxxxxxx>
>

How about:

"In practice, this is tolerable.  No production OS will deliberately
construct this corner case (as doing so would mean that a system table
is directly accessable to userspace, and the OS is trivially rootable.)
If this corner case comes about accidentally, then a security-relevant
bug has been tickled."

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.