[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [VMI] Possible race-condition in altp2m APIs


  • To: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Tue, 28 May 2019 20:58:23 -0700
  • Authentication-results: esa2.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none; spf=None smtp.pra=andrew.cooper3@xxxxxxxxxx; spf=Pass smtp.mailfrom=Andrew.Cooper3@xxxxxxxxxx; spf=None smtp.helo=postmaster@xxxxxxxxxxxxxxxxxxxxxxxxxx
  • Autocrypt: addr=andrew.cooper3@xxxxxxxxxx; prefer-encrypt=mutual; keydata= mQINBFLhNn8BEADVhE+Hb8i0GV6mihnnr/uiQQdPF8kUoFzCOPXkf7jQ5sLYeJa0cQi6Penp VtiFYznTairnVsN5J+ujSTIb+OlMSJUWV4opS7WVNnxHbFTPYZVQ3erv7NKc2iVizCRZ2Kxn srM1oPXWRic8BIAdYOKOloF2300SL/bIpeD+x7h3w9B/qez7nOin5NzkxgFoaUeIal12pXSR Q354FKFoy6Vh96gc4VRqte3jw8mPuJQpfws+Pb+swvSf/i1q1+1I4jsRQQh2m6OTADHIqg2E ofTYAEh7R5HfPx0EXoEDMdRjOeKn8+vvkAwhviWXTHlG3R1QkbE5M/oywnZ83udJmi+lxjJ5 YhQ5IzomvJ16H0Bq+TLyVLO/VRksp1VR9HxCzItLNCS8PdpYYz5TC204ViycobYU65WMpzWe LFAGn8jSS25XIpqv0Y9k87dLbctKKA14Ifw2kq5OIVu2FuX+3i446JOa2vpCI9GcjCzi3oHV e00bzYiHMIl0FICrNJU0Kjho8pdo0m2uxkn6SYEpogAy9pnatUlO+erL4LqFUO7GXSdBRbw5 gNt25XTLdSFuZtMxkY3tq8MFss5QnjhehCVPEpE6y9ZjI4XB8ad1G4oBHVGK5LMsvg22PfMJ ISWFSHoF/B5+lHkCKWkFxZ0gZn33ju5n6/FOdEx4B8cMJt+cWwARAQABtClBbmRyZXcgQ29v cGVyIDxhbmRyZXcuY29vcGVyM0BjaXRyaXguY29tPokCOgQTAQgAJAIbAwULCQgHAwUVCgkI CwUWAgMBAAIeAQIXgAUCWKD95wIZAQAKCRBlw/kGpdefoHbdD/9AIoR3k6fKl+RFiFpyAhvO 59ttDFI7nIAnlYngev2XUR3acFElJATHSDO0ju+hqWqAb8kVijXLops0gOfqt3VPZq9cuHlh IMDquatGLzAadfFx2eQYIYT+FYuMoPZy/aTUazmJIDVxP7L383grjIkn+7tAv+qeDfE+txL4 SAm1UHNvmdfgL2/lcmL3xRh7sub3nJilM93RWX1Pe5LBSDXO45uzCGEdst6uSlzYR/MEr+5Z JQQ32JV64zwvf/aKaagSQSQMYNX9JFgfZ3TKWC1KJQbX5ssoX/5hNLqxMcZV3TN7kU8I3kjK mPec9+1nECOjjJSO/h4P0sBZyIUGfguwzhEeGf4sMCuSEM4xjCnwiBwftR17sr0spYcOpqET ZGcAmyYcNjy6CYadNCnfR40vhhWuCfNCBzWnUW0lFoo12wb0YnzoOLjvfD6OL3JjIUJNOmJy RCsJ5IA/Iz33RhSVRmROu+TztwuThClw63g7+hoyewv7BemKyuU6FTVhjjW+XUWmS/FzknSi dAG+insr0746cTPpSkGl3KAXeWDGJzve7/SBBfyznWCMGaf8E2P1oOdIZRxHgWj0zNr1+ooF /PzgLPiCI4OMUttTlEKChgbUTQ+5o0P080JojqfXwbPAyumbaYcQNiH1/xYbJdOFSiBv9rpt TQTBLzDKXok86LkCDQRS4TZ/ARAAkgqudHsp+hd82UVkvgnlqZjzz2vyrYfz7bkPtXaGb9H4 Rfo7mQsEQavEBdWWjbga6eMnDqtu+FC+qeTGYebToxEyp2lKDSoAsvt8w82tIlP/EbmRbDVn 7bhjBlfRcFjVYw8uVDPptT0TV47vpoCVkTwcyb6OltJrvg/QzV9f07DJswuda1JH3/qvYu0p vjPnYvCq4NsqY2XSdAJ02HrdYPFtNyPEntu1n1KK+gJrstjtw7KsZ4ygXYrsm/oCBiVW/OgU g/XIlGErkrxe4vQvJyVwg6YH653YTX5hLLUEL1NS4TCo47RP+wi6y+TnuAL36UtK/uFyEuPy wwrDVcC4cIFhYSfsO0BumEI65yu7a8aHbGfq2lW251UcoU48Z27ZUUZd2Dr6O/n8poQHbaTd 6bJJSjzGGHZVbRP9UQ3lkmkmc0+XCHmj5WhwNNYjgbbmML7y0fsJT5RgvefAIFfHBg7fTY/i kBEimoUsTEQz+N4hbKwo1hULfVxDJStE4sbPhjbsPCrlXf6W9CxSyQ0qmZ2bXsLQYRj2xqd1 bpA+1o1j2N4/au1R/uSiUFjewJdT/LX1EklKDcQwpk06Af/N7VZtSfEJeRV04unbsKVXWZAk uAJyDDKN99ziC0Wz5kcPyVD1HNf8bgaqGDzrv3TfYjwqayRFcMf7xJaL9xXedMcAEQEAAYkC HwQYAQgACQUCUuE2fwIbDAAKCRBlw/kGpdefoG4XEACD1Qf/er8EA7g23HMxYWd3FXHThrVQ HgiGdk5Yh632vjOm9L4sd/GCEACVQKjsu98e8o3ysitFlznEns5EAAXEbITrgKWXDDUWGYxd pnjj2u+GkVdsOAGk0kxczX6s+VRBhpbBI2PWnOsRJgU2n10PZ3mZD4Xu9kU2IXYmuW+e5KCA vTArRUdCrAtIa1k01sPipPPw6dfxx2e5asy21YOytzxuWFfJTGnVxZZSCyLUO83sh6OZhJkk b9rxL9wPmpN/t2IPaEKoAc0FTQZS36wAMOXkBh24PQ9gaLJvfPKpNzGD8XWR5HHF0NLIJhgg 4ZlEXQ2fVp3XrtocHqhu4UZR4koCijgB8sB7Tb0GCpwK+C4UePdFLfhKyRdSXuvY3AHJd4CP 4JzW0Bzq/WXY3XMOzUTYApGQpnUpdOmuQSfpV9MQO+/jo7r6yPbxT7CwRS5dcQPzUiuHLK9i nvjREdh84qycnx0/6dDroYhp0DFv4udxuAvt1h4wGwTPRQZerSm4xaYegEFusyhbZrI0U9tJ B8WrhBLXDiYlyJT6zOV2yZFuW47VrLsjYnHwn27hmxTC/7tvG3euCklmkn9Sl9IAKFu29RSo d5bD8kMSCYsTqtTfT6W4A3qHGvIDta3ptLYpIAOD2sY3GYq2nf3Bbzx81wZK14JdDDHUX2Rs 6+ahAA==
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Mathieu Tarral <mathieu.tarral@xxxxxxxxxxxxxx>
  • Delivery-date: Wed, 29 May 2019 03:59:01 +0000
  • Ironport-sdr: fNozD1/9jngs4534ZBKHYrD68+tGaUoqxTftYVIWjOXynXxd596hfylWhrJsFpSy8kMZE6YcWZ wwmUv9yXRjQyGHO+iCojoYlq6bG4O/jNTtHAlPOxjok/vzmCkfboqf87XeXkVpHfkFUTAGUa38 DJMBewJI+rq8ILNEpN4e+yyFWrt41WQDGsoqGJq0Zrd6lyM9Wx46nvh5gVQOaf8uyA2ZFDhf2a BAxlt7oiyqjGRwZRLkrMLk/t24B1XqufSU6S2HHlUmDwoOjcm6detJ+LM0TZ8PnyqkfT4ypyFm Z9I=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 29/05/2019 02:34, Tamas K Lengyel wrote:
>> And some questions.
>>
>> 1) I'm guessing the drakvuf_inject_trap(drakvuf, 0x293e6a0, 0) call is 
>> specific to the exact windows kernel in use?
>>
>> 2) In vmi_init(), what is the purpose of fmask and zero_page_gfn?  You add 
>> one extra gfn to the guest, called zero_page, and fill it with 1's from 
>> fmask.
>>
>> 3) You create two altp2m's, but both have the same default access.  Is this 
>> deliberate, or a bug?  If deliberate, why?
>>
>> Finally, and probably the source of the memory corruption...
>>
>> 4) When injecting a trap, you allocate a new gfn, memcpy() the contents and 
>> insert a 0xcc (so far so good).  You then remap the executable view to point 
>> at the new gfn with a breakpoint in (fine), and remap the readable view to 
>> point at the zero_page, which is full of 1's (uh-oh).
>>
>> What is this final step trying to achieve?  It guarantees that patch-guard 
>> will eventually notice and BSOD your VM for critical structure corruption.  
>> The read-only view needs to point to the original gfn with only read 
>> permissions, so when Windows reads the gfn back, it sees what it expects.  
>> You also need to prohibit writes to either gfn so you can spot writes 
>> (unlikely in this case but important for general introspection) so you can 
>> propagate the change to both copies.
> I can answer these questions based on how I've set it up in DRAKVUF
> (haven't checked Mathiue's implementation and I won't be able for a
> while as I'm traveling at the moment). The reason we have multiple EPT
> views is to make it all multi-vCPU safe. Swapping the views around can
> be done per-vCPU and we don't have to remove/reapply breakpoints all
> the time and pause the entire domain while doing so.

Yup - makes sense.

> There are three views being used: the default (hostp2m); the
> execute-view which is active by default and has the remapped
> shadow-copies of the pages with breakpoints injected at the very end
> of the guests' physmap; and the read-only view that is only used when
> someone is trying to read the actual address of a shadow-copy at the
> end of the physmap (ie. not via the remapped gfn).

Perhaps the terminology could be improved then, seeing as the main view
isn't really "execute restricted".  It is only X-- for the few remapped
gfns, and XWR for the vast majority.

Inside the X-- gfns, there are one (or more) 0xcc breakpoints which can
be hit, which trap out to the VMI agent.

(Following Mathieu's code) on encountering this, we switch altp2m back
to the hostp2m, and enable MTF.  We let one instruction execute,
trapping out on the MTF signal, and switch back to the default view (1)
which is fractionally X-- restricted.

Now - this doesn't cope with the case where the 0xcc'd instruction was a
write into one of the X-- pages, because switching back to the hostp2m
gives full XWR permissions to everything, including the VMI-additional
pages. 

However, given the nature of the breakpoint position, it is almost
certainly intercepting a benign stack operation in this specific case. 
It would be useful to disassemble the head of NtCreateUserProcess and
confirm.

> The read-only view has all shadow-copy gfn's remapped to that one page
> full of 1's, because if you read random large gfn's in the guests'
> memory space that's what Xen's emulator returns. It is called
> zero_page because I originally guessed that those pages should be all
> 0 but it turned out I was wrong. Just haven't change the name of it
> yet. This page is there because we want to avoid someone being able to
> find out that there are shadow pages present. It would be quite
> obvious something is "odd" when you can find copies of the Windows
> kernel memory pages at the end of the memory space. So the shadow
> pages' real GFN mem_access is restricted in the execute view, which
> allows us to switch to the read-only view with MTF enabled and then
> back afterwards. That way the shadow pages are not visible to the
> guest, if someone tries to read them they return the same value and
> behave the same as all other unmapped gfn's in that memory region.

Ahh ok.  Yes - write-discard/read ~0 is a staple of "nothing present",
both in IO and MMIO space on x86.  In which case a better name would be
sink_page or similar.

Also, I see now that that is what Mathieu's code is doing (even though
this view isn't used at all, so far as I can tell), so consider the
question answered, and we're back to square 1 on the BSOD.

As identified before, it needs to be only ever mapped read-only, because
sinking real writes into it would be a BadThing(tm).  We do actually
have a p2m_type_ro which would hopefully cause emulated instructions to
DTRT as well, which should be faster than sending a write event all the
way to the VMI agent.

> So since the read-only view with all the 1's is rarely used, let's
> talk about why patchguard can't notice changes to the kernel:

Well... In this case it really is.  We can certainly talk about why
patchguard *shouldn’t* notice :)

> the execute-view has all pages that were breakpointed remapped and marked
> execute-only. When patchguard tries to read these pages, the view is
> swapped back to the hostp2m with MTF enabled. Then in the MTF callback
> the view is swapped back to the execute-view. This means that
> patchguard only ever reads the original page from the hostp2m. If the
> page is being written to, the same dance happens with the addition of
> the whole page being re-copied to the shadow location and the
> breakpoints being reapplied on the shadow copy. This copy happens
> while the whole domain is paused to avoid race-condition.
>
> I hope this makes sense.

The dance with the read-only view doesn't happen in the simplified case,
but as both you and I have noticed, there looks to be issues with the
page permissions which are probably confounding the problems.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.