On 28/05/2019 13:33, Mathieu Tarral wrote:
> Hi Andrew,
>
> The bug is still here, so we can exclude a microcode issue.
Good - that is one further angle excluded. Always make sure you are
running with up-to-date microcode, but it looks like we are back to
investigating a logical bug in libvmi or Xen.
> I reimplemented a small test, without the Drakvuf/Libvmi layers, that
> injects traps on one Windows API (NtCreateUserProcess), in the same
> way that Drakvuf does.
>
> I did some quick testing yesterday, with a Python script that
> repeatedly started the binary monitoring the API, and at the same time
> started Ansible to run "c:\Windows\system32\reg.exe /?" via WinRM, to
> trigger some process creation.
>
> The traps are working: I see the software breakpoint hit, the switch
> to the default view for singlestepping, and the switch back to the
> execution view, so that's already good.
>
> After a series of tests on 1 or 4 VCPUs, my domain ends up in one of
> two possible states:
>
> - frozen: the mouse doesn't move, so I would guess the VCPUs are
>   blocked. I'm calling the xc_(un)pause_domain APIs multiple times
>   when I write to the shadow copies, but it's always synchronous, so I
>   doubt that they interfered and "paused" the domain.
xc_{,un}pause_domain() calls are reference counted. Calling unpause
too many times should be safe (from a refcount point of view), and
should fail the hypercall with -EINVAL.
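
As an illustration only (a hypothetical helper, not code from your
gist), the safe pattern is to keep each pause strictly paired with its
own unpause:

    /* Sketch: take exactly one pause reference, do the work, then drop
     * exactly the reference we took.  Never unpause on the error path
     * of a failed pause. */
    static int paused_write(xc_interface *xch, uint32_t domid)
    {
        int rc = xc_domain_pause(xch, domid);

        if ( rc < 0 )
            return rc;

        /* ... write to the shadow copy here ... */

        return xc_domain_unpause(xch, domid);
    }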
> Also, the log output I have before I detect that Ansible failed to
> execute shows that the resume succeeded and Xen is ready to process
> VMI events.
>
> - BSOD: that's the second possibility. Apparently I'm corrupting a
>   critical data structure in the operating system, and the Windbg
>   analysis is inconclusive, so I can't tell much.
>
> Either way, I can't execute this test sequentially 10 000 times
> without a crash.
Ok good - this is a far easier place to start debugging from.
> -> Could you look at the implementation, and tell me if I misused the
> APIs somewhere?
>
> https://gist.github.com/mtarral/d99ce5524cfcfb5290eaa05702c3e8e7
Some observations.
1) In xen_pause_vm(), you do an xc_domain_getinfo(). First of all,
the API is crazy, so you also need to check "|| info.domid != domid"
in your error condition, but that shouldn't be an issue here as the
domid isn't changing.
Furthermore, the results of xc_domain_getinfo() are stale before the
hypercall returns, so it is far less useful than it appears.
I'm afraid that the only safe way to issue pause/unpauses is to know
that you've reference counted your own correctly. All entities in
dom0 with privilege can fight over each other's references, because
there is nothing Xen can use to distinguish the requests.
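
To illustrate the getinfo point, the call needs checking along these
lines (sketch only):

    /* xc_domain_getinfo() returns the number of domains found,
     * starting from 'domid', so it can "succeed" while describing a
     * different domain entirely.  Hence the extra domid check. */
    xc_dominfo_t info;

    if ( xc_domain_getinfo(xch, domid, 1, &info) != 1 ||
         info.domid != domid )
        return -1;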
2) You allocate a libxl context but do nothing with it. That can
all go, along with the linkage against libxl. Also, you don't need
to create a logger like that. Despite being utterly unacceptable
behaviour for a library, it is what you get by default by passing
NULL to xc_interface_open().
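
i.e. the whole setup can reduce to something like (sketch):

    /* NULL loggers select the default behaviour; no libxl context or
     * hand-rolled logger is needed. */
    xc_interface *xch = xc_interface_open(NULL, NULL, 0);

    if ( !xch )
        return -1;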
3) A malloc()/memset() pair is more commonly spelt calloc().
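
i.e.:

    /* Instead of:  buf = malloc(size); memset(buf, 0, size); */
    void *buf = calloc(1, size);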
And some questions.
1) I'm guessing the drakvuf_inject_trap(drakvuf, 0x293e6a0, 0) call
is specific to the exact Windows kernel in use?
2) In vmi_init(), what is the purpose of fmask and zero_page_gfn?
You add one extra gfn to the guest, called zero_page, and fill it
with 1's from fmask.
3) You create two altp2m's, but both have the same default access.
Is this deliberate, or a bug? If deliberate, why?
Finally, and probably the source of the memory corruption...
4) When injecting a trap, you allocate a new gfn, memcpy() the
contents and insert a 0xcc (so far so good). You then remap the
executable view to point at the new gfn with a breakpoint in (fine),
and remap the readable view to point at the zero_page, which is full
of 1's (uh-oh).
What is this final step trying to achieve? It guarantees that
patch-guard will eventually notice and BSOD your VM for critical
structure corruption. The read-only view needs to point to the
original gfn with only read permissions, so that when Windows reads
the gfn back, it sees what it expects. You also need to prohibit
writes to either gfn, so that you can spot writes (unlikely in this
case, but important for general introspection) and propagate the
change to both copies.
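
Something along these lines (a sketch of the intent, with illustrative
view ids; not a drop-in fix):

    /* Executable view: point at the shadow gfn containing the 0xcc,
     * executable but not writable. */
    xc_altp2m_change_gfn(xch, domid, exec_view, gfn, shadow_gfn);
    xc_altp2m_set_mem_access(xch, domid, exec_view, shadow_gfn,
                             XENMEM_access_rx);

    /* Readable view: leave the *original* gfn in place (no zero_page),
     * restricted to read-only so that writes can be spotted and
     * propagated to both copies. */
    xc_altp2m_set_mem_access(xch, domid, read_view, gfn,
                             XENMEM_access_r);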
> I used the compat APIs, like Drakvuf does.
>
> @Tamas, if you could check the traps implementation.
>
> You also have stress-test.py, which is the small test suite that I
> used, and the screenshot showing the stdout preceding a test failure,
> when Ansible couldn't contact the WinRM service because the domain was
> frozen.
>
> Note: I stole some code from libvmi, to handle page read/write in Xen.
>
> PS: in the case where the domain is frozen and I destroy it, a (null)
> entry remains in xl list, even though my stress-test.py process is
> already dead. I have 4 of these entries in my xl list right now.
That's almost certainly a reference not being dropped on a page.
Can you run `xl debug-keys q` and paste the resulting analysis which
will be visible in `xl dmesg`?
It is probably some missing cleanup in the altp2m code.
~Andrew