
Re: x86 instruction emulation backstory?


  • To: Alex Olson <this.is.a0lson@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 14 Apr 2023 20:54:41 +0100
  • Delivery-date: Fri, 14 Apr 2023 19:55:18 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 14/04/2023 7:33 pm, Alex Olson wrote:
> I've been digging into VMX internals and I see why MMIO emulation pretty much
> requires x86 instruction emulation.  Even the Linux KVM code borrowed Xen's
> emulation...
>
> Thus, I'm trying to understand Xen's x86 emulation implementation...
>
> How was it developed? (x86 instruction handling is incredibly complex!) 
>
> Was it originally part of a general purpose x86 emulator?

Xen's emulator (in this form at least) is 18 years old, dating from March 2005:

https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=4c5eeec983495e347c6ab3d40a4a70cdbdfce9af

and it was written from scratch, but you can even see in the context for
x86/traps.c that emulate_privileged_op() predates that.  (We decided to
consolidate down to a single instruction decoder/emulator at the point
that we were maintaining 4 different ad-hoc ones.)

As for development, it's all there in git log if you want to go looking :).

> It looks like it implements more instructions than just ones that can access
> memory, such as "AAM"?  (Why is this)?

All instructions have an implicit memory operand at %rip.  The CPU has
to fetch the opcode bytes from somewhere...  (See Introspection, later)


You've found MMIO, but emulating from a #GP fault was also an important
use case even back then.  PV guest kernels execute in Ring1 (32bit) or
Ring3 (64bit), and therefore cannot use CPL0 instructions.

While PV guests ought to use hypercalls for privileged operations, doing
so completely is very expensive in an existing codebase that you're
trying to port to Xen.  Therefore, Xen will emulate in a few faulting
conditions, so the guest can e.g. execute RDMSR and have it function
correctly (albeit painfully slowly).
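To make the #GP path concrete, here is a minimal sketch in C of what
"emulate on fault" means for RDMSR.  It is not Xen's code: struct
guest_regs, virtual_rdmsr() and emulate_gp_fault() are hypothetical names
for illustration, and the MSR value returned is made up.  Only the
encoding (0F 32) and the register behaviour (index in %ecx, result in
%edx:%eax) come from the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest register file; not Xen's real structure. */
struct guest_regs {
    uint64_t rax, rcx, rdx, rip;
};

/* Hypothetical policy hook: the value the guest should see for an MSR. */
static bool virtual_rdmsr(uint32_t msr, uint64_t *val)
{
    if (msr == 0x10) {              /* IA32_TIME_STAMP_COUNTER, as an example */
        *val = 0x123456789abcdef0ULL;   /* made-up value */
        return true;
    }
    return false;                   /* unknown MSR: caller would inject #GP */
}

/*
 * Called after a #GP fault, with the instruction bytes fetched from the
 * guest's %rip.  Returns true if the instruction was emulated and %rip
 * was advanced past it, false if the fault should go back to the guest.
 */
static bool emulate_gp_fault(struct guest_regs *regs,
                             const uint8_t *insn, size_t len)
{
    uint64_t val;

    assert(regs && insn);

    /* RDMSR is encoded as 0F 32 and reads the MSR indexed by %ecx. */
    if (len < 2 || insn[0] != 0x0f || insn[1] != 0x32)
        return false;

    if (!virtual_rdmsr((uint32_t)regs->rcx, &val))
        return false;

    /* Result lands in %edx:%eax; the upper halves of %rax/%rdx are zeroed. */
    regs->rax = (uint32_t)val;
    regs->rdx = val >> 32;
    regs->rip += 2;                 /* skip the two opcode bytes */
    return true;
}
```

The guest never notices the fault: from its point of view RDMSR simply
completed, just much more slowly than it would natively.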


More recently, Hypervisor Introspection as a technology opens up a whole
load of interesting cases which want emulation.  A lot of introspection
boils down to removing permissions behind the scenes (e.g. making code
no-execute, or making data read-only) so violations cause an exit to the
hypervisor, and an introspection agent can make a judgement call.  99%
of cases are fine, and should proceed.

But, how do you do this?  You could lift the permissions, but then
malware on other vCPUs has a window of time in which it is free to
make modifications.  So instead you could pause the VM, lift the perms,
singlestep the trapping vCPU, restore the perms, and unpause it.  But
this has terrible performance to start with, and is an O(N^2) perf hit
with the number of vCPUs the VM has.

In practice, it is *far* cheaper to have Xen emulate the instruction,
than it is to play with pausing, perms and singlestepping.

But consider the 1% other case where continuing isn't fine.  One of the
supported options is to "emulate / discard" to try and skip the
instruction without making a real state modification.  This cannot be
done with singlestepping, and has to be done by software somewhere.  As
Xen already has an emulator, it's very easy to use a set of
write_discard() hooks in place of the real ones.
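A rough sketch of the hook idea, in C.  The names here (struct emul_ops,
write_real(), emulate_store_byte(), struct guest_mem) are hypothetical
and much simpler than anything in Xen; the only point is that the
emulation logic is identical and the caller chooses whether the write
reaches guest memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hook table: the emulator only touches memory through it. */
struct emul_ops {
    int (*write)(void *ctxt, size_t offset, const void *data, size_t bytes);
};

/* Stand-in for guest memory. */
struct guest_mem {
    uint8_t bytes[16];
};

/* Normal hook: commit the write to guest memory. */
static int write_real(void *ctxt, size_t offset, const void *data, size_t bytes)
{
    struct guest_mem *mem = ctxt;

    if (offset > sizeof(mem->bytes) || bytes > sizeof(mem->bytes) - offset)
        return -1;                  /* out of range */
    memcpy(&mem->bytes[offset], data, bytes);
    return 0;
}

/* Discard hook: report success but leave guest memory untouched. */
static int write_discard(void *ctxt, size_t offset, const void *data, size_t bytes)
{
    (void)ctxt; (void)offset; (void)data; (void)bytes;
    return 0;
}

static const struct emul_ops real_ops = { .write = write_real };
static const struct emul_ops discard_ops = { .write = write_discard };

/* Emulate a one-byte store; behaviour depends only on the hooks passed in. */
static int emulate_store_byte(const struct emul_ops *ops, void *ctxt,
                              size_t offset, uint8_t val)
{
    assert(ops && ops->write);
    return ops->write(ctxt, offset, &val, sizeof(val));
}
```

With discard_ops the instruction "completes" (so the instruction pointer
can move on) while the store it would have made never happens.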


As to the complexity: yes, and in truth Xen's emulator isn't fully an
emulator.

We pretty much emulate all the integer instructions, because most of
them are very simple, but we do not for the vector instructions.  What
we do for vector instructions is better described as decode and replay,
where we reconstruct a modified form of the instruction to operate on
local state, so we can piece together the overall reads and writes
without needing to implement the vector logic itself.
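To give a feel for how simple the integer side can be, here is a sketch
of AAM (the instruction from your question) in C.  This is not lifted
from Xen; struct aam_result and emulate_aam() are hypothetical names.
The arithmetic itself is architectural: AH = AL / imm8, AL = AL % imm8,
with SF, ZF and PF set from the new AL, and #DE if imm8 is zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result container for the sketch. */
struct aam_result {
    uint8_t al, ah;
    bool sf, zf, pf;
};

/*
 * Emulate AAM imm8.  Returns false when imm8 is zero, in which case a
 * real emulator would raise #DE instead of updating any state.
 */
static bool emulate_aam(uint8_t al, uint8_t imm8, struct aam_result *res)
{
    uint8_t p;

    assert(res);

    if (imm8 == 0)
        return false;

    res->ah = al / imm8;
    res->al = al % imm8;

    res->sf = (res->al & 0x80) != 0;
    res->zf = res->al == 0;

    /* PF is set when the result byte has an even number of set bits. */
    p = res->al;
    p ^= p >> 4;
    p ^= p >> 2;
    p ^= p >> 1;
    res->pf = !(p & 1);

    return true;
}
```

For example, AL = 0x35 (53) with the usual base of 10 gives AH = 5 and
AL = 3.  The vector instructions have no such short form, which is why
they are replayed rather than reimplemented.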

It's also worth saying that for any locked/atomic operations, we have to
issue a real instruction too, because that's the only way to get the
cache coherency behaviour correct.
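As a sketch of what that looks like (again not Xen's code, and
emulate_lock_cmpxchg32() is a hypothetical name), emulating a
LOCK CMPXCHG on a 32-bit operand means performing a genuinely atomic
compare-and-swap on the memory in question, rather than a separate
read, compare and write.  C11 atomics are used here on the assumption
that the compiler turns them into a real locked instruction on x86.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Emulate LOCK CMPXCHG r/m32, r32.
 *
 * If *dest equals *eax, src is stored to *dest and the function returns
 * true (ZF set).  Otherwise *eax is loaded with the value observed in
 * *dest and the function returns false (ZF clear).  The compare and the
 * store happen as one atomic operation, so another CPU cannot slip a
 * write in between them.
 */
static bool emulate_lock_cmpxchg32(_Atomic uint32_t *dest, uint32_t *eax,
                                   uint32_t src)
{
    assert(dest && eax);

    /* On failure, C11 writes the observed value into *eax, like CMPXCHG. */
    return atomic_compare_exchange_strong(dest, eax, src);
}
```

A plain "if (*dest == *eax) *dest = src;" would compute the same result
on a single CPU, but would lose updates as soon as another vCPU touches
the same location concurrently.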

It is also worth noting that Xen's emulator isn't complete.  Notably,
no one has implemented IRET for protected mode yet, or inter-privilege
far transfers, and we've got known corner cases (e.g. interrupt shadow
with MOV SS) in need of some work.


We went through a spate of problems where Windows in particular kept
coming up with more and more inventive instructions to use to write into
the emulated VGA framebuffer, and we decided that the emulator should be
as complete as we can reasonably make it.  A consequence of this is that
we have some very interesting and powerful advanced security features.

I hope this helps, or was at least interesting.

~Andrew