[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Xen-devel] [PATCH for-4.5 v6 04/16] xen: Add vmware_port support
- To: George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Don Slutz <dslutz@xxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxx
- From: Don Slutz <dslutz@xxxxxxxxxxx>
- Date: Wed, 24 Sep 2014 12:48:31 -0400
- Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, Keir Fraser <keir@xxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Eddie Dong <eddie.dong@xxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Aravind Gopalakrishnan <Aravind.Gopalakrishnan@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
- Delivery-date: Wed, 24 Sep 2014 16:49:10 +0000
- List-id: Xen developer discussion <xen-devel.lists.xen.org>
On 09/24/14 12:01, George Dunlap wrote:
On 09/20/2014 07:07 PM, Don Slutz wrote:
This includes adding is_vmware_port_enabled
This is a new domain_create() flag, DOMCRF_vmware_port. It is
passed to domctl as XEN_DOMCTL_CDF_vmware_port.
This enables limited support of VMware's hyper-call.
This is both a more complete support then in currently provided by
QEMU and/or KVM and less. The missing part requires QEMU changes
and has been left out until the QEMU patches are accepted upstream.
VMware's hyper-call is also known as VMware Backdoor I/O Port.
Note: this support does not depend on vmware_hw being non-zero.
Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
to port 0x5658 specially. Note: since many operations return data
in EAX, "in (%dx),%eax" is the one to use. The other lengths like
"in (%dx),%al" will still do things, only AL part of EAX will be
changed. For "out %eax,(%dx)" of all lengths, EAX will remain
unchanged.
Also this instruction is allowed to be used from ring 3. To
support this the vmexit for GP needs to be enabled. I have not
fully tested that nested HVM is doing the right thing for this.
An open source example of using this is:
http://open-vm-tools.sourceforge.net/
Which only uses "inl (%dx)". Also
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458
The support included is enough to allow VMware tools to install in a
HVM domU.
For a debug=y build there is a new command line option
vmport_debug=. It enabled output to the console of various
stages of handling the "in (%dx)" instruction.
Signed-off-by: Don Slutz <dslutz@xxxxxxxxxxx>
[snip]
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 7b1dfe6..e2e4aad 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -510,6 +510,8 @@ int arch_domain_create(struct domain *d, unsigned
int domcr_flags)
d->arch.hvm_domain.mem_sharing_enabled = 0;
d->arch.s3_integrity = !!(domcr_flags & DOMCRF_s3_integrity);
+ d->arch.hvm_domain.is_vmware_port_enabled =
+ (domcr_flags & DOMCRF_vmware_port);
Should this be "!!(domcr..."?
I do not think it is needed, but happy to change to that.
diff --git a/xen/arch/x86/hvm/vmware/vmport.c
b/xen/arch/x86/hvm/vmware/vmport.c
new file mode 100644
index 0000000..811c303
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/vmport.c
@@ -0,0 +1,326 @@
+/*
+ * HVM VMPORT emulation
+ *
+ * Copyright (C) 2012 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
<http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/config.h>
+#include <xen/lib.h>
+#include <asm/hvm/hvm.h>
+#include <asm/hvm/support.h>
+#include <asm/hvm/vmport.h>
+
+#include "backdoor_def.h"
+#include "guest_msg_def.h"
+
+#define MAX_INST_LEN 15
+
+#ifndef NDEBUG
+unsigned int opt_vmport_debug __read_mostly;
+integer_param("vmport_debug", opt_vmport_debug);
+#endif
+
+/* More VMware defines */
+
+#define VMWARE_GUI_AUTO_GRAB 0x001
+#define VMWARE_GUI_AUTO_UNGRAB 0x002
+#define VMWARE_GUI_AUTO_SCROLL 0x004
+#define VMWARE_GUI_AUTO_RAISE 0x008
+#define VMWARE_GUI_EXCHANGE_SELECTIONS 0x010
+#define VMWARE_GUI_WARP_CURSOR_ON_UNGRAB 0x020
+#define VMWARE_GUI_FULL_SCREEN 0x040
+
+#define VMWARE_GUI_TO_FULL_SCREEN 0x080
+#define VMWARE_GUI_TO_WINDOW 0x100
+
+#define VMWARE_GUI_AUTO_RAISE_DISABLED 0x200
+
+#define VMWARE_GUI_SYNC_TIME 0x400
+
+/* When set, toolboxes should not show the cursor options page. */
+#define VMWARE_DISABLE_CURSOR_OPTIONS 0x800
+
+void vmport_register(struct domain *d)
+{
+ register_portio_handler(d, BDOOR_PORT, 4, vmport_ioport);
+}
+
+int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t
*val)
+{
+ struct cpu_user_regs *regs = guest_cpu_user_regs();
+ uint32_t cmd = regs->rcx & 0xffff;
+ uint32_t magic = regs->rax;
+ int rc = X86EMUL_OKAY;
+
+ if ( magic == BDOOR_MAGIC )
+ {
+ uint64_t saved_rax = regs->rax;
+ uint64_t value;
+
+ VMPORT_DBG_LOG(VMPORT_LOG_TRACE,
+ "VMware trace dir=%d bytes=%u ip=%"PRIx64"
cmd=%d ax=%"
+ PRIx64" bx=%"PRIx64" cx=%"PRIx64"
dx=%"PRIx64" si=%"
+ PRIx64" di=%"PRIx64"\n", dir, bytes,
+ regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
+ regs->rdx, regs->rsi, regs->rdi);
+ switch ( cmd )
+ {
+ case BDOOR_CMD_GETMHZ:
+ /* ... */
+ regs->rbx = BDOOR_MAGIC;
+ regs->rax = current->domain->arch.tsc_khz / 1000;
+ break;
+ case BDOOR_CMD_GETVERSION:
+ /* ... */
+ regs->rbx = BDOOR_MAGIC;
+ /* VERSION_MAGIC */
+ regs->rax = 6;
+ /* Claim we are an ESX. VMX_TYPE_SCALABLE_SERVER */
+ regs->rcx = 2;
+ break;
+ case BDOOR_CMD_GETSCREENSIZE:
+ /* We have no screen size */
+ regs->rax = 0;
+ break;
+ case BDOOR_CMD_GETHWVERSION:
+ /* vmware_hw */
+ regs->rax = 0;
+ if ( is_hvm_vcpu(current) )
+ {
+ struct hvm_domain *hd =
¤t->domain->arch.hvm_domain;
+
+ regs->rax = hd->params[HVM_PARAM_VMWARE_HW];
+ }
+ if ( !regs->rax )
+ regs->rax = 4; /* Act like version 4 */
+ break;
+ case BDOOR_CMD_GETHZ:
+ value = current->domain->arch.tsc_khz * 1000;
+ /* apic-frequency (bus speed) */
+ regs->rcx = (uint32_t)(1000000000ULL / APIC_BUS_CYCLE_NS);
+ /* High part of tsc-frequency */
+ regs->rbx = (uint32_t)(value >> 32);
+ /* Low part of tsc-frequency */
+ regs->rax = value;
Either the comment or the code here is wrong -- this is clearly not
the lower 32 bits, at least on 64-bit guests. :-)
Opps, it should have included the (uint32_t) cast also. Will fix.
If the code is right -- that is, if a 32-bit guest find this truncated
automatically, but a 64-bit guest find all 64 bits here (and thus not
have to reconstruct it) -- you should make the comment more
informative; for example:
/* On 32-bit systems this will be the lower 32 bits. 64-bit systems
can just use the full value from rax. */
(Word-wrapped, of course.)
Hmm -- looks like regs->rax will be clipped to 32 bits for a 4-byte IO
read? In which case the comment here should reflect this, but you
have the same basic issue for BDOOR_CMD_GETTIMEFUL regs->rdx (which
will not be clipped, I don't think).
It will also be "adjusted" for 2 or 1 byte IO. rdx does not get clipped
later, but is clipped to 32bits (see below).
+ break;
+ case BDOOR_CMD_GETTIME:
+ value = get_localtime_us(current->domain);
+ /* hostUsecs */
+ regs->rbx = (uint32_t)(value % 1000000UL);
+ /* hostSecs */
+ regs->rax = value / 1000000ULL;
+ /* maxTimeLag */
+ regs->rcx = 0;
+ break;
+ case BDOOR_CMD_GETTIMEFULL:
+ value = get_localtime_us(current->domain);
+ /* ... */
+ regs->rax = BDOOR_MAGIC;
+ /* hostUsecs */
+ regs->rbx = (uint32_t)(value % 1000000UL);
+ /* High part of hostSecs */
+ regs->rsi = (uint32_t)((value / 1000000ULL) >> 32);
+ /* Low part of hostSecs */
+ regs->rdx = (uint32_t)(value / 1000000ULL);
Same here.
But the (uint32_t) does make it just 32bits.
+ /* maxTimeLag */
+ regs->rcx = 0;
+ break;
+ case BDOOR_CMD_GETGUIOPTIONS:
+ regs->rax = VMWARE_GUI_AUTO_GRAB | VMWARE_GUI_AUTO_UNGRAB |
+ VMWARE_GUI_AUTO_RAISE_DISABLED | VMWARE_GUI_SYNC_TIME |
+ VMWARE_DISABLE_CURSOR_OPTIONS;
+ break;
+ case BDOOR_CMD_SETGUIOPTIONS:
+ regs->rax = 0x0;
+ break;
+ default:
+ VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
+ "VMware bytes=%d dir=%d cmd=%d",
+ bytes, dir, cmd);
+ break;
+ }
+ VMPORT_DBG_LOG(VMPORT_LOG_VMWARE_AFTER,
+ "VMware after ip=%"PRIx64" cmd=%d
ax=%"PRIx64" bx=%"
+ PRIx64" cx=%"PRIx64" dx=%"PRIx64"
si=%"PRIx64" di=%"
+ PRIx64"\n",
+ regs->rip, cmd, regs->rax, regs->rbx, regs->rcx,
+ regs->rdx, regs->rsi, regs->rdi);
+ if ( dir == IOREQ_READ )
+ {
+ switch ( bytes )
+ {
+ case 1:
+ regs->rax = (saved_rax & 0xffffff00) | (regs->rax &
0xff);
+ break;
+ case 2:
+ regs->rax = (saved_rax & 0xffff0000) | (regs->rax &
0xffff);
+ break;
+ case 4:
+ regs->rax = (uint32_t)regs->rax;
+ break;
+ }
+ *val = regs->rax;
+ }
+ else
+ regs->rax = saved_rax;
+ }
+ else
+ {
+ rc = X86EMUL_UNHANDLEABLE;
+ VMPORT_DBG_LOG(VMPORT_LOG_ERROR,
+ "Not VMware %x vs %x; ip=%"PRIx64" ax=%"PRIx64
+ " bx=%"PRIx64" cx=%"PRIx64" dx=%"PRIx64"
si=%"PRIx64
+ " di=%"PRIx64"",
+ magic, BDOOR_MAGIC, regs->rip, regs->rax,
regs->rbx,
+ regs->rcx, regs->rdx, regs->rsi, regs->rdi);
+ }
+
+ return rc;
+}
+
+int vmport_gp_check(struct cpu_user_regs *regs, struct vcpu *v,
+ unsigned long *inst_len, unsigned long inst_addr,
+ unsigned long ei1, unsigned long ei2)
I've wondered, in another e-mail, whether it would make more sense to
try to re-use the normal Xen emulation code, instead of duplicating
its IO instruction decoding stuff here.
Since this is only called on a #GP, I do not what to attempt to emulate
the instruction by the normal way. In fact the normal Xen emulation
should say "do a #GP", not do the VMware hypercall.
I will say that I had not looked into getting the normal Xen emulation
"fixed" for this case. In all my testing, I have not see this issue.
With the patch:
Subject: [Xen-devel] [PATCH 5/6] x86/hvm: Forced Emulation Prefix for debug
builds of Xen
Message-ID: <1411484611-31027-6-git-send-email-andrew.cooper3@xxxxxxxxxx>
I need to look into this.
I think I probably wouldn't make that a blocker for acceptance,
though. However...
Thanks.
+{
+ ASSERT(v->domain->arch.hvm_domain.is_vmware_port_enabled);
At the moment I think this ASSERT is misplaced; there are no checks
for this anywhere in the handler path to this point. If at any time
in the future (or for any other reason) #GP exiting gets enabled (for
example, if the introspection stuff wants to be notifed on #GPs),
you'll end up going through this path whether or not
is_vmware_port_enabled is true.
I think you should instead "if(!is_vmware_port_enabled) return" here.
That would effectively isolate these new changes from being able to
introduce security issues for VMs which don't enable vmware_port,
making it less risky to accept as-is.
Ok, it was a return in an older version. Happy to switch back.
[snip]
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index fc1f882..6fe9389 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1102,6 +1102,8 @@ static int construct_vmcs(struct vcpu *v)
v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK
| (paging_mode_hap(d) ? 0 : (1U << TRAP_page_fault))
+ | (v->domain->arch.hvm_domain.is_vmware_port_enabled ?
+ (1U << TRAP_gp_fault) : 0)
| (1U << TRAP_no_device);
Probably not mandatory, but it might be nice to pull this bitmap logic
into a function, so that you don't have the code duplication (here and
in vmx_update_guest_cr()).
Ok, Will keep it in mind.
-Don Slutz
-George
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|