Xen project Mailing List

Re: [Xen-devel] [PATCH v7 27/32] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs

To: Roger Pau Monne <roger.pau@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Mon, 5 Oct 2015 11:28:00 +0100

Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>

Delivery-date: Mon, 05 Oct 2015 10:28:42 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 02/10/15 16:48, Roger Pau Monne wrote:
> Allow the usage of the VCPUOP_initialise, VCPUOP_up, VCPUOP_down and
> VCPUOP_is_up hypercalls from HVM guests.
>
> This patch introduces a new structure (vcpu_hvm_context) that should be used
> in conjunction with the VCPUOP_initialise hypercall in order to initialize
> vCPUs for HVM guests.
>
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: Ian Campbell <ian.campbell@xxxxxxxxxx>
> Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
> ---
> Changes since v6:
>  - Add comments to clarify some initializations.
>  - Introduce a generic default_initialize_vcpu that's used to initialize a
>    ARM vCPU or a x86 PV vCPU.
>  - Move the undef of the SEG macro.
>  - Fix the size of the eflags register, it should be 32bits.
>  - Add a comment regarding the value of the 12-15 bits of the _ar fields.
>  - Remove the 16bit structure, the 32bit one can be used to start the cpu in
>    real mode.
>  - Add some sanity checks to the values passed in.
>  - Add paddings to vcpu_hvm_context so the layout on 32/64bits is the same.
>  - Add support for the compat version of VCPUOP_initialise.
>
> Changes since v5:
>  - Fix a coding style issue.
>  - Merge the code from wip-dmlite-v5-refactor by Andrew in order to reduce
>    bloat.
>  - Print the offending %cr3 in case of error when using shadow.
>  - Reduce the scope of local variables in arch_initialize_vcpu.
>  - s/current->domain/v->domain/g in arch_initialize_vcpu.
>  - Expand the comment in public/vcpu.h to document the usage of
>    vcpu_hvm_context for HVM guests.
>  - Add myself as the copyright holder for the public hvm_vcpu.h header.
>
> Changes since v4:
>  - Don't assume mode is 64B, add an explicit check.
>  - Don't set TF_kernel_mode, it is only needed for PV guests.
>  - Don't set CR0_ET unconditionally.
> ---
>  xen/arch/x86/domain.c             | 185 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/hvm/hvm.c            |   8 ++
>  xen/common/compat/domain.c        |  71 +++++++++++----
>  xen/common/domain.c               |  56 +++++++++---
>  xen/include/Makefile              |   1 +
>  xen/include/asm-x86/domain.h      |   3 +
>  xen/include/public/hvm/hvm_vcpu.h | 144 +++++++++++++++++++++++++++++
>  xen/include/public/vcpu.h         |   6 +-
>  xen/include/xlat.lst              |   3 +
>  9 files changed, 448 insertions(+), 29 deletions(-)
>  create mode 100644 xen/include/public/hvm/hvm_vcpu.h
>
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index a3b1c9b..af5feea 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -37,6 +37,7 @@
>  #include <xen/wait.h>
>  #include <xen/guest_access.h>
>  #include <public/sysctl.h>
> +#include <public/hvm/hvm_vcpu.h>
>  #include <asm/regs.h>
>  #include <asm/mc146818rtc.h>
>  #include <asm/system.h>
> @@ -1176,6 +1177,190 @@ int arch_set_info_guest(
>  #undef c
>  }
>
> +/* Called by VCPUOP_initialise for HVM guests. */
> +int arch_set_info_hvm_guest(struct vcpu *v, vcpu_hvm_context_t *ctx)
> +{
> +    struct cpu_user_regs *uregs = &v->arch.user_regs;
> +    struct segment_register cs, ds, ss, es, tr;
> +
> +    switch ( ctx->mode )
> +    {
> +    default:
> +        return -EINVAL;
> +
> +    case VCPU_HVM_MODE_32B:
> +    {
> +        const struct vcpu_hvm_x86_32 *regs = &ctx->cpu_regs.x86_32;
> +        uint32_t limit;
> +
> +#define SEG(s, r)                                                        \
> +    (struct segment_register){ .sel = 0, .base = (r)->s ## _base,        \
> +            .limit = (r)->s ## _limit, .attr.bytes = (r)->s ## _ar }
> +        cs = SEG(cs, regs);
> +        ds = SEG(ds, regs);
> +        ss = SEG(ss, regs);
> +        es = SEG(es, regs);
> +        tr = SEG(tr, regs);
> +#undef SEG
> +
> +        /* Basic sanity checks. */
> +        if ( cs.attr.fields.pad != 0 || ds.attr.fields.pad != 0 ||
> +             ss.attr.fields.pad != 0 || es.attr.fields.pad != 0 ||
> +             tr.attr.fields.pad != 0 )
> +        {
> +            gprintk(XENLOG_ERR, "Attribute bits 12-15 of the segments are not null\n");

I would use 'zero' as opposed to 'null' here.
There is nothing to do with pointers here.

> +            return -EINVAL;
> +        }
> +
> +        limit = cs.limit * (cs.attr.fields.g ? PAGE_SIZE : 1);

This will overflow in the common case.  Calculation of the limit is a
little awkward.  I believe this should do:

    limit = cs.limit;
    if ( cs.attr.fields.g )
        limit = (limit << 12) | 0xfff;

In the case that g is set and cs is a flat segment, limit should have
the value ~0U, rather than 0 which is what your calculation will achieve.

> +        if ( regs->eip > limit )
> +        {
> +            gprintk(XENLOG_ERR, "EIP address is outside of the CS limit\n");

In all cases, please print out the values, to make the error message
more helpful.  e.g. "EIP (%08x) outside CS limit (%08x)"

> +            return -EINVAL;
> +        }
> +
> +        if ( ds.attr.fields.dpl > cs.attr.fields.dpl )
> +        {
> +            gprintk(XENLOG_ERR, "DPL of DS is greater than DPL of CS\n");
> +            return -EINVAL;
> +        }
> +
> +        if ( ss.attr.fields.dpl != cs.attr.fields.dpl )
> +        {
> +            gprintk(XENLOG_ERR, "DPL of SS is different than DPL of CS\n");
> +            return -EINVAL;
> +        }
> +
> +        if ( es.attr.fields.dpl > cs.attr.fields.dpl )
> +        {
> +            gprintk(XENLOG_ERR, "DPL of ES is greater than DPL of CS\n");
> +            return -EINVAL;
> +        }
> +
> +        if ( ((regs->efer & EFER_LMA) && !(regs->efer & EFER_LME)) ||
> +             ((regs->efer & EFER_LME) && !(regs->efer & EFER_LMA)) )

This simplifies to
( (!!(regs->efer & EFER_LMA)) ^ (!!(regs->efer & EFER_LME)) )

> +        {
> +            gprintk(XENLOG_ERR, "EFER.LMA and EFER.LME must be both set\n");

And this should say "both the same", rather than both set.

Having said this, I still don't think it is sensible to require that LMA
is set, seeing as it is strictly a read-only bit in EFER.  I would
suggest keying on LME alone, and automatically ORing in LMA, which
matches the behaviour of hardware more closely.
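To make the two suggestions above concrete (the limit expansion and the EFER handling), here is a minimal standalone sketch.  It is not Xen code: effective_limit(), efer_consistent() and efer_with_lma() are hypothetical helper names, and EFER_LME/EFER_LMA are defined locally (bits 8 and 10 of the MSR) purely so the snippet compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins so the sketch is self-contained (assumption: not Xen's headers). */
#define EFER_LME (1u << 8)   /* Long mode enable */
#define EFER_LMA (1u << 10)  /* Long mode active, read-only on hardware */

/*
 * Expand a raw descriptor limit into a byte-granular limit.
 * With G set the limit counts 4K pages, and the low 12 bits read as ones,
 * so a flat segment (raw limit 0xfffff) yields ~0U.
 */
static uint32_t effective_limit(uint32_t raw_limit, bool g)
{
    uint32_t limit = raw_limit;

    if ( g )
        limit = (limit << 12) | 0xfff;

    return limit;
}

/* True when LME and LMA are "both the same": both set or both clear. */
static bool efer_consistent(uint64_t efer)
{
    return !(!!(efer & EFER_LMA) ^ !!(efer & EFER_LME));
}

/* Alternative: key on LME alone and OR in LMA on the caller's behalf. */
static uint64_t efer_with_lma(uint64_t efer)
{
    if ( efer & EFER_LME )
        efer |= EFER_LMA;

    return efer;
}
```

With something along these lines, the EIP check would compare against effective_limit(cs.limit, cs.attr.fields.g), and the LMA/LME sanity check could go away in favour of storing efer_with_lma(regs->efer).  What to do with a stray LMA when LME is clear is left open here.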
> +            return -EINVAL;
> +        }
> +
> +        uregs->rax    = regs->eax;
> +        uregs->rcx    = regs->ecx;
> +        uregs->rdx    = regs->edx;
> +        uregs->rbx    = regs->ebx;
> +        uregs->rsp    = regs->esp;
> +        uregs->rbp    = regs->ebp;
> +        uregs->rsi    = regs->esi;
> +        uregs->rdi    = regs->edi;
> +        uregs->rip    = regs->eip;
> +        uregs->rflags = regs->eflags;
> +
> +        v->arch.hvm_vcpu.guest_cr[0] = regs->cr0;
> +        v->arch.hvm_vcpu.guest_cr[3] = regs->cr3;
> +        v->arch.hvm_vcpu.guest_cr[4] = regs->cr4;
> +        v->arch.hvm_vcpu.guest_efer  = regs->efer;
> +    }
> +    break;
> +
> +    case VCPU_HVM_MODE_64B:
> +    {
> +        const struct vcpu_hvm_x86_64 *regs = &ctx->cpu_regs.x86_64;
> +
> +        /* Basic sanity checks. */
> +        if ( !is_canonical_address(regs->rip) )
> +        {
> +            gprintk(XENLOG_ERR, "RIP contains a non-canonical address\n");
> +            return -EINVAL;
> +        }
> +
> +        if ( !(regs->cr0 & X86_CR0_PG) )
> +        {
> +            gprintk(XENLOG_ERR, "CR0 doesn't have paging enabled\n");
> +            return -EINVAL;
> +        }
> +
> +        if ( !(regs->cr4 & X86_CR4_PAE) )
> +        {
> +            gprintk(XENLOG_ERR, "CR4 doesn't have PAE enabled\n");
> +            return -EINVAL;
> +        }
> +
> +        if ( (regs->efer & (EFER_LME | EFER_LMA)) != (EFER_LME | EFER_LMA) )
> +        {
> +            gprintk(XENLOG_ERR, "EFER doesn't have LME or LMA enabled\n");
> +            return -EINVAL;
> +        }
> +
> +        uregs->rax    = regs->rax;
> +        uregs->rcx    = regs->rcx;
> +        uregs->rdx    = regs->rdx;
> +        uregs->rbx    = regs->rbx;
> +        uregs->rsp    = regs->rsp;
> +        uregs->rbp    = regs->rbp;
> +        uregs->rsi    = regs->rsi;
> +        uregs->rdi    = regs->rdi;
> +        uregs->rip    = regs->rip;
> +        uregs->rflags = regs->rflags;
> +
> +        v->arch.hvm_vcpu.guest_cr[0] = regs->cr0;
> +        v->arch.hvm_vcpu.guest_cr[3] = regs->cr3;
> +        v->arch.hvm_vcpu.guest_cr[4] = regs->cr4;
> +        v->arch.hvm_vcpu.guest_efer  = regs->efer;
> +
> +#define SEG(b, l, a)                                                     \
> +    (struct segment_register){ .sel = 0, .base = (b), .limit = (l),      \
> +                               .attr.bytes = (a) }
> +        cs = SEG(0, ~0u, 0xa9b); /* 64bit code segment. */
> +        ds = ss = es = SEG(0, ~0u, 0xc93);
> +        tr = SEG(0, 0x67, 0x8b); /* 64bit TSS (busy). */
> +#undef SEG

I would be tempted to get rid of this macro entirely.  The other macro
was to hide all the regs-> references, but this is entirely from constants.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
