[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 2/2] x86/svm: Write the correct %eip into the outgoing task
On Thu, Nov 21, 2019 at 10:15:51PM +0000, Andrew Cooper wrote: > The TASK_SWITCH vmexit has fault semantics, and doesn't provide any NRIPs > assistance with instruction length. As a result, any instruction-induced task > switch has the outgoing task's %eip pointing at the instruction switch caused ^ that > the switch, rather than after it. > > This causes explicit use of task gates to livelock (as when the task returns, > it executes the task-switching instruction again), and any restartable task to > become a nop after its first instantiation (the entry state points at the > ret/iret instruction used to exit the task). > > 32bit Windows in particular is known to use task gates for NMI handling, and > to use NMI IPIs. > > In the task switch handler, distinguish instruction-induced from > interrupt/exception-induced task switches, and decode the instruction under > %rip to calculate its length. > > Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > --- > CC: Jan Beulich <JBeulich@xxxxxxxx> > CC: Wei Liu <wl@xxxxxxx> > CC: Roger Pau Monné <roger.pau@xxxxxxxxxx> > CC: Juergen Gross <jgross@xxxxxxxx> > > The implementation of svm_get_task_switch_insn_len() is bug-compatible with > svm_get_insn_len() when it comes to conditional #GP'ing. I still haven't had > time to address this more thoroughly. > > AMD does permit TASK_SWITCH not to be intercepted and, I'm informed does do > the right thing when it comes to a TSS crossing a page boundary. However, it > is not actually safe to leave task switches unintercepted. Any NPT or shadow > page fault, even from logdirty/paging/etc will corrupt guest state in an > unrecoverable manner. > --- > xen/arch/x86/hvm/svm/emulate.c | 55 > +++++++++++++++++++++++++++++++++++ > xen/arch/x86/hvm/svm/svm.c | 46 ++++++++++++++++++++++------- > xen/include/asm-x86/hvm/svm/emulate.h | 1 + > 3 files changed, 92 insertions(+), 10 deletions(-) > > diff --git a/xen/arch/x86/hvm/svm/emulate.c b/xen/arch/x86/hvm/svm/emulate.c > index 3e52592847..176c25f60d 100644 > --- a/xen/arch/x86/hvm/svm/emulate.c > +++ b/xen/arch/x86/hvm/svm/emulate.c > @@ -117,6 +117,61 @@ unsigned int svm_get_insn_len(struct vcpu *v, unsigned > int instr_enc) > } > > /* > + * TASK_SWITCH vmexits never provide an instruction length. We must always > + * decode under %rip to find the answer. > + */ > +unsigned int svm_get_task_switch_insn_len(struct vcpu *v) > +{ > + struct hvm_emulate_ctxt ctxt; > + struct x86_emulate_state *state; > + unsigned int emul_len, modrm_reg; > + > + ASSERT(v == current); > + hvm_emulate_init_once(&ctxt, NULL, guest_cpu_user_regs()); > + hvm_emulate_init_per_insn(&ctxt, NULL, 0); > + state = x86_decode_insn(&ctxt.ctxt, hvmemul_insn_fetch); > + if ( IS_ERR_OR_NULL(state) ) Maybe crash the guest in this case? Not advancing the instruction pointer in a software induced task switch will create a loop AFAICT? > + return 0; > + > + emul_len = x86_insn_length(state, &ctxt.ctxt); > + > + /* > + * Check for an instruction which can cause a task switch. Any far > + * jmp/call/ret, any software interrupt/exception, and iret. > + */ > + switch ( ctxt.ctxt.opcode ) > + { > + case 0xff: /* Grp 5 */ > + /* call / jmp (far, absolute indirect) */ > + if ( x86_insn_modrm(state, NULL, &modrm_reg) != 3 || > + (modrm_reg != 3 && modrm_reg != 5) ) > + { > + /* Wrong instruction. Throw #GP back for now. */ > + default: > + hvm_inject_hw_exception(TRAP_gp_fault, 0); > + emul_len = 0; > + break; > + } > + /* Fallthrough */ > + case 0x62: /* bound */ > + case 0x9a: /* call (far, absolute) */ I'm slightly loss here, in the case of call or jmp for example, don't you need the instruction pointer to point to the destination of the call/jmp instead of the next instruction? > + case 0xca: /* ret imm16 (far) */ > + case 0xcb: /* ret (far) */ > + case 0xcc: /* int3 */ > + case 0xcd: /* int imm8 */ > + case 0xce: /* into */ > + case 0xcf: /* iret */ > + case 0xea: /* jmp (far, absolute) */ > + case 0xf1: /* icebp */ > + break; > + } > + > + x86_emulate_free_state(state); > + > + return emul_len; > +} > + > +/* > * Local variables: > * mode: C > * c-file-style: "BSD" > diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c > index 049b800e20..ba9c24a70c 100644 > --- a/xen/arch/x86/hvm/svm/svm.c > +++ b/xen/arch/x86/hvm/svm/svm.c > @@ -2776,7 +2776,41 @@ void svm_vmexit_handler(struct cpu_user_regs *regs) > > case VMEXIT_TASK_SWITCH: { > enum hvm_task_switch_reason reason; > - int32_t errcode = -1; > + int32_t errcode = -1, insn_len = -1; Plain int seem better for insn_len? Also I'm not sure there's a reason that errcode uses int32_t, but that's not introduced here anyway. Thanks, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |