[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/2] x86/svm: Write the correct %eip into the outgoing task



On 21.11.2019 23:15, Andrew Cooper wrote:
> --- a/xen/arch/x86/hvm/svm/emulate.c
> +++ b/xen/arch/x86/hvm/svm/emulate.c
> @@ -117,6 +117,61 @@ unsigned int svm_get_insn_len(struct vcpu *v, unsigned 
> int instr_enc)
>  }
>  
>  /*
> + * TASK_SWITCH vmexits never provide an instruction length.  We must always
> + * decode under %rip to find the answer.
> + */
> +unsigned int svm_get_task_switch_insn_len(struct vcpu *v)
> +{
> +    struct hvm_emulate_ctxt ctxt;
> +    struct x86_emulate_state *state;
> +    unsigned int emul_len, modrm_reg;
> +
> +    ASSERT(v == current);

You look to be using v here just for this ASSERT() - is this really
worth it? By making the function take "void" it would be quite obvious
that it would act on the current vCPU only.

> +    hvm_emulate_init_once(&ctxt, NULL, guest_cpu_user_regs());
> +    hvm_emulate_init_per_insn(&ctxt, NULL, 0);
> +    state = x86_decode_insn(&ctxt.ctxt, hvmemul_insn_fetch);
> +    if ( IS_ERR_OR_NULL(state) )
> +        return 0;
> +
> +    emul_len = x86_insn_length(state, &ctxt.ctxt);
> +
> +    /*
> +     * Check for an instruction which can cause a task switch.  Any far
> +     * jmp/call/ret, any software interrupt/exception, and iret.
> +     */
> +    switch ( ctxt.ctxt.opcode )
> +    {
> +    case 0xff: /* Grp 5 */
> +        /* call / jmp (far, absolute indirect) */
> +        if ( x86_insn_modrm(state, NULL, &modrm_reg) != 3 ||

DYM "== 3", to bail upon non-memory operands?

> +             (modrm_reg != 3 && modrm_reg != 5) )
> +        {
> +            /* Wrong instruction.  Throw #GP back for now. */
> +    default:
> +            hvm_inject_hw_exception(TRAP_gp_fault, 0);
> +            emul_len = 0;
> +            break;
> +        }
> +        /* Fallthrough */
> +    case 0x62: /* bound */

Does "bound" really belong on this list? It raising #BR is like
insns raising random other exceptions, not like INTO / INT3,
where the IDT descriptor also has to have suitable DPL for the
exception to actually get delivered (rather than #GP). I.e. it
shouldn't make it here in the first place, due to the
X86_EVENTTYPE_HW_EXCEPTION check in the caller.

IOW if "bound" needs to be here, then all others need to be as
well, unless they can't cause any exception at all.

> +    case 0x9a: /* call (far, absolute) */
> +    case 0xca: /* ret imm16 (far) */
> +    case 0xcb: /* ret (far) */
> +    case 0xcc: /* int3 */
> +    case 0xcd: /* int imm8 */
> +    case 0xce: /* into */
> +    case 0xcf: /* iret */
> +    case 0xea: /* jmp (far, absolute) */
> +    case 0xf1: /* icebp */

Same perhaps for ICEBP, albeit I'm less certain here, as its
behavior is too poorly documented (if at all).

> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -2776,7 +2776,41 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
>  
>      case VMEXIT_TASK_SWITCH: {
>          enum hvm_task_switch_reason reason;
> -        int32_t errcode = -1;
> +        int32_t errcode = -1, insn_len = -1;
> +
> +        /*
> +         * All TASK_SWITCH intercepts have fault-like semantics.  NRIP is
> +         * never provided, even for instruction-induced task switches, but we
> +         * need to know the instruction length in order to set %eip suitably
> +         * in the outgoing TSS.
> +         *
> +         * For a task switch which vectored through the IDT, look at the type
> +         * to distinguish interrupts/exceptions from instruction based
> +         * switches.
> +         */
> +        if ( vmcb->eventinj.fields.v )
> +        {
> +            /*
> +             * HW_EXCEPTION, NMI and EXT_INTR are not instruction based.  All
> +             * others are.
> +             */
> +            if ( vmcb->eventinj.fields.type <= X86_EVENTTYPE_HW_EXCEPTION )
> +                insn_len = 0;
> +
> +            /*
> +             * Clobber the vectoring information, as we are going to emulate
> +             * the task switch in full.
> +             */
> +            vmcb->eventinj.bytes = 0;
> +        }
> +
> +        /*
> +         * insn_len being -1 indicates that we have an instruction-induced
> +         * task switch.  Decode under %rip to find its length.
> +         */
> +        if ( insn_len < 0 && (insn_len = svm_get_task_switch_insn_len(v)) == 
> 0 )
> +            break;

Won't this live-lock the guest? I.e. isn't it better to e.g. crash it
if svm_get_task_switch_insn_len() didn't raise #GP(0)?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.