
Re: [Xen-devel] [PATCH 2/2] x86/svm: Write the correct %eip into the outgoing task


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 22 Nov 2019 13:55:29 +0000
  • Cc: Juergen Gross <jgross@xxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Fri, 22 Nov 2019 13:55:37 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 22/11/2019 13:31, Jan Beulich wrote:
> On 21.11.2019 23:15, Andrew Cooper wrote:
>> --- a/xen/arch/x86/hvm/svm/emulate.c
>> +++ b/xen/arch/x86/hvm/svm/emulate.c
>> @@ -117,6 +117,61 @@ unsigned int svm_get_insn_len(struct vcpu *v, unsigned int instr_enc)
>>  }
>>  
>>  /*
>> + * TASK_SWITCH vmexits never provide an instruction length.  We must always
>> + * decode under %rip to find the answer.
>> + */
>> +unsigned int svm_get_task_switch_insn_len(struct vcpu *v)
>> +{
>> +    struct hvm_emulate_ctxt ctxt;
>> +    struct x86_emulate_state *state;
>> +    unsigned int emul_len, modrm_reg;
>> +
>> +    ASSERT(v == current);
> You look to be using v here just for this ASSERT() - is this really
> worth it? By making the function take "void" it would be quite obvious
> that it would act on the current vCPU only.

This was cribbed largely from svm_get_insn_len(), which also behaves the
same.

>
>> +    hvm_emulate_init_once(&ctxt, NULL, guest_cpu_user_regs());
>> +    hvm_emulate_init_per_insn(&ctxt, NULL, 0);
>> +    state = x86_decode_insn(&ctxt.ctxt, hvmemul_insn_fetch);
>> +    if ( IS_ERR_OR_NULL(state) )
>> +        return 0;
>> +
>> +    emul_len = x86_insn_length(state, &ctxt.ctxt);
>> +
>> +    /*
>> +     * Check for an instruction which can cause a task switch.  Any far
>> +     * jmp/call/ret, any software interrupt/exception, and iret.
>> +     */
>> +    switch ( ctxt.ctxt.opcode )
>> +    {
>> +    case 0xff: /* Grp 5 */
>> +        /* call / jmp (far, absolute indirect) */
>> +        if ( x86_insn_modrm(state, NULL, &modrm_reg) != 3 ||
> DYM "== 3", to bail upon non-memory operands?

Ah yes (and this demonstrates that I really need to get an XTF test
sorted soon).
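
To illustrate the corrected condition, here is a minimal standalone sketch of
the Grp 5 check, outside of Xen. It assumes only the architectural ModRM layout
(mod in bits 7:6, reg in bits 5:3); the helper names are hypothetical and are
not part of the patch. Far call is /3 and far jmp is /5, and both require a
memory operand, so mod == 3 has to be rejected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Split a ModRM byte into its mod (bits 7:6) and reg (bits 5:3) fields. */
static unsigned int modrm_mod(uint8_t modrm) { return modrm >> 6; }
static unsigned int modrm_reg(uint8_t modrm) { return (modrm >> 3) & 7; }

/*
 * Hypothetical helper: true if opcode 0xff (Grp 5) with this ModRM byte
 * encodes a far call (/3) or far jmp (/5).  Both forms take a memory
 * operand, so mod == 3 (register operand) is not a valid encoding.
 */
static bool grp5_is_far_branch(uint8_t modrm)
{
    unsigned int reg = modrm_reg(modrm);

    if ( modrm_mod(modrm) == 3 )
        return false;

    return reg == 3 || reg == 5;
}
```

In the patch itself the equivalent bail-out condition would be
"mod == 3 || (reg != 3 && reg != 5)".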

>
>> +             (modrm_reg != 3 && modrm_reg != 5) )
>> +        {
>> +            /* Wrong instruction.  Throw #GP back for now. */
>> +    default:
>> +            hvm_inject_hw_exception(TRAP_gp_fault, 0);
>> +            emul_len = 0;
>> +            break;
>> +        }
>> +        /* Fallthrough */
>> +    case 0x62: /* bound */
> Does "bound" really belong on this list? It raising #BR is like
> insns raising random other exceptions, not like INTO / INT3,
> where the IDT descriptor also has to have suitable DPL for the
> exception to actually get delivered (rather than #GP). I.e. it
> shouldn't make it here in the first place, due to the
> X86_EVENTTYPE_HW_EXCEPTION check in the caller.
>
> IOW if "bound" needs to be here, then all others need to be as
> well, unless they can't cause any exception at all.

More experimentation required.  BOUND doesn't appear to be special cased
by SVM, but is by VT-x.  VT-x however does throw it in the same category
as #UD, and identify it to be a hardware exception.

I suspect you are right, and it doesn't want to be here.

>> +    case 0x9a: /* call (far, absolute) */
>> +    case 0xca: /* ret imm16 (far) */
>> +    case 0xcb: /* ret (far) */
>> +    case 0xcc: /* int3 */
>> +    case 0xcd: /* int imm8 */
>> +    case 0xce: /* into */
>> +    case 0xcf: /* iret */
>> +    case 0xea: /* jmp (far, absolute) */
>> +    case 0xf1: /* icebp */
> Same perhaps for ICEBP, albeit I'm less certain here, as its
> behavior is too poorly documented (if at all).

ICEBP's #DB is a trap, not a fault, so instruction length is important.
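
As a rough sketch of why the length matters: the intercept leaves %rip pointing
at the instruction, so for an instruction-based switch the %eip saved into the
outgoing TSS has to be advanced by the instruction length. The enum below
mirrors the values of Xen's X86_EVENTTYPE_* constants; the type and function
names are hypothetical and only model the logic, they are not the patch's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror Xen's X86_EVENTTYPE_* constants. */
enum event_type {
    EVENTTYPE_EXT_INTR         = 0,
    EVENTTYPE_NMI              = 2,
    EVENTTYPE_HW_EXCEPTION     = 3,
    EVENTTYPE_SW_INTERRUPT     = 4,
    EVENTTYPE_PRI_SW_EXCEPTION = 5,
    EVENTTYPE_SW_EXCEPTION     = 6,
};

/*
 * HW_EXCEPTION, NMI and EXT_INTR are not instruction based, so their
 * instruction length is 0.  Everything above HW_EXCEPTION is.
 */
static bool event_is_insn_based(enum event_type type)
{
    return type > EVENTTYPE_HW_EXCEPTION;
}

/* %eip to store in the outgoing TSS: advance past the instruction. */
static uint32_t outgoing_eip(uint32_t eip, unsigned int insn_len)
{
    return eip + insn_len;
}
```

For a one-byte icebp at 0x1000, outgoing_eip(0x1000, 1) gives 0x1001, whereas
a fault-like event with length 0 leaves %eip unchanged.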

>
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -2776,7 +2776,41 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
>>  
>>      case VMEXIT_TASK_SWITCH: {
>>          enum hvm_task_switch_reason reason;
>> -        int32_t errcode = -1;
>> +        int32_t errcode = -1, insn_len = -1;
>> +
>> +        /*
>> +         * All TASK_SWITCH intercepts have fault-like semantics.  NRIP is
>> +         * never provided, even for instruction-induced task switches, but we
>> +         * need to know the instruction length in order to set %eip suitably
>> +         * in the outgoing TSS.
>> +         *
>> +         * For a task switch which vectored through the IDT, look at the type
>> +         * to distinguish interrupts/exceptions from instruction based
>> +         * switches.
>> +         */
>> +        if ( vmcb->eventinj.fields.v )
>> +        {
>> +            /*
>> +             * HW_EXCEPTION, NMI and EXT_INTR are not instruction based.  All
>> +             * others are.
>> +             */
>> +            if ( vmcb->eventinj.fields.type <= X86_EVENTTYPE_HW_EXCEPTION )
>> +                insn_len = 0;
>> +
>> +            /*
>> +             * Clobber the vectoring information, as we are going to emulate
>> +             * the task switch in full.
>> +             */
>> +            vmcb->eventinj.bytes = 0;
>> +        }
>> +
>> +        /*
>> +         * insn_len being -1 indicates that we have an instruction-induced
>> +         * task switch.  Decode under %rip to find its length.
>> +         */
>> +        if ( insn_len < 0 && (insn_len = svm_get_task_switch_insn_len(v)) == 0 )
>> +            break;
> Won't this live-lock the guest?

Potentially, yes.

> I.e. isn't it better to e.g. crash it
> if svm_get_task_switch_insn_len() didn't raise #GP(0)?

No - that would need an XSA if we got it wrong, as none of these are
privileged instructions.

However, it occurs to me that we are in a position to use
svm_crash_or_fault(), so I'll respin with that in mind.

~Andrew
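
For context on the svm_crash_or_fault() suggestion, the following is a
standalone model of the behaviour that function is understood to have: inject
#UD when the guest is running at a non-zero CPL, and crash the domain
otherwise. That description is an assumption, as the thread does not spell it
out, and the names below are hypothetical rather than Xen's.

```c
/* Possible outcomes when the instruction length cannot be determined. */
enum failure_action {
    ACTION_INJECT_UD,     /* raise #UD in the guest */
    ACTION_CRASH_DOMAIN,  /* give up on the whole domain */
};

/*
 * Assumed policy: guest userspace (CPL > 0) only gets an exception, so it
 * cannot take the domain down; only the guest kernel (CPL 0) can.
 */
static enum failure_action crash_or_fault(unsigned int guest_cpl)
{
    return guest_cpl ? ACTION_INJECT_UD : ACTION_CRASH_DOMAIN;
}
```

Under that assumption, a decode failure neither live-locks the guest nor lets
an unprivileged guest process crash the domain.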
