[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v10 1/9] x86emul: address x86_insn_is_mem_{access, write}() omissions

On 29/05/2020 14:29, Jan Beulich wrote:
> On 29.05.2020 14:18, Andrew Cooper wrote:
>> On 25/05/2020 15:26, Jan Beulich wrote:
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -11474,25 +11474,87 @@ x86_insn_operand_ea(const struct x86_emu
>>>      return state->ea.mem.off;
>>>  }
>>> +/*
>>> + * This function means to return 'true' for all supported insns with 
>>> explicit
>>> + * accesses to memory.  This means also insns which don't have an explicit
>>> + * memory operand (like POP), but it does not mean e.g. segment selector
>>> + * loads, where the descriptor table access is considered an implicit one.
>>> + */
>>>  bool
>>>  x86_insn_is_mem_access(const struct x86_emulate_state *state,
>>>                         const struct x86_emulate_ctxt *ctxt)
>>>  {
>>> +    if ( mode_64bit() && state->not_64bit )
>>> +        return false;
>> Is this path actually used?
> Yes, it is. It's only x86_emulate() which has
>     generate_exception_if(state->not_64bit && mode_64bit(), EXC_UD);
> right now.

Oh.  That is a bit awkward.

>> state->not_64bit ought to fail instruction
>> decode, at which point we wouldn't have a valid state to be used here.
> x86_decode() currently doesn't have much raising of #UD at all, I
> think. If it wasn't like this, the not_64bit field wouldn't be
> needed - it's used only to communicate from decode to execute.
> We're not entirely consistent with this though, seeing in
> x86_decode_onebyte(), a few lines below the block of case labels
> setting that field,
>     case 0x9a: /* call (far, absolute) */
>     case 0xea: /* jmp (far, absolute) */
>         generate_exception_if(mode_64bit(), EXC_UD);

This is because there is no legitimate way to determine the end of the

Most of the not_64bit instructions do have a well-defined end, even if
they aren't permitted for use.

> We could certainly drop the field and raise #UD during decode
> already, but don't you think we then should do so for all
> encodings that ultimately lead to #UD, e.g. also for AVX insns
> without AVX available to the guest? This would make
> x86_decode() quite a bit larger, as it would then also need to
> have a giant switch() (or something else that's suitable to
> cover all cases).

I think there is a semantic difference between "we can't parse the
instruction", and "we can parse it, but it's not legitimate in this

I don't think our exact split is correct, but I don't think moving all
#UD checking into x86_decode() is correct either.




Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.