[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 1/5] x86/32on64: use generic instruction decoding for call gate emulation



On 06/12/16 13:17, Jan Beulich wrote:
>>>> On 06.12.16 at 12:56, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 06/12/16 11:12, Jan Beulich wrote:
>>> +static int gate_op_read(
>>> +    enum x86_segment seg,
>>> +    unsigned long offset,
>>> +    void *p_data,
>>> +    unsigned int bytes,
>>> +    struct x86_emulate_ctxt *ctxt)
>>> +{
>>> +    const struct gate_op_ctxt *goc =
>>> +        container_of(ctxt, struct gate_op_ctxt, ctxt);
>>> +    unsigned int rc = bytes, sel = 0;
>>> +    unsigned long addr = offset, limit = 0;
>>> +
>>> +    switch ( seg )
>>> +    {
>>> +    case x86_seg_cs:
>>> +        addr += goc->cs.base;
>>> +        limit = goc->cs.limit;
>>> +        break;
>>> +    case x86_seg_ds:
>>> +        sel = read_sreg(ds);
>>> +        break;
>>> +    case x86_seg_es:
>>> +        sel = read_sreg(es);
>>> +        break;
>>> +    case x86_seg_fs:
>>> +        sel = read_sreg(fs);
>>> +        break;
>>> +    case x86_seg_gs:
>>> +        sel = read_sreg(gs);
>>> +        break;
>>> +    case x86_seg_ss:
>>> +        sel = ctxt->regs->ss;
>>> +        break;
>>> +    default:
>>> +        return X86EMUL_UNHANDLEABLE;
>>> +    }
>>> +    if ( sel )
>>> +    {
>>> +        unsigned int ar;
>>> +
>>> +        ASSERT(!goc->insn_fetch);
>>> +        if ( !read_descriptor(sel, current, &addr, &limit, &ar, 0) ||
>>> +             !(ar & _SEGMENT_S) ||
>>> +             !(ar & _SEGMENT_P) ||
>>> +             ((ar & _SEGMENT_CODE) && !(ar & _SEGMENT_WR)) )
>>> +            return X86EMUL_UNHANDLEABLE;
>>> +        addr += offset;
>>> +    }
>>> +    else if ( seg != x86_seg_cs )
>>> +        return X86EMUL_UNHANDLEABLE;
>>> +
>>> +    if ( limit < bytes - 1 || offset > limit - bytes + 1 )
>> Is this correct for the zero-length instruction fetches which the
>> emulator emits?  It would be better to make it safe now, than to
>> introduce a latent bug in the future.
> The left side of the || unconditionally fails such fetches, and that's
> precisely what we want here (as tight a condition as possible)
> considering that this gets used only for decoding, not for executing
> instructions.

Mind adding a similar comment to patch 5 along the lines of?

/* We don't mean to emulate any branches. */

>
>>> +        return X86EMUL_UNHANDLEABLE;
>>> +
>>> +    if ( is_pv_32bit_vcpu(current) )
>>> +        addr = (uint32_t)addr;
>> This should be based on %cs attributes.  What about a 64bit PV guest in
>> a compatability segment?
> No - I now realize the conditional is pointless. We only do gate op
> emulation for 32-bit guests.

Good point.  Do we actually ASSERT() this anywhere?  It looks like this
properly is only a side effect of emulate_gate_op()'s caller.

>
>>> +
>>> +    if ( !__addr_ok(addr) ||
>>> +         (rc = __copy_from_user(p_data, (void *)addr, bytes)) )
>>> +    {
>>> +        x86_emul_pagefault(goc->insn_fetch && cpu_has_nx
>>> +                           ? PFEC_insn_fetch : 0,
>>> +                           addr + bytes - rc, ctxt);
>> Independently to the point below, the correct predicate for setting
>> PFEC_insn_fetch is ((EFER & EFER_NX) || (CR4 & SMEP))
> Remember that here we're dealing with PV guests only - those don't
> see any SMEP, and always run with NX enabled as long as we do
> ourselves.

Conceptually, this is wrong.

For PV guests, we should always base paging decisions on what is really
in EFER and CR4, not on a theoretical guests view of the world.  For
pagefaults which we decide isn't relevant to Xen, the propagated fault
will be using the real EFER and CR4.

In this case (being 32bit only) SMEP is indeed disabled in the context
of the guest, but for 64bit PV guests, Xen's SMEP should be considered,
as it will affect the guests view of hardware.  However, ...

>> However, this is subtly wrong, but I can't think of a sensible way of
>> fixing it (i.e. without doing independent pagewalks).
>>
>> __copy_from_user() does a data fetch, not an instruction fetch.
>>
>> With the advent of PKU, processors support pages which can't be read,
>> but can be executed.  Then again, our chances of ever supporting PKU for
>> PV guests is slim, so maybe this point isn't very important.
>>
>> As we actually do a data read, the error code should remain 0.  This
>> (getting a data fetch failure for something which should have been an
>> instruction fetch) will be less confusing for the guest than claiming an
>> instruction fetch failure when we actually did a data fetch, as at least
>> the error code will be correct for the access rights in the translation.
> With the above I think this should remain as is.

... Why?  The problem is still unresolved.

__copy_from_user() will successfully read a bytestream which was
protected in the pagetables by the NX bit, because it performs a data
read not an instruction fetch.

In the case of a fault occurred, Xen should report to the guest that it
performed a data read, not an instruction fetch, because that's what it
actually did.

A guest might rightly expect an instruction fetch fault if it was paying
close attention, but we already currently provide a data read failure,
and this is better than fabricating a fault of what Xen should have done
(but didn't).

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.