
Re: [Xen-devel] [PATCH RFC v1] x86/emulate: Send vm_event from emulate



On Mon, Jan 7, 2019 at 2:11 PM Alexandru Stefan ISAILA
<aisaila@xxxxxxxxxxxxxxx> wrote:
>
> This patch aims to have mem access vm events sent from the emulator.
> This is useful in the case of page-walks that have to emulate
> instructions in access denied pages.
>
> We use hvmemul_map_linear_addr() to intercept r/w access and
> hvmemul_insn_fetch() to intercept exec access.
>
> First we try to send a vm event and if the event is sent then emulation
> returns X86EMUL_ACCESS_EXCEPTION. If the event is not sent then the
> emulation goes on as expected.
>
> Signed-off-by: Alexandru Isaila <aisaila@xxxxxxxxxxxxxxx>
> ---
>  xen/arch/x86/hvm/emulate.c             | 298 +++++++++++++++++--------
>  xen/arch/x86/hvm/vm_event.c            |   2 +-
>  xen/arch/x86/mm/mem_access.c           |   4 +-
>  xen/arch/x86/x86_emulate/x86_emulate.h |   1 +
>  xen/include/asm-x86/hvm/emulate.h      |   4 +-
>  5 files changed, 212 insertions(+), 97 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 2d02ef1521..f43aed379b 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -15,6 +15,7 @@
>  #include <xen/paging.h>
>  #include <xen/trace.h>
>  #include <xen/vm_event.h>
> +#include <xen/monitor.h>
>  #include <asm/event.h>
>  #include <asm/i387.h>
>  #include <asm/xstate.h>
> @@ -26,6 +27,8 @@
>  #include <asm/hvm/support.h>
>  #include <asm/hvm/svm/svm.h>
>  #include <asm/vm_event.h>
> +#include <asm/altp2m.h>
> +#include "../mm/mm-locks.h"
>
>  static void hvmtrace_io_assist(const ioreq_t *p)
>  {
> @@ -530,6 +533,157 @@ static int hvmemul_do_mmio_addr(paddr_t mmio_gpa,
>      return hvmemul_do_io_addr(1, mmio_gpa, reps, size, dir, df, ram_gpa);
>  }
>
> +static bool hvmemul_send_vm_event(paddr_t gpa, unsigned long gla, gfn_t gfn,
> +                                  uint32_t pfec, struct hvm_emulate_ctxt *ctxt)
> +{
> +    p2m_access_t access = p2m_access_n;
> +    struct p2m_domain *p2m = NULL;

No need to initialize to NULL or to p2m_access_n.
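
I.e. (assuming neither is read before being set) the plain declarations
should do:

    p2m_access_t access;
    struct p2m_domain *p2m;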

> +    vm_event_request_t req = {};
> +    p2m_type_t p2mt;
> +    mfn_t mfn;
> +
> +    if ( !ctxt->send_event || !pfec )
> +        return false;
> +
> +    p2m = p2m_get_hostp2m(current->domain);
> +    if ( altp2m_active(current->domain) )
> +        p2m = p2m_get_altp2m(current);
> +    if ( !p2m )
> +        p2m = p2m_get_hostp2m(current->domain);
> +
> +    gfn_lock(p2m, gfn, 0);
> +    mfn = p2m->get_entry(p2m, gfn, &p2mt, &access, 0, NULL, NULL);
> +    gfn_unlock(p2m, gfn, 0);

Don't you need to keep the gfn locked for at least the duration of the
function? Otherwise what you put in the req struct below might not be
accurate by the time you write it. Maybe this is not relevant because
the req ends up queued in an async ring, so by the time the other end
reads the request the information might have changed anyway.

Also, I'm wondering whether there's a helper to fetch the gfn entry
information from the p2m when altp2m is in use. The above construct (or
variations of it) must be fairly common in the altp2m code?
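
FWIW, what I had in mind is something along these lines (completely
untested), keeping the lock until the request fields are filled:

    gfn_lock(p2m, gfn, 0);
    mfn = p2m->get_entry(p2m, gfn, &p2mt, &access, 0, NULL, NULL);

    if ( mfn_eq(mfn, INVALID_MFN) )
    {
        gfn_unlock(p2m, gfn, 0);
        return false;
    }

    /* ... compute req.u.mem_access.flags from 'access' and 'pfec' ... */

    gfn_unlock(p2m, gfn, 0);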

> +
> +    if ( mfn_eq(mfn, INVALID_MFN) )
> +        return false;
> +
> +    switch ( access ) {
> +    case p2m_access_x:
> +    case p2m_access_rx:
> +        if ( pfec & PFEC_write_access )
> +            req.u.mem_access.flags = MEM_ACCESS_R | MEM_ACCESS_W;
> +        break;

Newline.

> +    case p2m_access_w:
> +    case p2m_access_rw:
> +        if ( pfec & PFEC_insn_fetch )
> +            req.u.mem_access.flags = MEM_ACCESS_X;
> +        break;

Newline.

> +    case p2m_access_r:
> +    case p2m_access_n:
> +        if ( pfec & PFEC_write_access )
> +            req.u.mem_access.flags |= MEM_ACCESS_R | MEM_ACCESS_W;
> +        if ( pfec & PFEC_insn_fetch )
> +            req.u.mem_access.flags |= MEM_ACCESS_X;
> +        break;

Newline.

> +    default:
> +        return false;
> +    }

I'm not sure the switch is needed: you can't get a PFEC_write_access
violation, for example, if the p2m access is p2m_access_w or
p2m_access_rw, so it seems this could be simplified to take only the
pfec into account?
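
I.e. unless I'm missing something the whole switch could become
(untested):

    if ( pfec & PFEC_write_access )
        req.u.mem_access.flags |= MEM_ACCESS_R | MEM_ACCESS_W;
    if ( pfec & PFEC_insn_fetch )
        req.u.mem_access.flags |= MEM_ACCESS_X;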

> +
> +    if ( !req.u.mem_access.flags )
> +        return false; //no violation

Wrong comment style, this should be something like /* No violation. */

> +
> +    req.reason = VM_EVENT_REASON_MEM_ACCESS;
> +    req.u.mem_access.gfn = gfn_x(gfn);
> +    req.u.mem_access.flags |= MEM_ACCESS_FAULT_WITH_GLA | MEM_ACCESS_GLA_VALID;
> +    req.u.mem_access.gla = gla;
> +    req.u.mem_access.offset = gpa & ((1 << PAGE_SHIFT) - 1);
> +
> +    if ( monitor_traps(current, true, &req) < 0 )
> +        return false;
> +
> +    return true;

I think you can simplify this to:

return monitor_traps(current, true, &req) >= 0;

> +}
> +
> +/*
> + * Convert addr from linear to physical form, valid over the range
> + * [addr, addr + *reps * bytes_per_rep]. *reps is adjusted according to
> + * the valid computed range. It is always >0 when X86EMUL_OKAY is returned.
> + * @pfec indicates the access checks to be performed during page-table walks.
> +*/
> +static int hvmemul_linear_to_phys(
> +    unsigned long addr,
> +    paddr_t *paddr,
> +    unsigned int bytes_per_rep,
> +    unsigned long *reps,
> +    uint32_t pfec,
> +    struct hvm_emulate_ctxt *hvmemul_ctxt)
> +{

You seem to use two different coding styles for function definitions in
the same file; please stick to a single one that matches the
surrounding style.
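
E.g. hvmemul_send_vm_event could use the same layout as the function
below (just an illustration, either style is fine as long as it's
consistent):

static bool hvmemul_send_vm_event(
    paddr_t gpa,
    unsigned long gla,
    gfn_t gfn,
    uint32_t pfec,
    struct hvm_emulate_ctxt *ctxt)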

> +    struct vcpu *curr = current;
> +    unsigned long pfn, npfn, done, todo, i, offset = addr & ~PAGE_MASK;
> +    int reverse;
> +
> +    /*
> +     * Clip repetitions to a sensible maximum. This avoids extensive looping in
> +     * this function while still amortising the cost of I/O trap-and-emulate.
> +     */
> +    *reps = min_t(unsigned long, *reps, 4096);
> +
> +    /* With no paging it's easy: linear == physical. */
> +    if ( !(curr->arch.hvm.guest_cr[0] & X86_CR0_PG) )
> +    {
> +        *paddr = addr;
> +        return X86EMUL_OKAY;
> +    }
> +
> +    /* Reverse mode if this is a backwards multi-iteration string operation. */
> +    reverse = (hvmemul_ctxt->ctxt.regs->eflags & X86_EFLAGS_DF) && (*reps > 1);
> +
> +    if ( reverse && ((PAGE_SIZE - offset) < bytes_per_rep) )
> +    {
> +        /* Do page-straddling first iteration forwards via recursion. */
> +        paddr_t _paddr;
> +        unsigned long one_rep = 1;
> +        int rc = hvmemul_linear_to_phys(
> +            addr, &_paddr, bytes_per_rep, &one_rep, pfec, hvmemul_ctxt);

Newline between variable declaration and code.

> +        if ( rc != X86EMUL_OKAY )
> +            return rc;
> +        pfn = _paddr >> PAGE_SHIFT;
> +    }
> +    else if ( (pfn = paging_gva_to_gfn(curr, addr, &pfec)) == gfn_x(INVALID_GFN) )
> +    {
> +        if ( pfec & (PFEC_page_paged | PFEC_page_shared) )
> +            return X86EMUL_RETRY;
> +        *reps = 0;
> +        x86_emul_pagefault(pfec, addr, &hvmemul_ctxt->ctxt);
> +        return X86EMUL_EXCEPTION;
> +    }
> +
> +    done = reverse ? bytes_per_rep + offset : PAGE_SIZE - offset;
> +    todo = *reps * bytes_per_rep;
> +    for ( i = 1; done < todo; i++ )
> +    {
> +        /* Get the next PFN in the range. */
> +        addr += reverse ? -PAGE_SIZE : PAGE_SIZE;
> +        npfn = paging_gva_to_gfn(curr, addr, &pfec);
> +
> +        /* Is it contiguous with the preceding PFNs? If not then we're done. */
> +        if ( (npfn == gfn_x(INVALID_GFN)) ||
> +             (npfn != (pfn + (reverse ? -i : i))) )
> +        {
> +            if ( pfec & (PFEC_page_paged | PFEC_page_shared) )
> +                return X86EMUL_RETRY;
> +            done /= bytes_per_rep;
> +            if ( done == 0 )
> +            {
> +                ASSERT(!reverse);
> +                if ( npfn != gfn_x(INVALID_GFN) )
> +                    return X86EMUL_UNHANDLEABLE;
> +                *reps = 0;
> +                x86_emul_pagefault(pfec, addr & PAGE_MASK, &hvmemul_ctxt->ctxt);
> +                return X86EMUL_EXCEPTION;
> +            }
> +            *reps = done;
> +            break;
> +        }
> +
> +        done += PAGE_SIZE;
> +    }
> +
> +    *paddr = ((paddr_t)pfn << PAGE_SHIFT) | offset;
> +    return X86EMUL_OKAY;
> +}
> +
>  /*
>   * Map the frame(s) covering an individual linear access, for writeable
>   * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
> @@ -577,6 +731,7 @@ static void *hvmemul_map_linear_addr(
>          pagefault_info_t pfinfo;
>          p2m_type_t p2mt;
> +        unsigned long addr = i ? (linear + (i << PAGE_SHIFT)) & PAGE_MASK : linear;
> +        gfn_t gfn;
>
>          if ( hvmemul_ctxt->ctxt.addr_size < 64 )
>              addr = (uint32_t)addr;
> @@ -585,7 +740,7 @@ static void *hvmemul_map_linear_addr(
>          ASSERT(mfn_x(*mfn) == 0);
>
>          res = hvm_translate_get_page(curr, addr, true, pfec,
> -                                     &pfinfo, &page, NULL, &p2mt);
> +                                     &pfinfo, &page, &gfn, &p2mt);
>
>          switch ( res )
>          {
> @@ -615,6 +770,23 @@ static void *hvmemul_map_linear_addr(
>
>          if ( pfec & PFEC_write_access )
>          {
> +            unsigned long reps = 1;
> +            struct hvm_emulate_ctxt old;
> +            int rc = 0;
> +            paddr_t gpa;
> +
> +            old = *hvmemul_ctxt;
> +            rc = hvmemul_linear_to_phys(
> +                 addr, &gpa, bytes, &reps, pfec, hvmemul_ctxt);
> +            if ( rc == X86EMUL_EXCEPTION )
> +                *hvmemul_ctxt = old;
> +
> +            if ( hvmemul_send_vm_event(gpa, addr, gfn, pfec, hvmemul_ctxt) )
> +            {
> +                err = ERR_PTR(~X86EMUL_ACCESS_EXCEPTION);
> +                goto out;
> +            }

It seems to me that you could send the VM event in
hvmemul_linear_to_phys directly instead of doing it in the caller. The
same pattern appears further down in the patch.
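
E.g. the tail of hvmemul_linear_to_phys could do something like
(completely untested, and the caller would then need to handle the new
return value):

    *paddr = ((paddr_t)pfn << PAGE_SHIFT) | offset;

    /* NB: for multi-rep accesses 'addr' may need adjusting here. */
    if ( hvmemul_send_vm_event(*paddr, addr, _gfn(pfn), pfec, hvmemul_ctxt) )
    {
        *reps = 0;
        return X86EMUL_ACCESS_EXCEPTION;
    }

    return X86EMUL_OKAY;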

> +
>              if ( p2m_is_discard_write(p2mt) )
>              {
>                  err = ERR_PTR(~X86EMUL_OKAY);
> @@ -694,96 +866,6 @@ static void hvmemul_unmap_linear_addr(
>  #endif
>  }
>
> -/*
> - * Convert addr from linear to physical form, valid over the range
> - * [addr, addr + *reps * bytes_per_rep]. *reps is adjusted according to
> - * the valid computed range. It is always >0 when X86EMUL_OKAY is returned.
> - * @pfec indicates the access checks to be performed during page-table walks.
> - */
> -static int hvmemul_linear_to_phys(
> -    unsigned long addr,
> -    paddr_t *paddr,
> -    unsigned int bytes_per_rep,
> -    unsigned long *reps,
> -    uint32_t pfec,
> -    struct hvm_emulate_ctxt *hvmemul_ctxt)

If you need to move code around, please do it in a pre-patch without
introducing any changes. That way it's much easier for reviewers to
identify and review your code changes.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel