[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC V8 4/5] xen, libxc: Request page fault injection via libxc



> From: Razvan Cojocaru [mailto:rcojocaru@xxxxxxxxxxxxxxx]
> Sent: Wednesday, August 27, 2014 7:02 AM
> 
> Added new XEN_DOMCTL_set_pagefault_info hypercall, used by libxc's
> new xc_domain_set_pagefault_info() function to set per-domain page
> fault injection information. All a call does is set per-domain info,
> and nothing actually happens until VMENTRY time, and then only if
> all conditions are met (the guest is in user mode, the set value
> matches CR3, and there are no other pending traps).
> This mechanism allows bringing in swapped-out pages for inspection.

Is there any impact on the whole introspection process, regarding to when
next VM exit on the inspected address space actually comes? 

btw how is the whole flow of inspecting swapped-out pages with this
interface?

> 
> Changes since V2:
>  - Removed superfluous memset(&ctxt, 0, sizeof(struct hvm_hw_cpu)).
>  - Now checking SS.DPL instead of CS.DPL to see if the guest is in
>    user mode.
>  - Removed superfluous fault_info members initialization.
>  - Now returning -EINVAL if !has_hvm_container_domain(d) on
>    XEN_DOMCTL_set_pagefault_info.
>  - Moved struct fault_info from struct domain to struct hvm_domain.
>  - Collapsed explicit initialization of the fault_info members into a
>    one-line assignment on the tools/libxc side.
>  - Now using write_access as a proper bool instead of shifting and
>    OR-ing it with PFEC_user_mode.
> 
> Changes since V3:
>  - Now checking that is_hvm_domain(d) in do_domctl() (consistent with
>    the test in the actual page fault injection code).
>  - XEN_DOMCTL_set_pagefault_info now extends the proper range.
>  - Added a likely() around the
>    d->arch.hvm_domain.fault_info.virtual_address == 0 check.
>  - Replaced hvm_funcs.save_cpu_ctxt() function call with lighter code.
>  - Split the vmx.c page fault injection function into a check
>    function and an inject function.
> 
> Changes since V4:
>  - Added "valid" flag instead of relying on virtual address 0 as a
>    special case.
>  - Using #defines instead of magic constants in vmx_vmcs_save().
>  - Padded struct xen_domctl_set_pagefault_info.
>  - Coding style and readability changes.
> 
> Changes since V5:
>  - Renamed "struct domain *d" to "currd" where appropriate.
>  - Using ! to test bool_t variables.
>  - Replaced hvm_get_segment_register() with
>    vmx_get_segment_register() in vmx.c.
>  - Removed double blank line.
>  - Proper comparison of CR3 and address_space (taking into account
>    LM, PAE).
>  - Using domain variable already set up in do_domctl().
>  - Removed optional parentheses.
>  - Removed d->arch.hvm_domain.fault_info reset (only the "valid"
>    flag counts anyway).
> 
> Changes since V6:
>  - Removed the write_access flag, now passing the whole error code
>    (still OR-ed with PFEC_user_mode).
>  - Replaced hex constants with #defined macros.
>  - Minor code / coding style cleanup.
> 
> Changes since V7:
>  - Removed pointless parentheses and curly braces.
>  - Renamed xc_domain_set_pagefault_info() to
>    xc_domain_request_pagefault().
> 
> Signed-off-by: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>
> Acked-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> ---
>  tools/libxc/xc_domain.c          |   16 +++++++++++
>  tools/libxc/xenctrl.h            |    4 +++
>  xen/arch/x86/hvm/vmx/vmx.c       |   56
> ++++++++++++++++++++++++++++++++++++++
>  xen/common/domctl.c              |   17 ++++++++++++
>  xen/include/asm-x86/hvm/domain.h |    8 ++++++
>  xen/include/public/domctl.h      |   17 ++++++++++++
>  6 files changed, 118 insertions(+)
> 
> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> index c67ac9a..3fb03be 100644
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -507,6 +507,22 @@ int xc_domain_hvm_setcontext(xc_interface *xch,
>      return ret;
>  }
> 
> +int xc_domain_request_pagefault(xc_interface *xch,
> +                                uint32_t domid,
> +                                xen_domctl_request_pagefault_info_t
> *info)
> +{
> +    DECLARE_DOMCTL;
> +
> +    if (info == NULL)
> +        return -1;
> +
> +    domctl.cmd = XEN_DOMCTL_request_pagefault;
> +    domctl.domain = (domid_t)domid;
> +    domctl.u.request_pagefault_info = *info;
> +
> +    return do_domctl(xch, &domctl);
> +}
> +
>  int xc_vcpu_getcontext(xc_interface *xch,
>                         uint32_t domid,
>                         uint32_t vcpu,
> diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
> index 28b5562..9968e83 100644
> --- a/tools/libxc/xenctrl.h
> +++ b/tools/libxc/xenctrl.h
> @@ -803,6 +803,10 @@ int xc_domain_hvm_setcontext(xc_interface *xch,
>  const char *xc_domain_get_native_protocol(xc_interface *xch,
>                                            uint32_t domid);
> 
> +int xc_domain_request_pagefault(xc_interface *xch,
> +                                uint32_t domid,
> +                                xen_domctl_request_pagefault_info_t
> *info);
> +
>  /**
>   * This function returns information about the execution context of a
>   * particular vcpu of a domain.
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 40e6c2c..0152f4e 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3107,6 +3107,59 @@ out:
>          nvmx_idtv_handling();
>  }
> 
> +static bool_t vmx_check_pf_injection(void)
> +{

aside from structure name pointed earlier, above and later function names
are also too generic while their purpose is very introspection specific. Leaving
them this way will confuse other developers in vmx module in the future.

> +    struct vcpu *curr = current;
> +    const struct domain *currd = curr->domain;
> +    struct segment_register seg;
> +    unsigned long ev;
> +    uint32_t pending_event = 0;
> +    uint64_t mask;
> +
> +    if ( !is_hvm_domain(currd) ||
> +         likely(!currd->arch.hvm_domain.fault_info.valid) )
> +        return 0;
> +
> +    vmx_get_segment_register(curr, x86_seg_ss, &seg);
> +
> +    if ( seg.attr.fields.dpl != 3 ) /* Guest is not in user mode */
> +        return 0;
> +
> +    if ( hvm_long_mode_enabled(curr) )
> +        mask = PADDR_MASK & PAGE_MASK; /* Bits 51:12. */
> +    else if ( hvm_pae_enabled(curr) )
> +        mask = 0x00000000ffffffe0; /* Bits 31:5. */
> +    else
> +        mask = (uint32_t)PAGE_MASK; /* Bits 31:12. */
> +
> +    if ( (curr->arch.hvm_vcpu.guest_cr[3] & mask) !=
> +         (currd->arch.hvm_domain.fault_info.address_space & mask) )
> +        return 0;
> +
> +    vmx_vmcs_enter(curr);
> +    __vmread(VM_ENTRY_INTR_INFO, &ev);
> +    vmx_vmcs_exit(curr);
> +
> +    if ( (ev & INTR_INFO_VALID_MASK) &&
> +         hvm_event_needs_reinjection(MASK_EXTR(ev,
> INTR_INFO_INTR_TYPE_MASK),
> +                                     ev &
> INTR_INFO_VECTOR_MASK) )
> +        pending_event = ev;
> +
> +    return pending_event == 0;
> +}
> +
> +static void vmx_inject_pf(void)
> +{
> +    struct vcpu *curr = current;
> +    struct domain *currd = curr->domain;
> +    uint64_t virtual_address =
> currd->arch.hvm_domain.fault_info.virtual_address;
> +
> +    currd->arch.hvm_domain.fault_info.valid = 0;

Do you need handle the race with next coming fault injection hypercall?

Is the memory introspection caring about how many page faults are injected
at one time? two vcpus may both enter the same address space, and trigger
the VM exit at the same time, then it's non-deterministic whether both vcpus
will receive a page fault.

> +
> +    hvm_inject_page_fault(currd->arch.hvm_domain.fault_info.errcode |
> +                          PFEC_user_mode, virtual_address);
> +}
> +
>  void vmx_vmenter_helper(const struct cpu_user_regs *regs)
>  {
>      struct vcpu *curr = current;
> @@ -3147,6 +3200,9 @@ void vmx_vmenter_helper(const struct
> cpu_user_regs *regs)
>      if ( unlikely(need_flush) )
>          vpid_sync_all();
> 
> +    if ( vmx_check_pf_injection() )
> +        vmx_inject_pf();
> +
>   out:
>      HVMTRACE_ND(VMENTRY, 0, 1/*cycles*/, 0, 0, 0, 0, 0, 0, 0);
> 
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index c326aba..75ff6c6 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -967,6 +967,23 @@ long
> do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>      }
>      break;
> 
> +    case XEN_DOMCTL_request_pagefault:
> +        ret = -EINVAL;
> +
> +        if ( is_hvm_domain(d) )
> +        {
> +            d->arch.hvm_domain.fault_info.address_space =
> +                op->u.request_pagefault_info.address_space;
> +            d->arch.hvm_domain.fault_info.virtual_address =
> +                op->u.request_pagefault_info.virtual_address;
> +            d->arch.hvm_domain.fault_info.errcode =
> +                op->u.request_pagefault_info.errcode;
> +            d->arch.hvm_domain.fault_info.valid = 1;
> +
> +            ret = 0;
> +        }
> +        break;
> +
>      default:
>          ret = arch_do_domctl(op, d, u_domctl);
>          break;
> diff --git a/xen/include/asm-x86/hvm/domain.h
> b/xen/include/asm-x86/hvm/domain.h
> index 30d4aa3..30956e4 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -142,6 +142,14 @@ struct hvm_domain {
>       */
>      uint64_t sync_tsc;
> 
> +    /* Memory introspection page fault injection data. */
> +    struct {
> +        uint64_t address_space;
> +        uint64_t virtual_address;
> +        uint32_t errcode;
> +        bool_t valid;
> +    } fault_info;
> +
>      union {
>          struct vmx_domain vmx;
>          struct svm_domain svm;
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index a3431ec..69224be 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -937,6 +937,21 @@ typedef struct xen_domctl_vcpu_msrs
> xen_domctl_vcpu_msrs_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_vcpu_msrs_t);
>  #endif
> 
> +/*
> + * XEN_DOMCTL_set_pagefault_info requests that a page fault occur at
> + * the next VMENTRY.
> + */
> +struct xen_domctl_request_pagefault_info {
> +    uint64_t address_space;
> +    uint64_t virtual_address;
> +    uint32_t errcode;
> +    uint32_t _pad;
> +};
> +typedef struct xen_domctl_request_pagefault_info
> +    xen_domctl_request_pagefault_info_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_request_pagefault_info_t);
> +
> +
>  struct xen_domctl {
>      uint32_t cmd;
>  #define XEN_DOMCTL_createdomain                   1
> @@ -1009,6 +1024,7 @@ struct xen_domctl {
>  #define XEN_DOMCTL_cacheflush                    71
>  #define XEN_DOMCTL_get_vcpu_msrs                 72
>  #define XEN_DOMCTL_set_vcpu_msrs                 73
> +#define XEN_DOMCTL_request_pagefault             74
>  #define XEN_DOMCTL_gdbsx_guestmemio            1000
>  #define XEN_DOMCTL_gdbsx_pausevcpu             1001
>  #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
> @@ -1069,6 +1085,7 @@ struct xen_domctl {
>          struct xen_domctl_cacheflush        cacheflush;
>          struct xen_domctl_gdbsx_pauseunp_vcpu gdbsx_pauseunp_vcpu;
>          struct xen_domctl_gdbsx_domstatus   gdbsx_domstatus;
> +        struct xen_domctl_request_pagefault_info request_pagefault_info;
>          uint8_t                             pad[128];
>      } u;
>  };
> --
> 1.7.9.5


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.