Xen project Mailing List

[Xen-devel] [PATCH v10] x86/emulate: Send vm_event from emulate

To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: Alexandru Stefan ISAILA <aisaila@xxxxxxxxxxxxxxx>

Date: Mon, 16 Sep 2019 08:10:38 +0000

Cc: Petre Ovidiu PIRCALABU <ppircalabu@xxxxxxxxxxxxxxx>, "tamas@xxxxxxxxxxxxx" <tamas@xxxxxxxxxxxxx>, "wl@xxxxxxx" <wl@xxxxxxx>, Razvan COJOCARU <rcojocaru@xxxxxxxxxxxxxxx>, "george.dunlap@xxxxxxxxxxxxx" <george.dunlap@xxxxxxxxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>, "paul.durrant@xxxxxxxxxx" <paul.durrant@xxxxxxxxxx>, "jbeulich@xxxxxxxx" <jbeulich@xxxxxxxx>, Alexandru Stefan ISAILA <aisaila@xxxxxxxxxxxxxxx>, "roger.pau@xxxxxxxxxx" <roger.pau@xxxxxxxxxx>

Delivery-date: Mon, 16 Sep 2019 08:10:47 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

A/D bit writes (on page walks) can be considered benign by an introspection
agent, so receiving vm_events for them is a pessimization. We try here to
optimize by filtering these events out.

Currently, we fully emulate the instruction at RIP when the hardware sees
an EPT fault with npfec.kind != npfec_kind_with_gla. This is, however,
incorrect, because the instruction at RIP might legitimately cause an
EPT fault of its own while accessing a _different_ page from the original
one, where A/D were set.

The solution is to perform the whole emulation, ignoring EPT restrictions
for the walk part and taking them into account for the "actual" emulation of
the instruction at RIP. When we send out a vm_event, we don't want the
emulation to complete, since in that case we won't be able to veto whatever
it is doing. That would mean that we can't actually prevent any malicious
activity; we'd only be able to report on it.

When we see a "send-vm_event" case while emulating, we need to first send the
event out and then suspend the emulation (return X86EMUL_RETRY).
Once the emulation stops, hvm_vm_event_do_resume() is called again after the
introspection agent handles the event and resumes the guest. There, the
instruction at RIP will be fully emulated (with the EPT ignored) if the
introspection application allows it, and the guest will continue to run past
the instruction.

A common example: when the hardware exits because of an EPT fault caused by a
page walk, p2m_mem_access_check() decides whether to send a vm_event.
If the vm_event is sent and handled such that the instruction at RIP runs,
that instruction might also hit a protected page and provoke a vm_event.

Now, if npfec.kind == npfec_kind_in_gpt and
d->arch.monitor.inguest_pagefault_disabled is true, then we are in the page
walk case and we can apply this optimization: emulate the page walk while
ignoring the EPT, but don't ignore the EPT for the emulation of the actual
instruction.

In the first case we would have 2 EPT events; in the second case we would
have 1 EPT event, and only if the instruction at RIP triggers an EPT event.

We use hvmemul_map_linear_addr() to intercept write access and
__hvm_copy() to intercept exec and read access.

hvm_monitor_check_p2m() can return false if there was no violation, or
if there was an error from monitor_traps() or p2m_get_mem_access().
Returning false if p2m_get_mem_access() fails is needed because the EPT
entry will have rwx memory access rights.

NOTE: hvm_monitor_check_p2m() assumes the caller will check
arch.vm_event->send_event

Signed-off-by: Alexandru Isaila <aisaila@xxxxxxxxxxxxxxx>

---
Changes since V9:
        - Remove the changes caused by moving the "goto" out of the
loop in hvmemul_map_linear_addr().
        - Update comment and commit message
        - Change function name to hvm_monitor_check_p2m().
---
 xen/arch/x86/hvm/emulate.c        | 11 ++++-
 xen/arch/x86/hvm/hvm.c            |  8 ++++
 xen/arch/x86/hvm/monitor.c        | 75 +++++++++++++++++++++++++++++++
 xen/arch/x86/mm/mem_access.c      |  8 +++-
 xen/include/asm-x86/hvm/monitor.h |  3 ++
 xen/include/asm-x86/vm_event.h    |  2 +
 6 files changed, 105 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 36bcb526d3..22c85937ad 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -548,6 +548,7 @@ static void *hvmemul_map_linear_addr(
     unsigned int nr_frames = ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
         (linear >> PAGE_SHIFT) + 1;
     unsigned int i;
+    gfn_t gfn;
 
     /*
      * mfn points to the next free slot.  All used slots have a page reference
@@ -582,7 +583,7 @@ static void *hvmemul_map_linear_addr(
         ASSERT(mfn_x(*mfn) == 0);
 
         res = hvm_translate_get_page(curr, addr, true, pfec,
-                                     &pfinfo, &page, NULL, &p2mt);
+                                     &pfinfo, &page, &gfn, &p2mt);
 
         switch ( res )
         {
@@ -626,6 +627,14 @@ static void *hvmemul_map_linear_addr(
 
             ASSERT(p2mt == p2m_ram_logdirty || !p2m_is_readonly(p2mt));
         }
+
+        if ( unlikely(curr->arch.vm_event) &&
+             curr->arch.vm_event->send_event &&
+             hvm_monitor_check_p2m(addr, gfn, pfec, npfec_kind_with_gla) )
+        {
+            err = ERR_PTR(~X86EMUL_RETRY);
+            goto out;
+        }
     }
 
     /* Entire access within a single frame? */
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 452ac4833d..195a07c64d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3224,6 +3224,14 @@ static enum hvm_translation_result __hvm_copy(
             return HVMTRANS_bad_gfn_to_mfn;
         }
 
+        if ( unlikely(v->arch.vm_event) &&
+             v->arch.vm_event->send_event &&
+             hvm_monitor_check_p2m(addr, gfn, pfec, npfec_kind_with_gla) )
+        {
+            put_page(page);
+            return HVMTRANS_gfn_paged_out;
+        }
+
         p = (char *)__map_domain_page(page) + (addr & ~PAGE_MASK);
 
         if ( flags & HVMCOPY_to_guest )
diff --git a/xen/arch/x86/hvm/monitor.c b/xen/arch/x86/hvm/monitor.c
index 2a41ccc930..8c9d2284d1 100644
--- a/xen/arch/x86/hvm/monitor.c
+++ b/xen/arch/x86/hvm/monitor.c
@@ -23,8 +23,10 @@
  */
 
 #include <xen/vm_event.h>
+#include <xen/mem_access.h>
 #include <xen/monitor.h>
 #include <asm/hvm/monitor.h>
+#include <asm/altp2m.h>
 #include <asm/monitor.h>
 #include <asm/paging.h>
 #include <asm/vm_event.h>
@@ -215,6 +217,79 @@ void hvm_monitor_interrupt(unsigned int vector, unsigned int type,
     monitor_traps(current, 1, &req);
 }
 
+/*
+ * Send memory access vm_events based on pfec. Returns true if the event was
+ * sent and false for p2m_get_mem_access() error, no violation and event send
+ * error. Assumes the caller will check arch.vm_event->send_event.
+ *
+ * NOTE: p2m_get_mem_access() can fail if the entry was not found in the EPT
+ * (in which case access to it is unrestricted, so no violations can occur).
+ * In this cases it is fine to continue the emulation.
+ */
+bool hvm_monitor_check_p2m(unsigned long gla, gfn_t gfn, uint32_t pfec,
+                           uint16_t kind)
+{
+    xenmem_access_t access;
+    vm_event_request_t req = {};
+    paddr_t gpa = (gfn_to_gaddr(gfn) | (gla & ~PAGE_MASK));
+
+    ASSERT(current->arch.vm_event->send_event);
+
+    current->arch.vm_event->send_event = false;
+
+    if ( p2m_get_mem_access(current->domain, gfn, &access,
+                            altp2m_vcpu_idx(current)) != 0 )
+        return false;
+
+    switch ( access )
+    {
+    case XENMEM_access_x:
+    case XENMEM_access_rx:
+        if ( pfec & PFEC_write_access )
+            req.u.mem_access.flags = MEM_ACCESS_R | MEM_ACCESS_W;
+        break;
+
+    case XENMEM_access_w:
+    case XENMEM_access_rw:
+        if ( pfec & PFEC_insn_fetch )
+            req.u.mem_access.flags = MEM_ACCESS_X;
+        break;
+
+    case XENMEM_access_r:
+    case XENMEM_access_n:
+        if ( pfec & PFEC_write_access )
+            req.u.mem_access.flags |= MEM_ACCESS_R | MEM_ACCESS_W;
+        if ( pfec & PFEC_insn_fetch )
+            req.u.mem_access.flags |= MEM_ACCESS_X;
+        break;
+
+    case XENMEM_access_wx:
+    case XENMEM_access_rwx:
+    case XENMEM_access_rx2rw:
+    case XENMEM_access_n2rwx:
+    case XENMEM_access_default:
+        break;
+    }
+
+    if ( !req.u.mem_access.flags )
+        return false; /* no violation */
+
+    if ( kind == npfec_kind_with_gla )
+        req.u.mem_access.flags |= MEM_ACCESS_FAULT_WITH_GLA |
+                                  MEM_ACCESS_GLA_VALID;
+    else if ( kind == npfec_kind_in_gpt )
+        req.u.mem_access.flags |= MEM_ACCESS_FAULT_IN_GPT |
+                                  MEM_ACCESS_GLA_VALID;
+
+
+    req.reason = VM_EVENT_REASON_MEM_ACCESS;
+    req.u.mem_access.gfn = gfn_x(gfn);
+    req.u.mem_access.gla = gla;
+    req.u.mem_access.offset = gpa & ~PAGE_MASK;
+
+    return monitor_traps(current, true, &req) >= 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/mem_access.c b/xen/arch/x86/mm/mem_access.c
index 0144f92b98..94c7f2a80c 100644
--- a/xen/arch/x86/mm/mem_access.c
+++ b/xen/arch/x86/mm/mem_access.c
@@ -210,10 +210,16 @@ bool p2m_mem_access_check(paddr_t gpa, unsigned long gla,
             return true;
         }
     }
+
+    /*
+     * Try to avoid sending a mem event. Suppress events caused by page-walks
+     * by emulating but still checking mem_access violations.
+     */
     if ( vm_event_check_ring(d->vm_event_monitor) &&
          d->arch.monitor.inguest_pagefault_disabled &&
-         npfec.kind != npfec_kind_with_gla ) /* don't send a mem_event */
+         npfec.kind == npfec_kind_in_gpt )
     {
+        v->arch.vm_event->send_event = true;
         hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op, X86_EVENT_NO_EC);
 
         return true;
diff --git a/xen/include/asm-x86/hvm/monitor.h b/xen/include/asm-x86/hvm/monitor.h
index f1af4f812a..325b44674d 100644
--- a/xen/include/asm-x86/hvm/monitor.h
+++ b/xen/include/asm-x86/hvm/monitor.h
@@ -49,6 +49,9 @@ void hvm_monitor_interrupt(unsigned int vector, unsigned int type,
                            unsigned int err, uint64_t cr2);
 bool hvm_monitor_emul_unimplemented(void);
 
+bool hvm_monitor_check_p2m(unsigned long gla, gfn_t gfn, uint32_t pfec,
+                           uint16_t kind);
+
 #endif /* __ASM_X86_HVM_MONITOR_H__ */
 
 /*
diff --git a/xen/include/asm-x86/vm_event.h b/xen/include/asm-x86/vm_event.h
index 23e655710b..66db9e1e25 100644
--- a/xen/include/asm-x86/vm_event.h
+++ b/xen/include/asm-x86/vm_event.h
@@ -36,6 +36,8 @@ struct arch_vm_event {
     bool set_gprs;
     /* A sync vm_event has been sent and we're not done handling it. */
     bool sync_event;
+    /* Send mem access events from emulator */
+    bool send_event;
 };
 
 int vm_event_init_domain(struct domain *d);
-- 
2.17.1

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
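
For readers who want to try the violation check outside the hypervisor, the switch in hvm_monitor_check_p2m() can be modelled as a small standalone function. This is only an illustrative sketch, not part of the patch: the enum values and the FAULT_* / VIOL_* macros are hypothetical stand-ins for Xen's XENMEM_access_*, PFEC_* and MEM_ACCESS_* definitions, and the rx2rw, n2rwx and default access types are left out for brevity. As in the patch, only write and instruction-fetch faults are classified.

```c
#include <stdio.h>

/* Hypothetical stand-ins for Xen's XENMEM_access_* permission values. */
enum access { ACC_N, ACC_R, ACC_W, ACC_RW, ACC_X, ACC_RX, ACC_WX, ACC_RWX };

/* Hypothetical stand-ins for PFEC_write_access / PFEC_insn_fetch. */
#define FAULT_WRITE (1u << 0)
#define FAULT_FETCH (1u << 1)

/* Hypothetical stand-ins for MEM_ACCESS_R / MEM_ACCESS_W / MEM_ACCESS_X. */
#define VIOL_R (1u << 0)
#define VIOL_W (1u << 1)
#define VIOL_X (1u << 2)

/*
 * Return the violation flags for a fault of type 'fault' on a page whose
 * permitted access is 'access'. A result of 0 means "no violation", which
 * corresponds to the "return false" path in the patch.
 */
unsigned int violation_flags(enum access access, unsigned int fault)
{
    unsigned int flags = 0;

    switch ( access )
    {
    case ACC_X:
    case ACC_RX:
        /* Writes are not permitted. */
        if ( fault & FAULT_WRITE )
            flags = VIOL_R | VIOL_W;
        break;

    case ACC_W:
    case ACC_RW:
        /* Instruction fetches are not permitted. */
        if ( fault & FAULT_FETCH )
            flags = VIOL_X;
        break;

    case ACC_R:
    case ACC_N:
        /* Neither writes nor instruction fetches are permitted. */
        if ( fault & FAULT_WRITE )
            flags |= VIOL_R | VIOL_W;
        if ( fault & FAULT_FETCH )
            flags |= VIOL_X;
        break;

    case ACC_WX:
    case ACC_RWX:
        /* Writes and fetches are both permitted: nothing to report. */
        break;
    }

    return flags;
}
```

In this model a write to an rx page yields VIOL_R | VIOL_W, while an instruction fetch from the same page yields 0, so under the patch's logic the emulation would continue without sending an event.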
