
Re: [Xen-devel] [PATCH v13] x86/emulate: Send vm_event from emulate



> -----Original Message-----
> From: Alexandru Stefan ISAILA <aisaila@xxxxxxxxxxxxxxx>
> Sent: 23 September 2019 13:06
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Paul Durrant <Paul.Durrant@xxxxxxxxxx>; jbeulich@xxxxxxxx;
> Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>; wl@xxxxxxx;
> Roger Pau Monne <roger.pau@xxxxxxxxxx>; Razvan COJOCARU <rcojocaru@xxxxxxxxxxxxxxx>;
> tamas@xxxxxxxxxxxxx; Alexandru Stefan ISAILA <aisaila@xxxxxxxxxxxxxxx>;
> Petre Ovidiu PIRCALABU <ppircalabu@xxxxxxxxxxxxxxx>; George Dunlap <George.Dunlap@xxxxxxxxxx>
> Subject: [PATCH v13] x86/emulate: Send vm_event from emulate
> 
> A/D bit writes (on page walks) can be considered benign by an introspection
> agent, so receiving vm_events for them is a pessimization. We try here to
> optimize by filtering these events out.
> Currently, we are fully emulating the instruction at RIP when the hardware
> sees an EPT fault with npfec.kind != npfec_kind_with_gla. This is, however,
> incorrect, because the instruction at RIP might legitimately cause an
> EPT fault of its own while accessing a _different_ page from the original
> one, where the A/D bits were set.
> The solution is to perform the whole emulation while ignoring EPT
> restrictions for the walk part, and taking them into account for the
> "actual" emulation of the instruction at RIP. When we send out a vm_event,
> we don't want the emulation to complete, since in that case we won't be
> able to veto whatever the instruction is doing; we couldn't actually
> prevent any malicious activity, only report on it.
> When we see a "send-vm_event" case while emulating, we first send the
> event out and then suspend the emulation (return X86EMUL_RETRY). Once the
> introspection agent has treated the event and resumed the guest,
> hvm_vm_event_do_resume() is called; there, the instruction at RIP will be
> fully emulated (with the EPT ignored) if the introspection application
> allows it, and the guest will continue to run past the instruction.
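> 
> As a minimal sketch (using only names the hunks below introduce), the
> check in an emulation path looks like:
> 
>     /* Before touching guest memory, see if a vm_event must go out. */
>     if ( unlikely(curr->arch.vm_event) &&
>          curr->arch.vm_event->send_event &&
>          hvm_monitor_check_p2m(addr, gfn, pfec, npfec_kind_with_gla) )
>         return X86EMUL_RETRY; /* event sent; suspend the emulation */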
> 
> A common example: the hardware exits because of an EPT fault caused by a
> page walk, and p2m_mem_access_check() decides whether to send a vm_event.
> If a vm_event was sent, and handling it results in the instruction at RIP
> being run, that instruction might also hit a protected page and provoke a
> vm_event of its own.
> 
> Now, if npfec.kind == npfec_kind_in_gpt and
> d->arch.monitor.inguest_pagefault_disabled is true, then we are in the
> page-walk case and can apply this emulation optimization: emulate the page
> walk while ignoring the EPT, but do not ignore the EPT for the emulation
> of the actual instruction.
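> 
> The gate for that optimization, as a sketch of the p2m_mem_access_check()
> hunk below:
> 
>     if ( vm_event_check_ring(d->vm_event_monitor) &&
>          d->arch.monitor.inguest_pagefault_disabled &&
>          npfec.kind == npfec_kind_in_gpt )
>     {
>         /* Emulate here; mem_access violations are still checked. */
>     }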
> 
> In the first case we would get 2 EPT events; in the second case we would
> get 1 EPT event, if the instruction at RIP triggers an EPT event of its own.
> 
> We use hvmemul_map_linear_addr() to intercept write access and
> __hvm_copy() to intercept exec, read and write access.
> 
> A new return type, HVMTRANS_need_retry, was added so that all the places
> that consume HVMTRANS_* values map it to X86EMUL_RETRY.
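> 
> Each consumer of HVMTRANS_* folds the new value into its existing retry
> handling, e.g. (sketch of the pattern repeated in the hunks below):
> 
>     case HVMTRANS_gfn_paged_out:
>     case HVMTRANS_gfn_shared:
>     case HVMTRANS_need_retry:
>         return X86EMUL_RETRY;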
> 
> hvm_monitor_check_p2m() returns false if there was no violation, or if
> there was an error from monitor_traps() or p2m_get_mem_access().
> -ESRCH from p2m_get_mem_access() is treated as restricted access.
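> 
> The error handling, as a sketch mirroring the hvm_monitor_check_p2m()
> hunk below:
> 
>     rc = p2m_get_mem_access(curr->domain, gfn, &access,
>                             altp2m_vcpu_idx(curr));
>     if ( rc == -ESRCH )
>         access = XENMEM_access_n; /* invalid MFN: treat as restricted */
>     else if ( rc )
>         return false;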
> 
> NOTE: hvm_monitor_check_p2m() assumes the caller will enable/disable
> arch.vm_event->send_event
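> 
> The caller-side contract then looks like this (sketch, taken from the
> p2m_mem_access_check() hunk below):
> 
>     v->arch.vm_event->send_event = true;
>     hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op,
>                              X86_EVENT_NO_EC);
>     v->arch.vm_event->send_event = false;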
> 
> Signed-off-by: Alexandru Isaila <aisaila@xxxxxxxxxxxxxxx>
> 

emulate parts...

Acked-by: Paul Durrant <paul@xxxxxxx>

> ---
> Changes since V12:
>       - Update commit message
>       - Rework switch() in map_linear_addr()/rep_movs()/rep_stos().
> ---
>  xen/arch/x86/hvm/emulate.c        | 57 +++++++++++++++++-----
>  xen/arch/x86/hvm/hvm.c            |  9 ++++
>  xen/arch/x86/hvm/intercept.c      |  2 +
>  xen/arch/x86/hvm/monitor.c        | 78 +++++++++++++++++++++++++++++++
>  xen/arch/x86/mm/mem_access.c      |  9 +++-
>  xen/arch/x86/mm/shadow/hvm.c      |  1 +
>  xen/include/asm-x86/hvm/monitor.h |  3 ++
>  xen/include/asm-x86/hvm/support.h |  1 +
>  xen/include/asm-x86/vm_event.h    |  2 +
>  9 files changed, 149 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 36bcb526d3..85ab009717 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -548,6 +548,7 @@ static void *hvmemul_map_linear_addr(
>      unsigned int nr_frames = ((linear + bytes - !!bytes) >> PAGE_SHIFT) -
>          (linear >> PAGE_SHIFT) + 1;
>      unsigned int i;
> +    gfn_t gfn;
> 
>      /*
>       * mfn points to the next free slot.  All used slots have a page reference
> @@ -582,7 +583,7 @@ static void *hvmemul_map_linear_addr(
>          ASSERT(mfn_x(*mfn) == 0);
> 
>          res = hvm_translate_get_page(curr, addr, true, pfec,
> -                                     &pfinfo, &page, NULL, &p2mt);
> +                                     &pfinfo, &page, &gfn, &p2mt);
> 
>          switch ( res )
>          {
> @@ -599,8 +600,15 @@ static void *hvmemul_map_linear_addr(
>              err = NULL;
>              goto out;
> 
> -        case HVMTRANS_gfn_paged_out:
> +        case HVMTRANS_need_retry:
> +            /*
> +             * hvm_translate_get_page() does not return HVMTRANS_need_retry.
> +             * It can be dropped if future work requires this.
> +             */
> +            ASSERT_UNREACHABLE();
> +            /* fall through */
>          case HVMTRANS_gfn_shared:
> +        case HVMTRANS_gfn_paged_out:
>              err = ERR_PTR(~X86EMUL_RETRY);
>              goto out;
> 
> @@ -626,6 +634,14 @@ static void *hvmemul_map_linear_addr(
> 
>              ASSERT(p2mt == p2m_ram_logdirty || !p2m_is_readonly(p2mt));
>          }
> +
> +        if ( unlikely(curr->arch.vm_event) &&
> +             curr->arch.vm_event->send_event &&
> +             hvm_monitor_check_p2m(addr, gfn, pfec, npfec_kind_with_gla) )
> +        {
> +            err = ERR_PTR(~X86EMUL_RETRY);
> +            goto out;
> +        }
>      }
> 
>      /* Entire access within a single frame? */
> @@ -1141,6 +1157,7 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data,
> 
>      case HVMTRANS_gfn_paged_out:
>      case HVMTRANS_gfn_shared:
> +    case HVMTRANS_need_retry:
>          return X86EMUL_RETRY;
>      }
> 
> @@ -1192,6 +1209,7 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data,
> 
>      case HVMTRANS_gfn_paged_out:
>      case HVMTRANS_gfn_shared:
> +    case HVMTRANS_need_retry:
>          return X86EMUL_RETRY;
>      }
> 
> @@ -1852,19 +1870,27 @@ static int hvmemul_rep_movs(
> 
>      xfree(buf);
> 
> -    if ( rc == HVMTRANS_gfn_paged_out )
> -        return X86EMUL_RETRY;
> -    if ( rc == HVMTRANS_gfn_shared )
> -        return X86EMUL_RETRY;
> -    if ( rc != HVMTRANS_okay )
> +    switch ( rc )
>      {
> -        gdprintk(XENLOG_WARNING, "Failed memory-to-memory REP MOVS: sgpa=%"
> -                 PRIpaddr" dgpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n",
> -                 sgpa, dgpa, *reps, bytes_per_rep);
> -        return X86EMUL_UNHANDLEABLE;
> +    case HVMTRANS_need_retry:
> +        /*
> +         * hvm_copy_to_guest_phys() does not return HVMTRANS_need_retry.
> +         * It can be dropped if future work requires this.
> +         */
> +        ASSERT_UNREACHABLE();
> +        /* fall through */
> +    case HVMTRANS_gfn_paged_out:
> +    case HVMTRANS_gfn_shared:
> +        return X86EMUL_RETRY;
> +    case HVMTRANS_okay:
> +        return X86EMUL_OKAY;
>      }
> 
> -    return X86EMUL_OKAY;
> +    gdprintk(XENLOG_WARNING, "Failed memory-to-memory REP MOVS: sgpa=%"
> +             PRIpaddr" dgpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n",
> +             sgpa, dgpa, *reps, bytes_per_rep);
> +
> +    return X86EMUL_UNHANDLEABLE;
>  }
> 
>  static int hvmemul_rep_stos(
> @@ -1966,6 +1992,13 @@ static int hvmemul_rep_stos(
> 
>          switch ( rc )
>          {
> +        case HVMTRANS_need_retry:
> +            /*
> +             * hvm_copy_to_guest_phys() does not return HVMTRANS_need_retry.
> +             * It can be dropped if future work requires this.
> +             */
> +            ASSERT_UNREACHABLE();
> +            /* fall through */
>          case HVMTRANS_gfn_paged_out:
>          case HVMTRANS_gfn_shared:
>              return X86EMUL_RETRY;
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index fdb1e17f59..c82e7b2cd3 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -3236,6 +3236,15 @@ static enum hvm_translation_result __hvm_copy(
>              return HVMTRANS_bad_gfn_to_mfn;
>          }
> 
> +        if ( unlikely(v->arch.vm_event) &&
> +             (flags & HVMCOPY_linear) &&
> +             v->arch.vm_event->send_event &&
> +             hvm_monitor_check_p2m(addr, gfn, pfec, npfec_kind_with_gla) )
> +        {
> +            put_page(page);
> +            return HVMTRANS_need_retry;
> +        }
> +
>          p = (char *)__map_domain_page(page) + (addr & ~PAGE_MASK);
> 
>          if ( flags & HVMCOPY_to_guest )
> diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
> index aac22c595d..90202bdcec 100644
> --- a/xen/arch/x86/hvm/intercept.c
> +++ b/xen/arch/x86/hvm/intercept.c
> @@ -145,6 +145,7 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
>                  case HVMTRANS_bad_linear_to_gfn:
>                  case HVMTRANS_gfn_paged_out:
>                  case HVMTRANS_gfn_shared:
> +                case HVMTRANS_need_retry:
>                      ASSERT_UNREACHABLE();
>                      /* fall through */
>                  default:
> @@ -174,6 +175,7 @@ int hvm_process_io_intercept(const struct hvm_io_handler *handler,
>                  case HVMTRANS_bad_linear_to_gfn:
>                  case HVMTRANS_gfn_paged_out:
>                  case HVMTRANS_gfn_shared:
> +                case HVMTRANS_need_retry:
>                      ASSERT_UNREACHABLE();
>                      /* fall through */
>                  default:
> diff --git a/xen/arch/x86/hvm/monitor.c b/xen/arch/x86/hvm/monitor.c
> index 2a41ccc930..7fb1e2c04e 100644
> --- a/xen/arch/x86/hvm/monitor.c
> +++ b/xen/arch/x86/hvm/monitor.c
> @@ -23,8 +23,10 @@
>   */
> 
>  #include <xen/vm_event.h>
> +#include <xen/mem_access.h>
>  #include <xen/monitor.h>
>  #include <asm/hvm/monitor.h>
> +#include <asm/altp2m.h>
>  #include <asm/monitor.h>
>  #include <asm/paging.h>
>  #include <asm/vm_event.h>
> @@ -215,6 +217,82 @@ void hvm_monitor_interrupt(unsigned int vector, unsigned int type,
>      monitor_traps(current, 1, &req);
>  }
> 
> +/*
> + * Send memory access vm_events based on pfec. Returns true if the event was
> + * sent, and false on a p2m_get_mem_access() error, when there is no
> + * violation, or when sending the event failed. Assumes the caller will
> + * enable/disable arch.vm_event->send_event.
> + */
> +bool hvm_monitor_check_p2m(unsigned long gla, gfn_t gfn, uint32_t pfec,
> +                           uint16_t kind)
> +{
> +    xenmem_access_t access;
> +    struct vcpu *curr = current;
> +    vm_event_request_t req = {};
> +    paddr_t gpa = (gfn_to_gaddr(gfn) | (gla & ~PAGE_MASK));
> +    int rc;
> +
> +    ASSERT(curr->arch.vm_event->send_event);
> +
> +    /*
> +     * p2m_get_mem_access() can fail for an invalid MFN and return -ESRCH,
> +     * in which case access must be restricted.
> +     */
> +    rc = p2m_get_mem_access(curr->domain, gfn, &access,
> +                            altp2m_vcpu_idx(curr));
> +
> +    if ( rc == -ESRCH )
> +        access = XENMEM_access_n;
> +    else if ( rc )
> +        return false;
> +
> +    switch ( access )
> +    {
> +    case XENMEM_access_x:
> +    case XENMEM_access_rx:
> +        if ( pfec & PFEC_write_access )
> +            req.u.mem_access.flags = MEM_ACCESS_R | MEM_ACCESS_W;
> +        break;
> +
> +    case XENMEM_access_w:
> +    case XENMEM_access_rw:
> +        if ( pfec & PFEC_insn_fetch )
> +            req.u.mem_access.flags = MEM_ACCESS_X;
> +        break;
> +
> +    case XENMEM_access_r:
> +    case XENMEM_access_n:
> +        if ( pfec & PFEC_write_access )
> +            req.u.mem_access.flags |= MEM_ACCESS_R | MEM_ACCESS_W;
> +        if ( pfec & PFEC_insn_fetch )
> +            req.u.mem_access.flags |= MEM_ACCESS_X;
> +        break;
> +
> +    case XENMEM_access_wx:
> +    case XENMEM_access_rwx:
> +    case XENMEM_access_rx2rw:
> +    case XENMEM_access_n2rwx:
> +    case XENMEM_access_default:
> +        break;
> +    }
> +
> +    if ( !req.u.mem_access.flags )
> +        return false; /* no violation */
> +
> +    if ( kind == npfec_kind_with_gla )
> +        req.u.mem_access.flags |= MEM_ACCESS_FAULT_WITH_GLA |
> +                                  MEM_ACCESS_GLA_VALID;
> +    else if ( kind == npfec_kind_in_gpt )
> +        req.u.mem_access.flags |= MEM_ACCESS_FAULT_IN_GPT |
> +                                  MEM_ACCESS_GLA_VALID;
> +
> +
> +    req.reason = VM_EVENT_REASON_MEM_ACCESS;
> +    req.u.mem_access.gfn = gfn_x(gfn);
> +    req.u.mem_access.gla = gla;
> +    req.u.mem_access.offset = gpa & ~PAGE_MASK;
> +
> +    return monitor_traps(curr, true, &req) >= 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/x86/mm/mem_access.c b/xen/arch/x86/mm/mem_access.c
> index 0144f92b98..320b9fe621 100644
> --- a/xen/arch/x86/mm/mem_access.c
> +++ b/xen/arch/x86/mm/mem_access.c
> @@ -210,11 +210,18 @@ bool p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>              return true;
>          }
>      }
> +
> +    /*
> +     * Try to avoid sending a mem event. Suppress events caused by page
> +     * walks by emulating, while still checking for mem_access violations.
> +     */
>      if ( vm_event_check_ring(d->vm_event_monitor) &&
>           d->arch.monitor.inguest_pagefault_disabled &&
> -         npfec.kind != npfec_kind_with_gla ) /* don't send a mem_event */
> +         npfec.kind == npfec_kind_in_gpt )
>      {
> +        v->arch.vm_event->send_event = true;
>          hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op, X86_EVENT_NO_EC);
> +        v->arch.vm_event->send_event = false;
> 
>          return true;
>      }
> diff --git a/xen/arch/x86/mm/shadow/hvm.c b/xen/arch/x86/mm/shadow/hvm.c
> index 0aa560b7f5..48dfad4557 100644
> --- a/xen/arch/x86/mm/shadow/hvm.c
> +++ b/xen/arch/x86/mm/shadow/hvm.c
> @@ -139,6 +139,7 @@ hvm_read(enum x86_segment seg,
>          return X86EMUL_UNHANDLEABLE;
>      case HVMTRANS_gfn_paged_out:
>      case HVMTRANS_gfn_shared:
> +    case HVMTRANS_need_retry:
>          return X86EMUL_RETRY;
>      }
> 
> diff --git a/xen/include/asm-x86/hvm/monitor.h b/xen/include/asm-x86/hvm/monitor.h
> index f1af4f812a..325b44674d 100644
> --- a/xen/include/asm-x86/hvm/monitor.h
> +++ b/xen/include/asm-x86/hvm/monitor.h
> @@ -49,6 +49,9 @@ void hvm_monitor_interrupt(unsigned int vector, unsigned int type,
>                             unsigned int err, uint64_t cr2);
>  bool hvm_monitor_emul_unimplemented(void);
> 
> +bool hvm_monitor_check_p2m(unsigned long gla, gfn_t gfn, uint32_t pfec,
> +                           uint16_t kind);
> +
>  #endif /* __ASM_X86_HVM_MONITOR_H__ */
> 
>  /*
> diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
> index e989aa7349..1500e6c94b 100644
> --- a/xen/include/asm-x86/hvm/support.h
> +++ b/xen/include/asm-x86/hvm/support.h
> @@ -61,6 +61,7 @@ enum hvm_translation_result {
>      HVMTRANS_unhandleable,
>      HVMTRANS_gfn_paged_out,
>      HVMTRANS_gfn_shared,
> +    HVMTRANS_need_retry,
>  };
> 
>  /*
> diff --git a/xen/include/asm-x86/vm_event.h b/xen/include/asm-x86/vm_event.h
> index 23e655710b..66db9e1e25 100644
> --- a/xen/include/asm-x86/vm_event.h
> +++ b/xen/include/asm-x86/vm_event.h
> @@ -36,6 +36,8 @@ struct arch_vm_event {
>      bool set_gprs;
>      /* A sync vm_event has been sent and we're not done handling it. */
>      bool sync_event;
> +    /* Send mem access events from emulator */
> +    bool send_event;
>  };
> 
>  int vm_event_init_domain(struct domain *d);
> --
> 2.17.1

