[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3] x86/mm: Suppresses vm_events caused by page-walks

To: Jan Beulich <JBeulich@xxxxxxxx>, Alexandru Isaila <aisaila@xxxxxxxxxxxxxxx>
From: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>
Date: Thu, 1 Mar 2018 12:53:26 +0200
Cc: tim@xxxxxxx, sstabellini@xxxxxxxxxx, wei.liu2@xxxxxxxxxx, george.dunlap@xxxxxxxxxxxxx, andrew.cooper3@xxxxxxxxxx, ian.jackson@xxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxx, tamas@xxxxxxxxxxxxx
Delivery-date: Thu, 01 Mar 2018 10:53:58 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 03/01/2018 12:09 PM, Jan Beulich wrote:
>>>> On 01.03.18 at 11:00, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> On 02/28/2018 03:56 PM, Jan Beulich wrote:
>>>>>> On 28.02.18 at 14:41, <JBeulich@xxxxxxxx> wrote:
>>>>>>> On 28.02.18 at 14:25, <aisaila@xxxxxxxxxxxxxxx> wrote:
>>>>> --- a/xen/arch/x86/mm/mem_access.c
>>>>> +++ b/xen/arch/x86/mm/mem_access.c
>>>>> @@ -137,6 +137,23 @@ bool p2m_mem_access_emulate_check(struct vcpu *v,
>>>>>      return violation;
>>>>>  }
>>>>>  
>>>>> +static void p2m_set_ad_bits(struct vcpu *v, paddr_t ga)
>>>>> +{
>>>>> +    struct hvm_hw_cpu ctxt;
>>>>> +    uint32_t pfec = 0;
>>>>> +
>>>>> +    hvm_funcs.save_cpu_ctxt(v, &ctxt);
>>>>> +
>>>>> +    if ( guest_cpu_user_regs()->eip == v->arch.pg_dirty.eip
>>>>> +         && ga == v->arch.pg_dirty.gla )
>>>>> +        pfec = PFEC_write_access;
>>>>> +
>>>>> +    paging_ga_to_gfn_cr3(v, ctxt.cr3, ga, &pfec, NULL);
>>>>> +
>>>>> +    v->arch.pg_dirty.eip = guest_cpu_user_regs()->eip;
>>>>> +    v->arch.pg_dirty.gla = ga;
>>>>> +}
>>>>
>>>> This being the only user of v->arch.pg_dirty, why is the new
>>>> sub-structure made part of struct arch_vcpu instead of
>>>> struct arch_vm_event (or some other sub-structure referenced
>>>> by pointer, such that struct arch_vcpu wouldn't grow in size?
>>>> And even without that, this is HVM-specific, so would at best
>>>> belong into the HVM sub-structure.
>>>>
>>>> The PFEC_write_access logic is completely unclear to me, despite
>>>> the attempt to describe this in the description. I'm pretty sure you
>>>> want a code comment here.
>>
>> The thinking here is this: if we've ended up in p2m_mem_access_check()
>> with npfec.kind != npfec_kind_with_gla, then that's an EPT fault caused
>> by a page walk, that can be performed "manually" once Xen tries to set
>> both the A and the D bits.
>>
>> So when it tries to set the A bit, we mark the attempt by storing the
>> RIP/GLA pair, so that when the function is called again for the same
>> values we know that that's an attempt to set the D bit, and we act on
>> that (so that we don't have to lift the page restrictions on the fault
>> page).
>>
>> If there's a cleaner way to achieve this it would be great.
> 
> "Cleaner" is a secondary goal. A correct way would be desirable
> for starters. I don't see what would prevent coming back here a
> second time because something somewhere having returned
> X86EMUL_RETRY, causing the insn to simply be restarted. This
> _may_ be possible to address by suitably flushing the two values
> under certain conditions.

Clearly corectness should be the goal, however what I had in mind was
that if there was some way we could know that the D bit is being set
without trying again, then this would be both cleaner and obviously correct.

As for X86EMUL_RETRY, one of the points of the patch is that on this
path no vm_event is being sent out at all, so no emulation attempt is
taking place (at least not on the vm_event processing path):

 bool p2m_mem_access_check(paddr_t gpa, unsigned long gla,
                           struct npfec npfec,
                           vm_event_request_t **req_ptr)
@@ -208,6 +225,16 @@ bool p2m_mem_access_check(paddr_t gpa, unsigned
long gla,
         }
     }

+    if ( vm_event_check_ring(d->vm_event_monitor) &&
+         d->arch.monitor.inguest_pagefault_disabled &&
+         npfec.kind != npfec_kind_with_gla ) /* don't send a mem_event */
+    {
+        v->arch.vm_event->emulate_flags = 0;
+        p2m_set_ad_bits(v, gla);
+
+        return true;
+    }
+
     *req_ptr = NULL; /* Vm_event path starts here. */
     req = xzalloc(vm_event_request_t);

Of course, that doesn't mean that there's no other possiblity to enter
p2m_set_ad_bits() twice in a way I'm not seeing now.


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

References:
- Re: [Xen-devel] [PATCH v3] x86/mm: Suppresses vm_events caused by page-walks
  - From: Razvan Cojocaru
- Re: [Xen-devel] [PATCH v3] x86/mm: Suppresses vm_events caused by page-walks
  - From: Jan Beulich

Prev by Date: Re: [Xen-devel] [PATCH 3/6] xen/sched: Improvements to the {alloc, free}_domdata() interfaces
Next by Date: Re: [Xen-devel] [PATCH v2 7/7] x86/build: Use new .nop directive when available
Previous by thread: Re: [Xen-devel] [PATCH v3] x86/mm: Suppresses vm_events caused by page-walks
Next by thread: Re: [Xen-devel] [PATCH 1/6] xen/domain: Reduce the quantity of initialisation for system domains
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.