
Re: [Xen-devel] Question about VPID during MOV-TO-CR3



>>> On 23.09.16 at 22:45, <tamas.lengyel@xxxxxxxxxxxx> wrote:
> On Fri, Sep 23, 2016 at 9:50 AM, Tamas K Lengyel
> <tamas.lengyel@xxxxxxxxxxxx> wrote:
>> On Fri, Sep 23, 2016 at 9:39 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>>> On 23.09.16 at 17:26, <tamas.lengyel@xxxxxxxxxxxx> wrote:
>>>> On Fri, Sep 23, 2016 at 2:24 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>>>>> On 22.09.16 at 19:18, <tamas.lengyel@xxxxxxxxxxxx> wrote:
>>>>>> So I verified that when CPU-based load exiting is enabled, the TLB
>>>>>> flush here is critical. Without it the guest kernel crashes at random
>>>>>> points during boot. OTOH why does Xen trap every guest CR3 update
>>>>>> unconditionally? While we have features such as the vm_event/monitor
>>>>>> that may choose to subscribe to that event, Xen traps it even when
>>>>>> that is not in use. Is that trapping necessary for something else?
>>>>>
>>>>> Where do you see this being unconditional? construct_vmcs()
>>>>> clearly avoids setting these intercepts when using EPT. Are you
>>>>> perhaps suffering from
>>>>>
>>>>>             /* Trap CR3 updates if CR3 memory events are enabled. */
>>>>>             if ( v->domain->arch.monitor.write_ctrlreg_enabled &
>>>>>                  monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3) )
>>>>>                 v->arch.hvm_vmx.exec_control |= CPU_BASED_CR3_LOAD_EXITING;
>>>>>
>>>>> in vmx_update_guest_cr()? That'll be rather something for you
>>>>> or Razvan to explain. Outside of nested VMX I don't see any
>>>>> other enabling of that intercept (didn't check AMD code on the
>>>>> assumption that you're working on Intel hardware).
>>>>
>>>> So there seem to be two separate paths that lead to the TLB flushing.
>>>> One is indeed the above case you cited when we enable CR3 monitoring
>>>> through the monitor interface. However, during domain boot I also see
>>>> this path being called that is not related to the
>>>> CPU_BASED_CR3_LOAD_EXITING:
>>>>
>>>> (XEN) hap.c:739:d1v0 hap_update_paging_modes is calling hap_update_cr3
>>>> (XEN) hap.c:701:d1v0 HAP update cr3 called
>>>> (XEN) /src/xen/xen/include/asm/hvm/hvm.h:344:d1v0 HVM update guest cr3 called
>>>> (XEN) vmx.c:1549:d1v0 Update guest CR3 value=0x7a7c4000
>>>>
>>>> This path seems to de-activate once the domain is fully booted.
>>>
>>> This late? According to the CR0 handling in
>>> vmx_update_guest_cr() I would understand it to be enabled only
>>> while the guest is still in real mode (and even then only on old
>>> hardware, i.e. without the Unrestricted Guest functionality).
>>>
>>
>> Right, with unrestricted guest support I would assume none of this
>> would get called - but it does, and quite frequently during domain
>> boot. The CPU is an Intel(R) Xeon(R) CPU E5-2430.
>>
> 
> So I experimented with selectively disabling the flushing such that
> it's done only when coming from a path other than CPU-based CR3 load
> exiting. I've added a bool to struct vcpu that gets set to 0 every
> time vmx_vmexit_handler is called, and only gets set to 1 when
> vmx_cr_access reports a MOV-TO-CR3. Then in the vmx_update_guest_cr
> the flush only happens as such:
> 
>         if ( !v->movtocr3 )
>             hvm_asid_flush_vcpu(v);
> 
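The flag logic described above can be illustrated with a toy model. This is only a sketch for following the control flow: `toy_vcpu`, `toy_vmexit`, `toy_update_guest_cr3` and the `flushes` counter are hypothetical stand-ins, not Xen's real types or functions, and the counter merely simulates where hvm_asid_flush_vcpu() would be called.

```c
#include <stdbool.h>

/* Toy stand-ins only; none of these are Xen's real types or functions. */
struct toy_vcpu {
    bool movtocr3;         /* true only while handling a MOV-TO-CR3 exit */
    unsigned int flushes;  /* counts simulated hvm_asid_flush_vcpu() calls */
};

enum toy_exit {
    TOY_EXIT_MOV_TO_CR3,     /* CPU-based CR3 load exiting fired */
    TOY_EXIT_PAGING_UPDATE   /* any other path that updates guest CR3 */
};

/* Models the patched update path: flush unless we came from MOV-TO-CR3. */
static void toy_update_guest_cr3(struct toy_vcpu *v)
{
    if (!v->movtocr3)
        v->flushes++;
}

/* Models the exit handler: clear the flag on every exit, then set it
 * again only when the exit reason is a MOV-TO-CR3. */
static void toy_vmexit(struct toy_vcpu *v, enum toy_exit reason)
{
    v->movtocr3 = false;
    if (reason == TOY_EXIT_MOV_TO_CR3)
        v->movtocr3 = true;
    toy_update_guest_cr3(v);
}
```

The model shows only which exits skip the flush under this experiment. It does not say whether skipping it is architecturally safe.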
> In the guest I run a test application that allocates a page at a fixed
> VA, writes a magic value to it, and then keeps spinning on reading the
> magic value back from the page, checking if it's the same as
> originally supplied. I launch this application twice with different
> magic values, so that if the TLB invalidation is an issue one of the
> test applications would read back the wrong magic value from the VA
> using a stale TLB entry. I've verified that the same VA in the two
> applications points to different pages and that those PTEs are not
> marked global and no PCID is used.
> 
> [  724] test (struct addr:ffff88003730f330). PGD: 0x3731f000
> VADDR 0x5000000 -> PADDR 0x73e35000. Global page: 0
> [  727] test (struct addr:ffff88003681ea20). PGD: 0x777a6000
> VADDR 0x5000000 -> PADDR 0x75043000. Global page: 0
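A guest-side test of the kind described above might look roughly like the following. This is a minimal sketch, not the actual test program from this thread: it assumes a Linux guest with 4 KiB pages, uses mmap() with MAP_FIXED to obtain the fixed VA 0x5000000 seen in the output above, and the helper names `map_fixed_page` and `spin_check` are made up for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define TEST_VA ((void *)0x5000000UL)  /* fixed VA, as in the output above */
#define PAGE_SZ 4096UL                 /* assumption: 4 KiB pages */

/* Map one private anonymous page at exactly TEST_VA; NULL on failure.
 * Note that MAP_FIXED replaces any mapping already present there. */
static volatile uint64_t *map_fixed_page(void)
{
    void *p = mmap(TEST_VA, PAGE_SZ, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? NULL : (volatile uint64_t *)p;
}

/* Write the magic value once, then read it back `iterations` times.
 * Returns the number of reads that did not match the magic value. */
static unsigned long spin_check(volatile uint64_t *slot, uint64_t magic,
                                unsigned long iterations)
{
    unsigned long mismatches = 0;

    *slot = magic;
    for (unsigned long i = 0; i < iterations; i++)
        if (*slot != magic)
            mismatches++;
    return mismatches;
}
```

Two instances would be started with different magic values, each calling map_fixed_page() and then spin_check() in a long-running loop. A non-zero mismatch count in either process would point at a stale translation. A zero count shows only that no stale entry was hit during that run.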

I'm surprised. As said before - a mov-to-CR3 cannot be emulated
without a minimal amount of flushing. No experiments whatsoever
are suitable to prove the contrary.

> Both applications work as expected without the VPID flushing taking
> place. So at least for CPU-based CR3 load exiting it seems that this
> flush is not necessary. As for why this path gets called during domain
> boot when the CPU supports Unrestricted Guest mode and it is properly
> detected when Xen boots, I'm not sure. However, as we use CPU-based
> CR3 load exiting quite often when doing VMI, I would prefer to disable
> this flushing at least for this case. Any thoughts?

As said before - you'd better direct this question to the VMX
maintainers, and even better would be to first understand why
the intercept remains enabled in the first place. After all it's
quite obvious that most improvement can be expected from not
enabling it at all, whenever possible. Only if it needs to stay
enabled over extended periods of a guest's lifetime would it then
become interesting to see whether the emulation path can be
improved.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
