Re: [Xen-devel] [PATCH] vmx/hap: optimize CR4 trapping
On Thu, Feb 15, 2018 at 06:57:46PM +0000, Andrew Cooper wrote:
> On 15/02/18 16:25, Roger Pau Monne wrote:
> > There are a bunch of bits in CR4 that should be allowed to be set
> > directly by the guest without requiring Xen intervention. Currently
> > this is already done by passing guest writes through into the CR4
> > used when running in non-root mode, but an expensive vmexit is taken
> > in order to do so.
> >
> > xenalyze reports the following when running a PV guest in shim mode:
> >
> >  CR_ACCESS        3885950  6.41s 17.04%  3957 cyc { 2361| 3378| 7920}
> >    cr4            3885940  6.41s 17.04%  3957 cyc { 2361| 3378| 7920}
> >    cr3                  1  0.00s  0.00%  3480 cyc { 3480| 3480| 3480}
> >      *[  0]             1  0.00s  0.00%  3480 cyc { 3480| 3480| 3480}
> >    cr0                  7  0.00s  0.00%  7112 cyc { 3248| 5960|17480}
> >    clts                 2  0.00s  0.00%  4588 cyc { 3456| 5720| 5720}
> >
> > After this change this turns into:
> >
> >  CR_ACCESS             12  0.00s  0.00%  9972 cyc { 3680|11024|24032}
> >    cr4                  2  0.00s  0.00% 17528 cyc {11024|24032|24032}
> >    cr3                  1  0.00s  0.00%  3680 cyc { 3680| 3680| 3680}
> >      *[  0]             1  0.00s  0.00%  3680 cyc { 3680| 3680| 3680}
> >    cr0                  7  0.00s  0.00%  9209 cyc { 4184| 7848|17488}
> >    clts                 2  0.00s  0.00%  8232 cyc { 5352|11112|11112}
> >
> > Note that this optimized trapping is currently only applied to guests
> > running with HAP on Intel hardware. If using shadow paging, more CR4
> > bits need to be trapped unconditionally, which makes this approach
> > unlikely to yield any important performance improvements.
> >
> > Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > ---
> > Cc: Jun Nakajima <jun.nakajima@xxxxxxxxx>
> > Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
> > Cc: Jan Beulich <jbeulich@xxxxxxxx>
> > Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > Cc: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>
> > Cc: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
> > ---
> >  xen/arch/x86/hvm/vmx/vmx.c  | 41 +++++++++++++++++++++++++++++++++++++++++
> >  xen/arch/x86/hvm/vmx/vvmx.c |  5 ++++-
> >  xen/arch/x86/monitor.c      |  5 +++--
> >  3 files changed, 48 insertions(+), 3 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> > index d35cf55982..9747b2a398 100644
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -1684,6 +1684,35 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
> >          }
> >
> >          __vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
> > +
> > +        if ( (v->domain->arch.monitor.write_ctrlreg_enabled &
> > +              monitor_ctrlreg_bitmask(VM_EVENT_X86_CR4)) ||
> > +             !paging_mode_hap(v->domain) )
> > +            /*
> > +             * If requested by introspection or running in shadow mode,
> > +             * trap all accesses to CR4.
>
> The monitor write_ctrlreg_onchangeonly feature was purposefully
> introduced to avoid sending PGE updates to the introspection agent. It
> would be ideal to include that in the mask calculation so introspection
> cases don't vmexit for PGE changes.

Oh, I hadn't realized that; I will include it.

> Also, AMD has similar capabilities, and (as of today) has gained CR
> monitoring.

OK, I will have to find some AMD hardware and prepare a similar patch,
but that's certainly for later.

> > +             *
> > +             * NB: the shadow path has not been optimized because it
> > +             * requires unconditionally trapping more CR4 bits, at
> > +             * which point the performance benefit of doing this is
> > +             * quite dubious.
> > +             */
> > +            __vmwrite(CR4_GUEST_HOST_MASK, ~0UL);
>
> It would be better to stash the calculated mask, rather than reading it
> back in the vmexit handler.
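[As an aside to the quoted discussion, before Roger's reply below: a
minimal standalone sketch, in plain C rather than Xen code, of the
selective trapping being talked about here. Instead of writing ~0UL into
CR4_GUEST_HOST_MASK, only the bits Xen has to own plus whatever an
introspection agent asked to monitor are trapped. The cr4_host_mask()
helper, its parameters and the choice of always-trapped bits are
assumptions for illustration; only the CR4 bit positions are
architectural.]

/*
 * Illustrative only, not Xen code: compute a selective CR4 guest/host
 * mask.  Bits set in the mask cause a vmexit on guest writes; clear
 * bits are owned directly by the guest.
 */
#include <stdio.h>
#include <stdbool.h>

#define X86_CR4_PGE   (1UL << 7)   /* toggled frequently for TLB flushes */
#define X86_CR4_VMXE  (1UL << 13)  /* must stay under the hypervisor's control */

static unsigned long cr4_host_mask(bool hap, unsigned long monitor_mask)
{
    if ( !hap )
        return ~0UL;            /* shadow paging: keep trapping everything */

    /* HAP: only hypervisor-owned bits plus introspection-requested bits trap. */
    return X86_CR4_VMXE | monitor_mask;
}

int main(void)
{
    /* HAP guest, no introspection: PGE toggles no longer cause vmexits. */
    printf("hap, no monitor:  %#lx\n", cr4_host_mask(true, 0));

    /* HAP guest with an agent that explicitly asked to monitor PGE writes. */
    printf("hap, monitor PGE: %#lx\n", cr4_host_mask(true, X86_CR4_PGE));

    /* Shadow paging: every CR4 write still traps. */
    printf("shadow:           %#lx\n", cr4_host_mask(false, 0));
    return 0;
}

[The onchangeonly remark above is, roughly, a request to narrow the
monitor_mask contribution further, so that enabling introspection does
not pull PGE back into the trapped set.]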
Right, I guess for coherency I should introduce a mask_cr[4] array to
hvm_vcpu. I could do that in a pre-patch if that sounds fine.

Thanks, Roger.
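[To make the "stash the calculated mask" / mask_cr[4] idea concrete,
another minimal standalone sketch, again plain C and not the actual
hvm_vcpu layout: the mask is computed once, programmed into the VMCS and
kept in a per-vCPU field, so the exit path can consult the cached copy
instead of doing a vmread of CR4_GUEST_HOST_MASK. The struct, the field
name, the vmwrite_stub() and the merge helper are illustrative
assumptions.]

/*
 * Illustrative only: stash the CR4 guest/host mask per vCPU so the
 * vmexit handler never has to read it back from the VMCS.
 */
#include <stdio.h>

/* VMCS encoding of CR4_GUEST_HOST_MASK, used only by the print stub. */
#define CR4_GUEST_HOST_MASK 0x6002UL

struct vcpu_sketch {
    unsigned long mask_cr[5];   /* cached guest/host masks, indexed by CR */
};

/* Stand-in for Xen's __vmwrite() VMCS accessor. */
static void vmwrite_stub(unsigned long field, unsigned long value)
{
    printf("vmwrite(%#lx) <- %#lx\n", field, value);
}

/* Update path: program the VMCS and remember the mask. */
static void set_cr4_mask(struct vcpu_sketch *v, unsigned long mask)
{
    v->mask_cr[4] = mask;
    vmwrite_stub(CR4_GUEST_HOST_MASK, mask);
}

/*
 * Exit path: combine a guest-written CR4 value with the bits the
 * hypervisor owns, using the cached mask rather than a vmread.
 */
static unsigned long resolve_cr4(const struct vcpu_sketch *v,
                                 unsigned long guest_value,
                                 unsigned long xen_value)
{
    unsigned long mask = v->mask_cr[4];

    return (xen_value & mask) | (guest_value & ~mask);
}

int main(void)
{
    struct vcpu_sketch v = { { 0 } };

    set_cr4_mask(&v, 1UL << 13);    /* e.g. keep CR4.VMXE hypervisor-owned */
    printf("resolved CR4: %#lx\n", resolve_cr4(&v, 0xb0, 1UL << 13));
    return 0;
}

[Whether such a cached field would live in hvm_vcpu, in the VMX-specific
substructure, or under the mask_cr[] name is exactly the sort of
pre-patch question raised in the reply above.]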