RE: [Xen-devel] [PATCH 4/5][RFC] lwp: adding support for AMD lightweight profiling
Hi Jan,

No, there is no quick way (like the TS bit) to keep track of LWP state. Here is an excerpt from the LWP spec:

"LWP does not support the "lazy" state save and restore that is possible for floating point and SSE state. It does not interact with the CR0.TS bit. Operating systems that support LWP must always do an XSAVE to preserve the old thread's LWP context and an XRSTOR to set up the new LWP context. The OS can continue to do a lazy switch of the FP and SSE state by ensuring that the corresponding bits in EDX:EAX are clear when it executes the XSAVE and XRSTOR to handle the LWP context."

Thanks,
-Wei

-----Original Message-----
From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
Sent: Monday, February 14, 2011 3:12 AM
To: Huang2, Wei
Cc: Gang Wei; xen-devel@xxxxxxxxxxxxxxxxxxx; KeirFraser
Subject: Re: [Xen-devel] [PATCH 4/5][RFC] lwp: adding support for AMD lightweight profiling

>>> On 11.02.11 at 17:29, Wei Huang <wei.huang2@xxxxxxx> wrote:
>--- a/xen/arch/x86/i387.c  Thu Feb 10 16:25:09 2011 -0600
>+++ b/xen/arch/x86/i387.c  Thu Feb 10 16:55:27 2011 -0600
>@@ -65,6 +65,55 @@
> static void init_fpu(void);
> static void restore_fpu(struct vcpu *v);
>
>+/* Save AMD LWP */
>+void xsave_lwp(struct vcpu *v)
>+{
>+    uint64_t lwpcb;
>+    bool_t ts;
>+    struct xsave_struct *xsave_area = v->arch.xsave_area;
>+
>+    if ( cpu_has_lwp )
>+    {
>+        /* Has LWP been used? */
>+        rdmsrl(MSR_AMD_LWP_CBADDR, lwpcb);

There's no way to track LWP-using state for a vCPU, is there? rdmsr seems pretty expensive for being used in the context switch unconditionally (on CPUs supporting LWP)...

>+        if ( !lwpcb ) {
>+            /* Guest might have turned off LWP. So clean the bit here. */
>+            xsave_area->xsave_hdr.xstate_bv &= ~XSTATE_LWP;
>+            return;
>+        }
>+
>+        /* Disable TS temporarily to avoid recursion. */
>+        ts = read_cr0() & X86_CR0_TS;
>+        clts();
>+        xsave(v, XSTATE_LWP);
>+        if ( ts )
>+            stts();

Together with the xrstor_lwp() ones, quite a few manipulations of CR0, and hence making context switch between two LWP-using vcpus pretty expensive. I'm sure some of this redundancy can be eliminated.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
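For reference, the masking rule in the spec excerpt quoted above (keep the FP and SSE bits clear in EDX:EAX when doing the XSAVE/XRSTOR for LWP) could be sketched roughly as below. This is only an illustration, not code from the patch: the helper names split_mask() and lwp_switch_mask() and the struct xsave_mask type are hypothetical, and the sketch assumes XSTATE_LWP is bit 62 of the component mask, with FP and SSE at bits 0 and 1.

```c
#include <assert.h>
#include <stdint.h>

/* XSAVE component bits. FP and SSE are bits 0 and 1. LWP is assumed
 * to be bit 62 here; the patch only names it XSTATE_LWP. */
#define XSTATE_FP   (1ULL << 0)
#define XSTATE_SSE  (1ULL << 1)
#define XSTATE_LWP  (1ULL << 62)

/* Hypothetical helper type: the two halves of the XSAVE/XRSTOR mask. */
struct xsave_mask {
    uint32_t eax;   /* low 32 bits of the component mask */
    uint32_t edx;   /* high 32 bits of the component mask */
};

/* Split a 64-bit component mask into the EDX:EAX pair. */
static struct xsave_mask split_mask(uint64_t components)
{
    struct xsave_mask m;

    m.eax = (uint32_t)components;
    m.edx = (uint32_t)(components >> 32);
    return m;
}

/* Mask for the unconditional LWP save/restore at context switch:
 * LWP set, FP and SSE clear so those can still be switched lazily. */
static struct xsave_mask lwp_switch_mask(void)
{
    uint64_t components = XSTATE_LWP;

    /* The spec excerpt requires the FP/SSE bits to stay clear. */
    assert((components & (XSTATE_FP | XSTATE_SSE)) == 0);
    return split_mask(components);
}
```

With these assumptions the LWP-only mask has nothing set in EAX and a single bit set in EDX, so an XSAVE or XRSTOR issued with it would leave the lazily switched FP/SSE state alone.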