
RE: [Xen-devel] [PATCH 4/5][RFC] lwp: adding support for AMD lightweight profiling



Hi Jan,

No, there is no quick way (like the TS bit) to keep track of LWP state. Here is
an excerpt from the LWP spec:

"LWP does not support the "lazy" state save and restore that is possible for 
floating point and SSE state. It does not interact with the CR0.TS bit. 
Operating systems that support LWP must always do an XSAVE to preserve the old 
thread's LWP context and an XRSTOR to set up the new LWP context. The OS can 
continue to do a lazy switch of the FP and SSE state by ensuring that the 
corresponding bits in EDX:EAX are clear when it executes the XSAVE and XRSTOR 
to handle the LWP context."
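
To make the EDX:EAX part concrete, here is a rough user-space sketch of how the XSAVE/XRSTOR feature mask could be built so that only the LWP component is handled and the FP/SSE bits stay clear. The bit positions (FP = bit 0, SSE = bit 1, LWP = bit 62) reflect my understanding of the XCR0 layout, and the struct and helper names are made up for illustration; none of this is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed XCR0 feature bits (illustrative; check against the headers). */
#define XSTATE_FP   (1ULL << 0)
#define XSTATE_SSE  (1ULL << 1)
#define XSTATE_LWP  (1ULL << 62)

/* Hypothetical container for the two halves of the instruction mask. */
struct xsave_mask {
    uint32_t eax;   /* low 32 bits of the feature mask */
    uint32_t edx;   /* high 32 bits of the feature mask */
};

/*
 * Build the EDX:EAX pair for an XSAVE/XRSTOR that must not touch FP or
 * SSE state, so the lazy (CR0.TS based) switch of that state still works.
 */
static struct xsave_mask lwp_only_mask(uint64_t requested)
{
    uint64_t mask = requested & ~(XSTATE_FP | XSTATE_SSE);
    struct xsave_mask m = { (uint32_t)mask, (uint32_t)(mask >> 32) };

    /* FP and SSE live in the two lowest bits of EAX; both must be clear. */
    assert((m.eax & 0x3u) == 0);
    return m;
}
```

Called with XSTATE_FP | XSTATE_SSE | XSTATE_LWP, the helper leaves only bit 62 set, which ends up as bit 30 of EDX.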

Thanks,
-Wei

-----Original Message-----
From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx] 
Sent: Monday, February 14, 2011 3:12 AM
To: Huang2, Wei
Cc: Gang Wei; xen-devel@xxxxxxxxxxxxxxxxxxx; KeirFraser
Subject: Re: [Xen-devel] [PATCH 4/5][RFC] lwp: adding support for AMD 
lightweight profiling

>>> On 11.02.11 at 17:29, Wei Huang <wei.huang2@xxxxxxx> wrote:
>--- a/xen/arch/x86/i387.c      Thu Feb 10 16:25:09 2011 -0600
>+++ b/xen/arch/x86/i387.c      Thu Feb 10 16:55:27 2011 -0600
>@@ -65,6 +65,55 @@
> static void init_fpu(void);
> static void restore_fpu(struct vcpu *v);
> 
>+/* Save AMD LWP */
>+void xsave_lwp(struct vcpu *v)
>+{
>+    uint64_t lwpcb;
>+    bool_t ts;
>+    struct xsave_struct *xsave_area = v->arch.xsave_area;
>+
>+    if ( cpu_has_lwp )
>+    {
>+        /* Has LWP been used? */
>+        rdmsrl(MSR_AMD_LWP_CBADDR, lwpcb);

There's no way to track LWP-using state for a vCPU, is there?
rdmsr seems pretty expensive for being used in the context
switch unconditionally (on CPUs supporting LWP)...

>+        if ( !lwpcb ) {
>+            /* Guest might have turned off LWP. So clean the bit here. */
>+            xsave_area->xsave_hdr.xstate_bv &= ~XSTATE_LWP;
>+            return;
>+        }
>+        
>+        /* Disable TS temporarily to avoid recursion. */
>+        ts = read_cr0() & X86_CR0_TS;
>+        clts();
>+        xsave(v, XSTATE_LWP);
>+        if ( ts )
>+            stts();

Together with the ones in xrstor_lwp(), that is quite a few manipulations of
CR0, which makes a context switch between two LWP-using
vcpus pretty expensive. I'm sure some of this redundancy can be
eliminated.
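
One possible way to cut the redundancy, sketched as a user-space mock rather than real hypervisor code: sample TS once, clear it only if it was set, do both the save and the restore, and set it back once. This assumes xrstor_lwp() wraps its XRSTOR in the same clts()/stts() pattern as xsave_lwp(); the mock_* helpers and the write counter are hypothetical stand-ins so the two variants can be compared.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock CPU state; stand-ins for the real CR0 accessors, illustration only. */
static bool mock_ts;              /* models CR0.TS */
static unsigned int cr0_writes;   /* counts simulated CR0 updates */

static void mock_clts(void) { mock_ts = false; cr0_writes++; }
static void mock_stts(void) { mock_ts = true;  cr0_writes++; }
static void mock_xsave_lwp(void)  { /* XSAVE of the LWP state would go here */ }
static void mock_xrstor_lwp(void) { /* XRSTOR of the LWP state would go here */ }

/* Each helper toggles TS on its own (xrstor side is an assumption). */
static void switch_lwp_separate(void)
{
    bool ts = mock_ts;
    mock_clts();
    mock_xsave_lwp();
    if (ts)
        mock_stts();

    ts = mock_ts;
    mock_clts();
    mock_xrstor_lwp();
    if (ts)
        mock_stts();
}

/* Save and restore share a single TS-clear / TS-set pair. */
static void switch_lwp_batched(void)
{
    bool ts = mock_ts;

    if (ts)
        mock_clts();
    mock_xsave_lwp();
    mock_xrstor_lwp();
    if (ts)
        mock_stts();

    /* TS must end up exactly as it was found. */
    assert(mock_ts == ts);
}
```

In this mock, with TS initially set, the separate variant performs four CR0 updates per switch and the batched one two; with TS clear, the batched variant performs none.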

Jan



