
Re: [Xen-devel] [PATCH][RFC] FPU LWP 0/5: patch description



Hi Keir,

I ran a quick test to calculate the overhead of __fpu_unlazy_save() and __fpu_unlazy_restore(), which are used to save/restore LWP state. Here are the results:

(1) tsc_total: total time used for context_switch() in x86/domain.c
(2) tsc_unlazy: total time used for __fpu_unlazy_save() + __fpu_unlazy_restore()

One example:
(XEN) tsc_unlazy=0x00000000008ae174
(XEN) tsc_total=0x00000001028b4907

So the overhead is about 0.2% of the total time spent in context_switch(). Of course, this is a single data point; I would expect the overhead ratio to stay below 1% in most cases.

Thanks,
-Wei



On 04/14/2011 04:09 PM, Keir Fraser wrote:
On 14/04/2011 21:37, "Wei Huang" <wei.huang2@xxxxxxx> wrote:

The following patches support AMD lightweight profiling.

Because LWP isn't tracked by CR0.TS bit, we clean up the FPU code to
handle lazy and unlazy FPU states differently. Lazy FPU state (such as
SSE, YMM) is handled when #NM is triggered. Unlazy state, such as LWP,
is saved and restored on each vcpu context switch. To simplify the code,
we also add a mask option to xsave/xrstor function.
How much cost is added to context switch paths in the (overwhelmingly
likely) case that LWP is not being used by the guest? Is this adding a whole
lot of unconditional overhead for a feature that no one uses?

  -- Keir

Thanks,
-Wei



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel







 

