Re: [Xen-devel] [PATCH] x86: add SSE-based copy_page()
On 12/01/2009 23:29, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:

> I finally got around to measuring this. On my two machines,
> an Intel "Weybridge" box and an Intel TBD quadcore box,
> the new sse2 code was at best nearly the same for cold cache
> and much worse for warm cache.
>
> I can't explain the sampling variation as I have interrupts off,
> a lock held, and pre-warmed TLB... I suppose maybe another
> processor could be causing rare TLB misses? But in any case
> the min number is probably best for comparison.
>
> I'm guessing the gcc optimizer for the memcpy code was tuned
> for an Intel pipeline... Jan, were you measuring on an
> AMD processor?
>
> I've included the raw data and measurement code below.

Seems like unless we dynamically choose the copy routine, we're better off
without the SSE2 alternative. Shall I revert it then?

 -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
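For illustration only, here is a rough sketch of what "dynamically choose the copy routine" could look like: time each candidate once at startup, keep the minimum over several samples (as suggested in the quoted message), and dispatch through a function pointer afterwards. This is not the code from the patch or from Xen. All names (`COPY_PAGE_SIZE`, `copy_page_fn`, `copy_page_alt`, `min_ticks`, `pick_copy_page`) are hypothetical, and a portable 64-bit word loop stands in for the SSE2 routine so the sketch builds on any platform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define COPY_PAGE_SIZE 4096                 /* assumption: 4 KiB pages */
#define COPY_PAGE_WORDS (COPY_PAGE_SIZE / sizeof(uint64_t))

/* Hypothetical signature shared by every candidate copy routine. */
typedef void (*copy_page_fn)(void *dst, const void *src);

/* Candidate 1: whatever memcpy the compiler or libc provides. */
static void copy_page_memcpy(void *dst, const void *src)
{
    memcpy(dst, src, COPY_PAGE_SIZE);
}

/* Candidate 2: stand-in for the SSE2 routine (NOT real SSE2 code).
 * Both pointers are assumed to refer to 8-byte-aligned uint64_t storage. */
static void copy_page_alt(void *dst, const void *src)
{
    uint64_t *d = dst;
    const uint64_t *s = src;
    for (size_t i = 0; i < COPY_PAGE_WORDS; i++)
        d[i] = s[i];
}

/* Smallest CPU time, in clock() ticks, over `samples` batches of `reps`
 * back-to-back copies. The minimum is used because it is the sample
 * least disturbed by outside noise. */
static clock_t min_ticks(copy_page_fn fn, int samples, int reps)
{
    static uint64_t src[COPY_PAGE_WORDS];
    static uint64_t dst[COPY_PAGE_WORDS];
    static volatile uint64_t sink;          /* discourages dropping the copies */
    clock_t best = 0;

    for (int s = 0; s < samples; s++) {
        clock_t t0 = clock();
        for (int r = 0; r < reps; r++)
            fn(dst, src);
        clock_t dt = clock() - t0;
        sink = dst[0];
        if (s == 0 || dt < best)
            best = dt;
    }
    return best;
}

/* Pick the faster candidate on this machine; ties go to memcpy. */
copy_page_fn pick_copy_page(int samples, int reps)
{
    assert(samples > 0 && reps > 0);
    clock_t t_memcpy = min_ticks(copy_page_memcpy, samples, reps);
    clock_t t_alt = min_ticks(copy_page_alt, samples, reps);
    return (t_alt < t_memcpy) ? copy_page_alt : copy_page_memcpy;
}
```

A caller would run `pick_copy_page()` once during initialisation, store the returned pointer, and call through it for every page copy. Note that this measures warm-cache behaviour only. Since the quoted numbers differ between cold and warm cache, a real selection would have to decide which case matters, and a hypervisor would more likely key the choice off the CPU vendor or model than off a boot-time benchmark.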