Re: [Xen-devel] More network tests with xenoprofile this time
Santos, Jose Renato G wrote:
> Andrew,
>
> You may want to take a look at the following paper, which is being
> presented at VEE'05 (June 11 and 12, 2005):
> http://www.hpl.hp.com/research/dca/system/papers/xenoprof-vee05.pdf
>
> It presents network performance results using xenoprof. This was done
> for Xen 2.0.3. The profile you reported has some similarities with our
> results, although the exact numbers are different. But that is expected,
> since you are running a different version of Xen on different hardware.
>
> We have seen that a significant amount of time was spent on handling
> interrupts in Xen, as well. We have also seen that a significant amount
> of time is spent in the hypervisor (+/- 40%) for the dom1 <-> external
> case, measured both at dom1 and at dom0 (in our case we instrumented
> the receive side). When we run the benchmark on dom0, the time spent in
> Xen is reduced to +/- 20%. Most of this extra Xen overhead when running
> a guest seems to come from the page transfer between domain 0 and the
> guest (see Table 6 and the discussion in the paper).
>
> The paper omits the complete oprofile reports for brevity. I will be
> happy to send you any detailed oprofile report we have generated for
> the paper, if you want to compare it with your results. Just let me
> know ...
>
> Renato

Hi Renato,

The article was an interesting application of xenoprof. It seems like it
would be useful to also have data collected using cycle counts
(GLOBAL_POWER_EVENTS on P4) to give some indication of areas with
high-overhead operations. There may be some areas with a few very
expensive instructions. Calling attention to those areas would help
improve performance.

The increases in I-TLB and D-TLB events for Xen-domain0 shown in Figure 4
are surprising. Why would the working sets be that much larger for
Xen-domain0 than for regular Linux, particularly for code? Is there a
table similar to Table 3 for I-TLB event sample locations? Can't the VMM
use a 4-MB page? And the Xen-domain0 kernel shouldn't be that much larger
than a regular Linux kernel.

How were TLB flushes ruled out as a cause? Could the PERFCOUNTER_CPU
counters in perfc_defn.h be used to see if the VMM is doing a lot of TLB
flushes?

Also, how much of the I-TLB and D-TLB event counts is due to the P4
architecture? Are the results as dramatic on Athlon or AMD64 processors?

-Will
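To make the 4-MB page question concrete, here is a rough back-of-the-envelope sketch of the arithmetic behind it. The footprint and I-TLB entry count are made-up illustrative values, not numbers from the paper or from any measurement, and the helper names are hypothetical. It also assumes the code region is contiguous and sits inside a single aligned 4-MB region.

```python
# Back-of-the-envelope TLB arithmetic (illustrative only).
# All sizes below are hypothetical; none come from the paper or this thread.

KIB = 1024
MIB = 1024 * KIB


def pages_needed(footprint_bytes: int, page_bytes: int) -> int:
    """Pages (and so distinct TLB entries) needed to map a contiguous region."""
    return -(-footprint_bytes // page_bytes)  # ceiling division


def tlb_reach(entries: int, page_bytes: int) -> int:
    """Bytes of address space that `entries` TLB entries can cover."""
    return entries * page_bytes


hot_code = 2 * MIB   # hypothetical hot code footprint
itlb_entries = 64    # hypothetical I-TLB size, not a measured P4 value

with_4k = pages_needed(hot_code, 4 * KIB)
with_4m = pages_needed(hot_code, 4 * MIB)
reach_4k = tlb_reach(itlb_entries, 4 * KIB)

print("entries needed with 4-KB pages:", with_4k)
print("entries needed with one 4-MB page:", with_4m)
print("reach of the hypothetical I-TLB with 4-KB pages (KiB):", reach_4k // KIB)
```

Under these assumed numbers the 4-KB mapping needs far more entries than the hypothetical I-TLB holds, while a single 4-MB mapping needs one. Whether Xen or the domain0 kernel actually maps its text this way is exactly the open question above.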