Re: [Xen-devel] Xen on ARM IRQ latency and scheduler overhead
On 18/02/17 00:41, Stefano Stabellini wrote:
> On Fri, 17 Feb 2017, Dario Faggioli wrote:
>> On Thu, 2017-02-09 at 16:54 -0800, Stefano Stabellini wrote:
>>> These are the results, in nanosec:
>>>
>>>                             AVG     MIN     MAX     WARM MAX
>>> NODEBUG no WFI              1890    1800    3170    2070
>>> NODEBUG WFI                 4850    4810    7030    4980
>>> NODEBUG no WFI credit2      2217    2090    3420    2650
>>> NODEBUG WFI credit2         8080    7890    10320   8300
>>>
>>> DEBUG no WFI                2252    2080    3320    2650
>>> DEBUG WFI                   6500    6140    8520    8130
>>> DEBUG WFI, credit2          8050    7870    10680   8450
>>>
>>> As you can see, depending on whether the guest issues a WFI or not
>>> while waiting for interrupts, the results change significantly.
>>> Interestingly, credit2 does worse than credit1 in this area.
>>>
>> I did some measuring myself, on x86, with different tools. cyclictest
>> is basically something very similar to Stefano's app.
>>
>> I've run it both within Dom0 and inside a guest. I also ran a Xen
>> build (in this case, only inside the guest).
>>
>>> We are down to 2000-3000ns. Then, I started investigating the
>>> scheduler. I measured how long it takes to run "vcpu_unblock":
>>> 1050ns, which is significant. I don't know what is causing the
>>> remaining 1000-2000ns, but I bet on another scheduler function.
>>> Do you have any suggestions on which one?
>>>
>> So, vcpu_unblock() calls vcpu_wake(), which then invokes the
>> scheduler's wakeup-related functions.
>>
>> If you time vcpu_unblock(), from beginning to end of the function,
>> you actually capture quite a few things. E.g., the scheduler lock is
>> taken inside vcpu_wake(), so you're basically including time spent
>> waiting on the lock in the estimation.
>>
>> That is probably ok (as in, lock contention definitely is something
>> relevant to latency), but it is expected for things to be rather
>> different between Credit1 and Credit2.
>>
>> I've, OTOH, tried to time SCHED_OP(wake) and SCHED_OP(do_schedule),
>> and here's the result. Numbers are in cycles (I've used RDTSC) and,
>> to make sure I obtained consistent and comparable numbers, I've set
>> the frequency scaling governor to performance.
>>
>> Dom0, [performance]
>>             cyclictest 1us     cyclictest 1ms     cyclictest 100ms
>> (cycles)    Credit1  Credit2   Credit1  Credit2   Credit1  Credit2
>> wakeup-avg  2429     2035      1980     1633      2535     1979
>> wakeup-max  14577    113682    15153    203136    12285    115164
>
> I am not that familiar with the x86 side of things, but the 113682 and
> 203136 look worrisome, especially considering that credit1 doesn't
> have them.

Dario,

Do you reckon those 'MAX' values could be the load balancer running
(both for credit1 and credit2)?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel