Re: [Xen-devel] Xen on ARM IRQ latency and scheduler overhead
On Thu, 16 Feb 2017, Dario Faggioli wrote:
> On Fri, 2017-02-10 at 10:32 -0800, Stefano Stabellini wrote:
> > On Fri, 10 Feb 2017, Dario Faggioli wrote:
> > > Right, interesting use case. I'm glad to see there's some
> > > interest in it, and am happy to help investigating, and trying to
> > > make things better.
> >
> > Thank you!
> >
> Hey, FYI, I am looking into this. It's just that I've got a couple of
> other things on my plate right now.

OK

> > > Ok, do you (or anyone) mind explaining in a little bit more
> > > detail what the app tries to measure and how it does that?
> >
> > Have a look at app/xen/guest_irq_latency/apu.c:
> >
> > https://github.com/edgarigl/tbm/blob/master/app/xen/guest_irq_latency/apu.c
> >
> > This is my version, which uses the phys_timer (instead of the
> > virt_timer):
> >
> > https://github.com/sstabellini/tbm/blob/phys-timer/app/xen/guest_irq_latency/apu.c
> >
> Yep, I did look at those.
>
> > Edgar can jump in to add more info if needed (he is the author of
> > the app), but as you can see from the code, the app is very simple.
> > It sets a timer event in the future, then, after receiving the
> > event, it checks the current time and compares it with the
> > deadline.
> >
> Right, and you check the current time with:
>
>   now = aarch64_irq_get_stamp(el);
>
> which I guess is compatible with the values you use for the counter.

Yes
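For reference, here is a minimal, hypothetical sketch of the measurement just described; it is not the real apu.c. It assumes a bare-metal AArch64 guest in which the EL1 physical timer interrupt is already routed to timer_irq_handler(), uses the architectural counter/timer registers directly, and ignores the lost-wakeup race between the 'fired' check and WFI (real code would mask IRQs around it). The helper and variable names are made up for illustration.

#include <stdint.h>

static volatile uint64_t deadline;       /* programmed expiry, in counter ticks */
static volatile uint64_t latency_ticks;  /* measured delay, in counter ticks */
static volatile int fired;

/* Read the physical counter (CNTPCT_EL0). */
static inline uint64_t read_cntpct(void)
{
    uint64_t t;
    __asm__ volatile("mrs %0, cntpct_el0" : "=r"(t));
    return t;
}

/* Timer IRQ handler: compare "now" against the programmed deadline. */
void timer_irq_handler(void)
{
    latency_ticks = read_cntpct() - deadline;
    __asm__ volatile("msr cntp_ctl_el0, %0" :: "r"(0UL)); /* disable the timer */
    fired = 1;
}

/*
 * One sample: arm the EL1 physical timer 'delay' ticks in the future, then
 * wait for the interrupt either by sleeping in WFI or by busy-polling (the
 * "no WFI" rows in the tables below).  The result is in counter ticks;
 * convert to nanoseconds with the frequency from CNTFRQ_EL0.
 */
uint64_t measure_once(uint64_t delay, int use_wfi)
{
    fired = 0;
    deadline = read_cntpct() + delay;
    __asm__ volatile("msr cntp_cval_el0, %0" :: "r"(deadline));
    __asm__ volatile("msr cntp_ctl_el0, %0" :: "r"(1UL));  /* enable, IRQ unmasked */

    while ( !fired )
    {
        if ( use_wfi )
            __asm__ volatile("wfi");
    }

    return latency_ticks;
}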
> > > > These are the results, in nanosec:
> > > >
> > > >                          AVG    MIN    MAX   WARM MAX
> > > > NODEBUG no WFI          1890   1800   3170       2070
> > > > NODEBUG WFI             4850   4810   7030       4980
> > > > NODEBUG no WFI credit2  2217   2090   3420       2650
> > > > NODEBUG WFI credit2     8080   7890  10320       8300
> > > >
> > > > DEBUG no WFI            2252   2080   3320       2650
> > > > DEBUG WFI               6500   6140   8520       8130
> > > > DEBUG WFI, credit2      8050   7870  10680       8450
> > > >
> > > > DEBUG means Xen DEBUG build.
> [...]
> > > > As you can see, depending on whether the guest issues a WFI or
> > > > not while waiting for interrupts, the results change
> > > > significantly. Interestingly, credit2 does worse than credit1 in
> > > > this area.
> > > >
> > > This is with current staging, right?
> >
> > That's right.
> >
> So, when you have the chance, can I see the output of
>
>   xl debug-key r
>   xl dmesg
>
> Both under Credit1 and Credit2?

I'll see what I can do.

> > > I can try sending a quick patch for disabling the tick when a CPU
> > > is idle, but I'd need your help in testing it.
> >
> > That might be useful. However, if I understand this right, we don't
> > actually want a periodic timer in Xen just to make the system more
> > responsive, do we?
> >
> IMO, no. I'd call that a hack, and I don't think we should go that
> route. Not until we have figured out and squeezed as much as possible
> all the other sources of latency, and that has proven not to be
> enough, at least.
>
> I'll send the patch.

> > > > Assuming that the problem is indeed the scheduler, one
> > > > workaround that we could introduce today would be to avoid
> > > > calling vcpu_unblock on guest WFI and call vcpu_yield instead.
> > > > This change makes things significantly better:
> > > >
> > > >                                       AVG    MIN    MAX   WARM MAX
> > > > DEBUG WFI (yield, no block)          2900   2190   5130       5130
> > > > DEBUG WFI (yield, no block) credit2  3514   2280   6180       5430
> > > >
> > > > Is that a reasonable change to make? Would it cause
> > > > significantly more power consumption in Xen (because
> > > > xen/arch/arm/domain.c:idle_loop might not be called anymore)?
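As a point of reference, here is a rough, hypothetical sketch of what that workaround amounts to on Xen's WFI trap path. It is not the actual code in xen/arch/arm/traps.c: vcpu_yield(), vcpu_block_unless_event_pending() and advance_pc() are existing helpers, but the handle_guest_wfi() wrapper and the yield_on_guest_wfi flag are made up for illustration, as if this lived inside traps.c where the needed declarations are in scope.

/*
 * Illustrative only: the real WFI/WFE trap handling in
 * xen/arch/arm/traps.c is structured differently.  The point is the
 * one-line difference between blocking the vCPU and merely yielding.
 */
static bool yield_on_guest_wfi;   /* hypothetical knob; false = current behaviour */

void handle_guest_wfi(struct cpu_user_regs *regs, const union hsr hsr)
{
    if ( yield_on_guest_wfi )
        /*
         * Workaround: keep the vCPU runnable and just give up the pCPU.
         * The wakeup on interrupt delivery becomes much cheaper, but the
         * pCPU no longer reaches Xen's idle_loop() (and thus a physical
         * WFI), which is where the power consumption question comes from.
         */
        vcpu_yield();
    else
        /* Current behaviour: block the vCPU until an event is pending. */
        vcpu_block_unless_event_pending(current);

    /* Step the guest PC past the trapped WFI instruction. */
    advance_pc(regs, hsr);
}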
> > > Exactly. So, I think that, as Linux has 'idle=poll', it is
> > > conceivable to have something similar in Xen, and if we do, I
> > > guess it can be implemented as you suggest.
> > >
> > > But, no, I don't think this is satisfying as a default, not before
> > > trying to figure out what is going on, and whether we can improve
> > > things in other ways.
> >
> > OK. Should I write a patch for that? I guess it would be ARM
> > specific initially. What do you think would be a good name for the
> > option?
> >
> Well, I think such an option may be useful on other arches too, but we
> had better measure/verify that before. Therefore, I'd be OK for this
> to be implemented only on ARM for now.
>
> As for the name, I actually like 'idle=', and as values, what about
> 'sleep' or 'block' for the current default, and sticking to 'poll' for
> the new behavior you'll implement? Or do you think it is at risk of
> confusion with Linux?
>
> An alternative would be something like 'wfi=[sleep,idle]' or
> 'wfi=[block,poll]', but that is ARM specific, and it'd mean we would
> need another option for making x86 behave similarly.

That's a good idea. vwfi=[sleep,idle] looks like the right thing to
introduce, given that the option would be ARM-only at the moment and
that it's the virtual WFI behavior, not the physical WFI behavior, that
we are changing.
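For what it's worth, such an option could be wired up with Xen's existing custom_param() mechanism for boot parameters. The sketch below is purely illustrative and follows the naming discussed above; parse_vwfi and the yield_on_guest_wfi flag (from the earlier sketch) are hypothetical names, not an existing Xen interface.

/*
 * Hypothetical wiring of a "vwfi=" boot option: "sleep" keeps the current
 * blocking behaviour on guest WFI, "idle" switches it to a plain yield.
 * This just sets the illustrative yield_on_guest_wfi flag from the sketch
 * above; it is not actual Xen code.
 */
#include <xen/init.h>
#include <xen/lib.h>

static void __init parse_vwfi(const char *s)
{
    if ( !strcmp(s, "idle") )
        yield_on_guest_wfi = true;
    else               /* "sleep", or anything else: keep the default */
        yield_on_guest_wfi = false;
}
custom_param("vwfi", parse_vwfi);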
> > > But it would not be much more useful than that, IMO.
> >
> > Why? Actually, I know of several potential users of Xen on ARM
> > interested in exactly this use case. They only have a statically
> > defined number of guests, with a total number of vcpus lower than or
> > equal to the number of pcpus in the system. Wouldn't a scheduler
> > like that help in this scenario?
> >
> What I'm saying is that it would be rather inflexible. In the sense
> that it won't be possible to have statically pinned and dynamically
> moving vcpus in the same guest, it would be hard to control which vcpu
> is statically assigned to which pcpu, making a domain statically
> assigned would mean moving it to another cpupool (which is the only
> way to use a different scheduler in Xen right now), and things like
> this.
>
> I know there are static use cases... But I'm not entirely sure how
> static they really are, and whether they, in the end, will really like
> such a degree of inflexibility.

They are _very_ static :-) Think about the board on a mechanical robot
or a drone. VMs are only created at host boot and never again. In fact,
we are planning to introduce a feature in Xen to be able to create a
few VMs directly from the hypervisor, to skip the tools in Dom0 for
these cases.

> But anyway, indeed I can give you a scheduler that, provided it lives
> in a cpupool with M pcpus, as soon as a new domain with n vcpus is
> moved inside the pool, statically assigns its n0, n1, ..., nk (k <= M)
> vcpus to a pcpu, and always sticks with that. And we'll see what will
> happen! :-)

I am looking forward to it.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel