Re: [Xen-devel] Delays on usleep calls
On lun, 2014-01-20 at 18:05 +0200, Pavlo Suikov wrote:
> > x86 or ARM host?
>
> ARM. ARMv7, TI Jacinto6 to be precise.
>
Ok.

> > Also, how many pCPUs and vCPUs do the host and the various guests have?
>
> 2 pCPUs, 4 vCPUs: 2 vCPUs per domain.
>
Right. So you are overbooking the platform a bit. Don't get me wrong, that's not only legitimate, it's actually a good thing, if only because it gives us something nice to play with from the Xen scheduling perspective. If you just had #vCPUs==#pCPUs, that would be way more boring! :-)

That being said, would it be a problem, as a temporary measure during this first phase of testing and benchmarking, to change that a bit? I'm asking because I think it could help isolate the various causes of the issues you're seeing, and hence tackle and resolve them.

> > Are you using any vCPU-to-pCPU pinning?
>
> No.
>
Ok, so, if, as said above, you can do that, I'd try the following. With the credit scheduler (after having cleared/disabled the rate limiting thing), go for 1 vCPU in Dom0 and 1 vCPU in DomU. Also, pin both, and do it to different pCPUs. I think booting with "dom0_max_vcpus=1 dom0_vcpus_pin" in the Xen command line would do the trick for Dom0. For DomU, you just put a "cpus=X" entry in the config file, as soon as you see which pCPU Dom0 is _not_ pinned to (I suspect Dom0 will end up pinned to pCPU #0, so you should use "cpus=1" for the DomU). With that configuration, repeat the tests (I'll put the whole recipe together a bit further down).

Basically, what I'm asking you to do is to completely kick the Xen scheduler out of the picture, for now, to try getting some baseline numbers. Nicely enough, when using only 1 vCPU for both Dom0 and DomU, you also pretty much rule out most of the Linux scheduler's logic (not everything, but at least the part about load balancing). To push even harder on the latter, I'd boost the priority of the test program (I'm still talking about inside the Linux guest) to some high rtprio level.

What all of the above should give you is an estimation of the current lower bound on the latency and jitter you can get. If that's already not good enough (provided I did not make any glaring mistake in the instructions above :-D), then we know that there are areas other than the scheduler that need some intervention, and we can start looking for which ones, and what to do about them. Also, whether or not what you get is enough, one can start working on finding out which scheduler, and/or which set of scheduling parameters, is able to replicate the 'static scenario', or get close enough to it, reliably.

What do you think?

> We did additional measurements and, as you can see, my first impression
> was not quite correct: a difference between dom0 and domU exists and is
> quite observable on a larger scale. On the same setup running bare metal
> without Xen, the number of times t > 32 is close to 0; on the setup with
> Xen but without the domU system running, the number of times t > 32 is
> close to 0 as well.
>
I appreciate that. Given the many actors and factors involved, I think the only way to figure out what's going on is to try isolating the various components as much as we can... That's why I'm suggesting to consider a very, very, very simple situation first, at least wrt scheduling.

> We will make additional measurements with Linux (not Android) as a
> domU guest, though.
>
Ok.
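To sum up the suggested setup in one place (the rtprio value and the test program name below are just placeholders, and I'm assuming an xl recent enough to have the -r/--ratelimit_us switch; if it doesn't, there is also the sched_ratelimit_us Xen boot parameter):

Xen command line (Dom0 with 1 vCPU, pinned):
  dom0_max_vcpus=1 dom0_vcpus_pin

DomU config file (1 vCPU, pinned to the pCPU Dom0 is not on):
  vcpus = 1
  cpus = "1"

Clear credit's rate limiting (0 disables it), and double check:
  # xl sched-credit -s -r 0
  # xl sched-credit

Inside the guest, run the test program at a high real-time priority (SCHED_FIFO via chrt; 80 is just an example):
  # chrt -f 80 ./your_test_program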
> > # xl sched-sedf
>
> # xl sched-sedf
> Cpupool Pool-0:
> Name          ID  Period  Slice  Latency  Extra  Weight
> Domain-0       0     100      0        0      1       0
> android_4.3    1     100      0        0      1       0
>
May I ask for the output of # xl list -n and # xl vcpu-list in the sEDF case too?

That being said, I suggest you not spend much time on sEDF for now. As it is, it's broken, especially on SMP, so we either need to re-engineer it properly, or turn toward RT-Xen (and, e.g., help Sisu and his team to upstream it). I think we should have a discussion about the above, outside of and beyond this thread... I'll bring it up in the proper way ASAP.

> > Oh, and now that I think about it, something that's present in credit
> > and not in sEDF that might be worth checking is the scheduling rate
> > limiting thing.
>
> We'll check it out, thanks!
>
Right. One other thing that I forgot to mention: the timeslice. Credit uses, by default, 30ms as its scheduling timeslice which, I think, is quite high for latency-sensitive workloads like yours (Linux typically uses 1, 3.33, 4 or 10 ms).

# xl sched-credit
Cpupool Pool-0:    tslice=30ms ratelimit=1000us
Name                ID  Weight  Cap
Domain-0             0     256    0
vm.guest.osstest     9     256    0

I think another thing worth trying is running the experiments with that lowered a bit, e.g.:

# xl sched-credit -s -t 1
# xl sched-credit
Cpupool Pool-0:    tslice=1ms ratelimit=1000us
Name                ID  Weight  Cap
Domain-0             0     256    0
vm.guest.osstest     9     256    0

Regards,
Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
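P.S. One thing to keep in mind when playing with the timeslice: the ratelimit (still 1000us in the output above) sets a floor on how long a vCPU runs before it can be preempted, so lowering the timeslice while leaving the ratelimit alone only buys you so much. If I'm not misremembering the xl syntax, both can be set in one go, something like:

  # xl sched-credit -s -t 1 -r 0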