Re: [Xen-devel] [BUG] Linux process vruntime accounting in Xen
[Adding George again, and a few Linux/Xen folks]

On Sat, 2016-05-14 at 18:25 -0600, Tony S wrote:
> In virtualized environments, we sometimes need to limit the CPU
> resources of a virtual machine (VM). For example, in Xen we use
>
>   $ xl sched-credit -d 1 -c 50
>
> to limit the CPU resource of dom 1 to half of one physical CPU
> core. If the VM's CPU resource is capped, processes inside the VM
> will have a vruntime accounting problem. Here, I report my findings
> about the Linux process scheduler under the above scenario.
>
Thanks for this other report as well. :-)

All you say makes sense to me, and I will think about it. I'm not sure
about one thing, though...

> ------------Description------------
> Linux CFS relies on delta_exec to charge the vruntime of processes.
> The variable delta_exec is the difference between the time a process
> starts running on a CPU and the time it stops. This works well on a
> physical machine. However, in a virtual machine with capped
> resources, some processes might be charged an inaccurate vruntime.
>
> For example, suppose we have a VM which has one vCPU and is capped
> at 50% of a physical CPU. When process A inside the VM starts
> running and the CPU resource of that VM runs out, the VM will be
> paused. In the next round, when the VM is allocated new CPU resource
> and starts running again, process A stops running and is put back on
> the runqueue. The delta_exec of process A is accounted as its "real
> execution time" plus the paused time of its VM. That makes the
> vruntime of process A much larger than it should be, and process A
> will not be scheduled again for a long time, until the vruntimes of
> the other processes catch up with it.
> ---------------------------------------
>
>
> ------------Analysis----------------
> When a process stops running and is about to be put back on the
> runqueue, update_curr() is executed.
> [src/kernel/sched/fair.c]
>
> static void update_curr(struct cfs_rq *cfs_rq)
> {
>     ... ...
>     delta_exec = now - curr->exec_start;
>     ... ...
>     curr->exec_start = now;
>     ... ...
>     curr->sum_exec_runtime += delta_exec;
>     schedstat_add(cfs_rq, exec_clock, delta_exec);
>     curr->vruntime += calc_delta_fair(delta_exec, curr);
>     update_min_vruntime(cfs_rq);
>     ... ...
> }
>
> "now"        --> the current time
> "exec_start" --> the time when the current process was put on the CPU
> "delta_exec" --> the time between the process starting and stopping
>                  running on the CPU
>
> When a process starts running before its VM is paused and stops
> running after its VM is unpaused, delta_exec will include the time
> the VM was paused, which is large compared to the real execution
> time of the process.
>
... but would that also apply to a VM that is not scheduled for some
time, just because of pCPU contention and not because it was paused?

Isn't there anything in place in Xen or Linux (the latter being better
suited for something like this, IMHO) to compensate for that?

I have to admit I haven't really ever checked myself, maybe either
George or our Linux people do know more?

> This issue does great harm to the performance of the victim process.
> If the process is an I/O-bound workload, its throughput and latency
> will be affected. If the process is a CPU-bound workload, this issue
> will make its vruntime "unfair" compared to other processes under
> CFS.
>
> Because the CPU resource of some types of VMs in the cloud is
> limited as described above (like the Amazon EC2 t2.small instance),
> I suspect this also harms the performance of public cloud instances.
> ---------------------------------------
>
>
> My test environment is as follows: Hypervisor (Xen 4.5.0), Dom 0
> (Linux 3.18.21), Dom U (Linux 3.18.21). I also tested the longterm
> version Linux 3.18.30 and the latest longterm version, Linux 4.4.7.
> Those kernels all have this issue.
>
> Please confirm this bug. Thanks.
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel