Re: [Xen-devel] [BUG] mistakenly wake in Xen's credit scheduler
On Tue, Oct 27, 2015 at 3:44 AM, George Dunlap <dunlapg@xxxxxxxxx> wrote:
> On Tue, Oct 27, 2015 at 5:59 AM, suokun <suokunstar@xxxxxxxxx> wrote:
>> Hi all,
>>
>> The BOOST mechanism in Xen's credit scheduler is designed to prioritize
>> a VM that runs an I/O-intensive application so that it can handle I/O
>> requests in time. However, this does not always work as expected.
>
> Thanks for the exploration, and the analysis.
>
> The BOOST mechanism is part of the reason I began to write the credit2
> scheduler, which we are hoping (any day now) to make the default
> scheduler. It was designed specifically with the workload you mention
> in mind. Would you care to try your test again and see how it fares?
>

Hi, George,

Thank you for your reply. I tested credit2 this morning. The I/O
performance is correct; the CPU accounting, however, does not seem to be.

Here is my experiment on credit2:
VM-IO:  1 vCPU pinned to a pCPU, running netperf
VM-CPU: 1 vCPU pinned to the same pCPU, running a while(1) loop

The throughput of netperf is the same (941 Mbps) as when VM-IO runs
alone. However, when I use xl top to show the VM CPU utilization, VM-IO
takes 73% of the CPU time and VM-CPU takes 99%; their sum is more than
100%. I suspect this is due to the CPU utilization accounting in the
credit2 scheduler.

> Also, do you have a patch to fix it in credit1? :-)
>

For the patch to my problem in credit1, I have two ideas:
1) if the vCPU cannot migrate (e.g. it is pinned, its CPU affinity
   allows only the current pCPU, or there is only one physical CPU),
   do not set the _VPF_migrating flag;
2) let vCPUs in the BOOST state preempt each other.

I have tested both separately and they both work, but personally I
prefer the first option because it solves the problem at the source.
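Here is a rough sketch of idea 1 against __runq_tickle() in
sched_credit.c. This is only to illustrate the intent, not the exact
patch I tested; the helper and field names (cpumask_subset, cpumask_of,
cpu_hard_affinity) are from my 4.6 tree and may need adjusting, and it
ignores cpupools and soft affinity for simplicity:

static inline void __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
{
    ...
    if ( new_idlers_empty && new->pri > cur->pri )
    {
        SCHED_STAT_CRANK(tickle_idlers_none);
        SCHED_VCPU_STAT_CRANK(cur, kicked_away);

        /*
         * Only mark cur for migration if its hard affinity allows it to
         * run on at least one pCPU other than this one.  A vCPU pinned
         * to this pCPU alone gets "migrated" back onto the very same
         * pCPU, and the pointless pause/wake cycle that follows is what
         * hands it an undeserved BOOST.
         */
        if ( !cpumask_subset(cur->vcpu->cpu_hard_affinity, cpumask_of(cpu)) )
        {
            SCHED_VCPU_STAT_CRANK(cur, migrate_r);
            SCHED_STAT_CRANK(migrate_kicked_away);
            set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
        }
        __cpumask_set_cpu(cpu, &mask);
    }
}

Idea 2 touches the priority comparison instead; I put a rough sketch of
it at the bottom of this mail, after the quoted report.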
Best,
Tony

> -George
>
>>
>>
>> (1) Problem description
>> --------------------------------
>> Suppose two VMs (named VM-I/O and VM-CPU) each have one virtual CPU
>> and both are pinned to the same physical CPU. An I/O-intensive
>> application (e.g. netperf) runs in VM-I/O and a CPU-intensive
>> application (e.g. an infinite loop) runs in VM-CPU. When a client
>> sends I/O requests to VM-I/O, its vCPU cannot reach the BOOST state
>> and obtains very few CPU cycles (less than 1% in Xen 4.6). Both the
>> throughput and the latency are terrible.
>>
>>
>>
>> (2) Problem analysis
>> --------------------------------
>> This problem is due to the wake mechanism in Xen: the CPU-intensive
>> workload is woken and boosted by mistake.
>>
>> Suppose the vCPU of VM-CPU is running when an I/O request comes in;
>> the current vCPU (the vCPU of VM-CPU) will be marked as
>> _VPF_migrating.
>>
>> static inline void __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
>> {
>>     ...
>>     if ( new_idlers_empty && new->pri > cur->pri )
>>     {
>>         SCHED_STAT_CRANK(tickle_idlers_none);
>>         SCHED_VCPU_STAT_CRANK(cur, kicked_away);
>>         SCHED_VCPU_STAT_CRANK(cur, migrate_r);
>>         SCHED_STAT_CRANK(migrate_kicked_away);
>>         set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
>>         __cpumask_set_cpu(cpu, &mask);
>>     }
>> }
>>
>> The next time a schedule happens and prev is the vCPU of VM-CPU,
>> context_saved(prev) is executed. Because the vCPU has been marked
>> _VPF_migrating, it will then be woken up again.
>>
>> void context_saved(struct vcpu *prev)
>> {
>>     ...
>>
>>     if ( unlikely(test_bit(_VPF_migrating, &prev->pause_flags)) )
>>         vcpu_migrate(prev);
>> }
>>
>> Because the vCPU of VM-CPU is in the UNDER state at that point, the
>> wakeup changes it into the BOOST state, which was originally designed
>> for I/O-intensive vCPUs. When this happens, even though the vCPU of
>> VM-I/O becomes BOOST as well, it cannot get the physical CPU
>> immediately but has to wait until the vCPU of VM-CPU is scheduled
>> out. That harms the I/O performance significantly.
>>
>>
>>
>> (3) Our test results
>> --------------------------------
>> Hypervisor: Xen 4.6
>> Dom0 & DomU: Linux 3.18
>> Client: Linux 3.18
>> Network: 1 Gigabit Ethernet
>>
>> Throughput:
>> Only VM-I/O: 941 Mbps
>> co-run VM-I/O and VM-CPU: 32 Mbps
>>
>> Latency:
>> Only VM-I/O: 78 usec
>> co-run VM-I/O and VM-CPU: 109093 usec
>>
>>
>>
>> This bug has been there since Xen 4.2 and still exists in the latest
>> Xen 4.6.
>> Thanks.
>> Reported by Tony Suo and Yong Zhao from UCCS
>>
>> --
>>
>> **********************************
>>> Tony Suo
>>> Email: suokunstar@xxxxxxxxx
>>> University of Colorado at Colorado Springs
>> **********************************
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxx
>> http://lists.xen.org/xen-devel
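P.S. Since the quoted analysis above describes exactly the
BOOST-vs-BOOST case, here is a rough sketch of idea 2 as well. Again,
this only shows the shape of the change, not the exact patch I tested;
CSCHED_PRI_TS_BOOST and the surrounding structure are from my 4.6 tree.
The idea is to let a waking BOOST vCPU request a reschedule even when
the running vCPU is also BOOST, instead of waiting for the running one
to be descheduled:

static inline void __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
{
    ...
    if ( new_idlers_empty && new->pri > cur->pri )
    {
        /* Unchanged: a strictly higher priority kicks cur away. */
        ...
    }
    /*
     * Additional case: the waking vCPU and the running vCPU are both
     * BOOST.  Just ask this pCPU to reschedule so the BOOST vCPUs take
     * turns, without marking cur for migration.
     */
    else if ( new_idlers_empty &&
              new->pri == CSCHED_PRI_TS_BOOST &&
              cur->pri == CSCHED_PRI_TS_BOOST )
    {
        __cpumask_set_cpu(cpu, &mask);
    }
    ...
}

The intent is that the vCPU of VM-I/O no longer has to sit behind
VM-CPU's spurious BOOST until the latter is scheduled out.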