
Re: [Xen-devel] [BUG] Bugs existing in Xen's credit scheduler cause long tail latency issues



On Tue, May 17, 2016 at 3:27 AM, George Dunlap <dunlapg@xxxxxxxxx> wrote:
> On Sun, May 15, 2016 at 5:11 AM, Tony S <suokunstar@xxxxxxxxx> wrote:
>> Hi all,
>>
>> When I was running latency-sensitive applications in VMs on Xen, I
>> found some bugs in the credit scheduler which cause long tail
>> latencies in I/O-intensive VMs.
>>
>>
>> (1) Problem description
>>
>> ------------Description------------
>> My test environment is as follows: hypervisor (Xen 4.5.0), Dom0 (Linux
>> 3.18.21), DomU (Linux 3.18.21).
>>
>> Environment setup:
>> We created two 1-vCPU, 4GB-memory VMs and pinned them onto one
>> physical CPU core. One VM (denoted as I/O-VM) ran the Sockperf server
>> program; the other VM (denoted as CPU-VM) ran a compute-bound task,
>> e.g., SPEC CPU2006 or simply a loop. A client on another physical
>> machine sent UDP requests to the I/O-VM.
>>
>> Here are my tail latency results (in microseconds):
>> Case    Avg      90%      99%      99.9%    99.99%
>> #1      108      114      128      129      130
>> #2      7811     13892    14874    15315    16383
>> #3      943      131      21755    26453    26553
>> #4      116      96       105      8217     13472
>> #5      116      117      129      131      132
>>
>> Bugs 1, 2, and 3 are discussed in the analysis below.
>>
>> Case #1:
>> I/O-VM was processing Sockperf requests from clients; CPU-VM was
>> idling (no processes running).
>>
>> Case #2:
>> I/O-VM was processing Sockperf requests from clients; CPU-VM was
>> running a compute-bound task.
>> The hypervisor is native Xen 4.5.0.
>>
>> Case #3:
>> I/O-VM was processing Sockperf requests from clients; CPU-VM was
>> running a compute-bound task.
>> The hypervisor is native Xen 4.5.0 with bug 1 fixed.
>>
>> Case #4:
>> I/O-VM was processing Sockperf requests from clients; CPU-VM was
>> running a compute-bound task.
>> The hypervisor is native Xen 4.5.0 with bugs 1 & 2 fixed.
>>
>> Case #5:
>> I/O-VM was processing Sockperf requests from clients; CPU-VM was
>> running a compute-bound task.
>> The hypervisor is native Xen 4.5.0 with bugs 1, 2 & 3 fixed.
>>
>> ---------------------------------------
>>
>>
>> (2) Problem analysis
>
> Hey Tony,
>
> Thanks for looking at this.  These issues in the credit1 algorithm are
> essentially the reason I started work on the credit2 scheduler several
> years ago.  We meant credit2 to have replaced credit1 by now, but we
> ran out of time to test it properly; we're in the process of doing
> that now, and are hoping it will be the default scheduler for the 4.8
> release.
>
> So let me make two suggestions that would make your effort more
> helpful to us:
>
> 1. Use cpupools for testing rather than pinning. A lot of the
> algorithms are designed with the assumption that they have all of the
> cpus to run on, and the credit allocation / priority algorithms fail
> to work properly when vcpus are merely pinned.  Cpupools were
> specifically designed to let the scheduler algorithms work as designed
> with a smaller number of cpus than the system has.
>
> 2. Test credit2. :-)
>

Hi George,

Thank you for your reply. I will try cpupools and credit2 later. :-)


> One comment about your analysis here...
>
>> [Bug 2]: In csched_acct() (run every 30ms by default), a VCPU stops
>> earning credits and is removed from the active VCPU list (in
>> __csched_vcpu_acct_stop_locked) if its credit is larger than the upper
>> bound. Because the domain has only one VCPU, the VM is also removed
>> from the active domain list.
>>
>> Every 10ms, csched_tick() --> csched_vcpu_acct() -->
>> __csched_vcpu_acct_start() is executed and tries to put inactive
>> VCPUs back on the active list. However, __csched_vcpu_acct_start()
>> only puts the current VCPU back on the active list. If an I/O-bound
>> VCPU is not the current VCPU at csched_tick() time, it will not be put
>> back on the active VCPU list, and so it will likely miss the next
>> credit refill in csched_acct() and can easily enter the OVER state.
>> As a result, the I/O-bound VM cannot be boosted and sees very long
>> latency. It takes at least one time slice (e.g., 30ms) before the I/O
>> VM is activated again and starts to receive credits.
>>
>> [Possible Solution] Put every inactive VCPU back on the active list
>> before the next credit refill, instead of only the current VCPU.
>
> When we stop accounting, we divide the credits in half, so that when
> it starts out, it should have a reasonable amount of credit (15ms
> worth).  Is this not taking effect for some reason?
>

Actually, for bug 2, dividing the credits in half so that the VCPU
restarts with a reasonable amount of credit is not the issue. The
problem is that the VCPU is removed from the active VCPU list (in
__csched_vcpu_acct_stop_locked) and sometimes is not put back on the
active list in time (as I explained in my first mail). If the VCPU is
not active, csched_acct() will not allocate new credits to it at the
next accounting period. After many such rounds, the VCPU's credit
becomes a small negative number (e.g., -1000) and the VCPU will not be
scheduled. The I/O-intensive applications on it, especially
latency-sensitive workloads, will then suffer long tail latencies.
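
To make the timing issue easier to see, here is a small stand-alone
model I put together. This is not actual Xen code: the structs and
functions below are simplified stand-ins for csched_acct(),
__csched_vcpu_acct_stop_locked() and the per-tick
__csched_vcpu_acct_start() path, and the numbers are made up. It only
illustrates the idea that if the accounting pass re-activates every
inactive VCPU before refilling, a parked I/O VCPU cannot get stuck
outside the active list:

#include <stdio.h>
#include <stdbool.h>

#define NR_VCPUS       2
#define CREDIT_REFILL  300   /* credits handed out per accounting period */
#define CREDIT_CAP     600   /* upper bound that triggers acct-stop */

struct model_vcpu {
    int  id;
    int  credit;
    bool active;   /* on the "active" accounting list? */
};

/* Mirrors the effect of __csched_vcpu_acct_stop_locked(): drop off the
 * active list and halve the remaining credit. */
static void acct_stop(struct model_vcpu *v)
{
    v->active = false;
    v->credit /= 2;
}

/* Stand-in for the per-tick csched_tick() -> __csched_vcpu_acct_start()
 * path: only the VCPU that happens to be current gets re-activated. */
static void tick(struct model_vcpu *current)
{
    if (!current->active)
        current->active = true;
}

/* The proposed change: before refilling, put *every* inactive VCPU back
 * on the active list, not just the one that happens to be current. */
static void reactivate_all(struct model_vcpu *vcpus, int n)
{
    for (int i = 0; i < n; i++)
        vcpus[i].active = true;
}

/* Rough stand-in for csched_acct(): refill active VCPUs only, and park
 * the ones whose credit went over the cap. */
static void acct(struct model_vcpu *vcpus, int n, bool fix_applied)
{
    if (fix_applied)
        reactivate_all(vcpus, n);

    for (int i = 0; i < n; i++) {
        if (!vcpus[i].active)
            continue;                     /* inactive VCPUs get no refill */
        vcpus[i].credit += CREDIT_REFILL;
        if (vcpus[i].credit > CREDIT_CAP)
            acct_stop(&vcpus[i]);
    }
}

int main(void)
{
    /* vcpu 0: mostly idle I/O VCPU that accumulated credit and gets
     * parked; vcpu 1: CPU-bound VCPU that is always current at ticks. */
    struct model_vcpu vcpus[NR_VCPUS] = {
        { .id = 0, .credit = 700, .active = true },
        { .id = 1, .credit = 100, .active = true },
    };
    bool fix_applied = false;             /* flip to true to see the fix */

    for (int round = 0; round < 6; round++) {
        vcpus[0].credit -= 50;            /* I/O VCPU burns a little */
        vcpus[1].credit -= CREDIT_REFILL; /* CPU VCPU burns a full period */

        tick(&vcpus[1]);                  /* only the current VCPU is seen */
        acct(vcpus, NR_VCPUS, fix_applied);

        printf("round %d: vcpu0 credit=%5d active=%d | "
               "vcpu1 credit=%5d active=%d\n",
               round, vcpus[0].credit, vcpus[0].active,
               vcpus[1].credit, vcpus[1].active);
    }
    return 0;
}

Run as-is, vcpu0 never receives another refill once it is parked and
its credit only drifts downward; with fix_applied set to true it keeps
being refilled (it may still bounce on and off the active list when it
hits the cap, which is expected for a VCPU that uses little CPU).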

>  -George



-- 
Tony

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

