Re: [Xen-devel] [BUG] Bugs existing in Xen's credit scheduler cause long tail latency issues
On Sun, May 15, 2016 at 5:11 AM, Tony S <suokunstar@xxxxxxxxx> wrote:
> Hi all,
>
> When I was running latency-sensitive applications in VMs on Xen, I
> found some bugs in the credit scheduler which cause long tail
> latencies in I/O-intensive VMs.
>
>
> (1) Problem description
>
> ------------Description------------
> My test environment is as follows: Hypervisor (Xen 4.5.0), Dom 0
> (Linux 3.18.21), Dom U (Linux 3.18.21).
>
> Environment setup:
> We created two 1-vCPU, 4GB-memory VMs and pinned them onto one
> physical CPU core. One VM (denoted as I/O-VM) ran the Sockperf server
> program; the other (denoted as CPU-VM) ran a compute-bound task,
> e.g., SPEC CPU2006 or simply a loop. A client on another physical
> machine sent UDP requests to the I/O-VM.
>
> Here are my tail latency results (in microseconds):
>
> Case    Avg     90%     99%     99.9%   99.99%
> #1      108     114     128     129     130
> #2      7811    13892   14874   15315   16383
> #3      943     131     21755   26453   26553
> #4      116     96      105     8217    13472
> #5      116     117     129     131     132
>
> Bugs 1, 2, and 3 are discussed below.
>
> Case #1:
> I/O-VM was processing Sockperf requests from clients; CPU-VM was
> idling (no processes running).
>
> Case #2:
> I/O-VM was processing Sockperf requests from clients; CPU-VM was
> running a compute-bound task.
> Hypervisor: native Xen 4.5.0.
>
> Case #3:
> Same workloads as case #2; hypervisor is Xen 4.5.0 with bug 1 fixed.
>
> Case #4:
> Same workloads as case #2; hypervisor is Xen 4.5.0 with bugs 1 & 2
> fixed.
>
> Case #5:
> Same workloads as case #2; hypervisor is Xen 4.5.0 with bugs 1, 2 & 3
> fixed.
>
> ---------------------------------------
>
>
> (2) Problem analysis

Hey Tony,

Thanks for looking at this. These issues in the credit1 algorithm are
essentially exactly the reason that I started work on the credit2
scheduler several years ago. We meant credit2 to have replaced credit1
by now, but we ran out of time to test it properly; we're in the
process of doing that right now, and are hoping it will be the default
scheduler for the 4.8 release.

So let me make two suggestions that would make your effort more
helpful to us:

1. Use cpupools for testing rather than pinning. A lot of the
algorithms are designed on the assumption that they have all of the
CPUs to run on, and the credit allocation / priority calculations stop
working properly when vCPUs are merely pinned to a subset of them.
Cpupools were designed specifically to let a scheduler work as
designed on a smaller number of CPUs than the system has; see the
example below.

2. Test credit2. :-)
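For (1), the rough shape is something like this (going from memory,
and the pool and domain names here are made up; check the
xlcpupool.cfg manpage for the real syntax):

    # Free one physical CPU from the default pool so the new pool can use it
    xl cpupool-cpu-remove Pool-0 4

    # test-pool.cfg contains:
    #   name  = "test-pool"
    #   sched = "credit"
    #   cpus  = ["4"]
    xl cpupool-create test-pool.cfg

    # Move both test VMs into the new pool
    xl cpupool-migrate io-vm test-pool
    xl cpupool-migrate cpu-vm test-pool

That gives the credit scheduler an honest 1-CPU "system" to manage,
rather than lying to it about where its vCPUs are allowed to run.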
One comment about your analysis here:

> [Bug 2]: In csched_acct() (run by default every 30ms), a VCPU stops
> earning credits and is removed from the active VCPU list (in
> __csched_vcpu_acct_stop_locked) if its credit is larger than the
> upper bound. Because the domain has only one VCPU, the domain will
> also be removed from the active domain list.
>
> Every 10ms, csched_tick() --> csched_vcpu_acct() -->
> __csched_vcpu_acct_start() is executed and tries to put inactive
> VCPUs back on the active list. However, __csched_vcpu_acct_start()
> only puts the *current* VCPU back on the active list. If an
> I/O-bound VCPU is not the current VCPU at the csched_tick(), it will
> not be put back on the active VCPU list. If so, the I/O-bound VCPU
> will likely miss the next credit refill in csched_acct() and can
> easily enter the OVER state. As a result, the I/O-bound VM cannot be
> boosted and suffers very long latency. It takes at least one time
> slice (e.g., 30ms) before the I/O VM is activated and starts
> receiving credits again.
>
> [Possible Solution] Try to put all inactive VCPUs back on the active
> list before the next credit refill, instead of just the current VCPU.
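If I follow, the proposal is to have csched_acct() do something like
the following before the refill. (This is only a sketch to make sure
we're talking about the same thing: prv->inactive_vcpu and
inactive_vcpu_elem are made-up names, since credit1 currently only
keeps a list of *active* vCPUs, so a real patch would need some cheap
way of finding the inactive ones.)

    /* Hypothetical: reactivate every inactive vCPU, not just current. */
    static void csched_acct_start_inactive(struct csched_private *prv)
    {
        struct csched_vcpu *svc, *tmp;

        /* prv->inactive_vcpu does not exist in sched_credit.c today. */
        list_for_each_entry_safe( svc, tmp, &prv->inactive_vcpu,
                                  inactive_vcpu_elem )
        {
            /* The same path csched_vcpu_acct() takes for current. */
            __csched_vcpu_acct_start(prv, svc);
        }
    }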
When we stop accounting, we divide the vCPU's credits in half, so that
when it starts out again it should already have a reasonable amount of
credit (15ms' worth).
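That happens at the same point the vCPU is taken off the active list;
roughly this, in csched_acct() (quoting from memory, so check
sched_credit.c rather than taking my word for it):

    /* Upper bound on credits means the VCPU stops earning. */
    if ( credit > prv->credits_per_tslice )
    {
        __csched_vcpu_acct_stop_locked(prv, svc);

        /* Divide credits in half, so that when it starts
         * accounting again, it starts a little bit "ahead". */
        credit /= 2;
        atomic_sub(credit, &svc->credit);
    }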
Is this not taking effect for some reason?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel