
[Xen-devel] long tail latency caused by rate-limit in Xen credit2



Hi all,

While running latency-sensitive applications in VMs on the latest Xen,
I found that the ratelimit causes long tail latency for VMs that
share a CPU with other VMs.


(1) Problem description

------------Description------------
My test environment is as follows: hypervisor: Xen 4.8.1, scheduler:
credit2, Dom0: Linux 4.10, DomU: Linux 4.10.

Environment setup:
We created two VMs, each with 1 vCPU and 4GB of memory, and pinned
them onto one physical CPU core. One VM (denoted the I/O-VM) ran the
Sockperf server program; the other VM (denoted the CPU-VM) ran a
compute-bound task, e.g., SPEC CPU 2006 or simply a spin loop. A
client on another physical machine sent requests to the I/O-VM.

Here is the result when the I/O-VM runs alone:

sockperf: ====> avg-lat= 62.707 (std-dev=1.370)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 62.707 usec
sockperf: Total 998 observations; each percentile contains 9.98 observations
sockperf: ---> <MAX> observation =   71.267
sockperf: ---> percentile 99.999 =   71.267
sockperf: ---> percentile 99.990 =   71.267
sockperf: ---> percentile 99.900 =   70.965
sockperf: ---> percentile 99.000 =   67.707
sockperf: ---> percentile 90.000 =   64.226
sockperf: ---> percentile 75.000 =   63.308
sockperf: ---> percentile 50.000 =   62.476
sockperf: ---> percentile 25.000 =   61.757
sockperf: ---> <MIN> observation =   60.067


Here is the result when the I/O-VM shares the same CPU with the CPU-VM:

sockperf: ====> avg-lat=315.456 (std-dev=155.568)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 315.456 usec
sockperf: Total 998 observations; each percentile contains 9.98 observations
sockperf: ---> <MAX> observation = 1573.300
sockperf: ---> percentile 99.999 = 1573.300
sockperf: ---> percentile 99.990 = 1573.300
sockperf: ---> percentile 99.900 =  586.523
sockperf: ---> percentile 99.000 =  570.727
sockperf: ---> percentile 90.000 =  523.345
sockperf: ---> percentile 75.000 =  447.037
sockperf: ---> percentile 50.000 =  314.435
sockperf: ---> percentile 25.000 =  182.011
sockperf: ---> <MIN> observation =   59.997

---------------------------------------


(2) Problem analysis

------------Analysis----------------
I read the source code of the Xen credit2 scheduler. The vCPU
priorities used in credit1, such as OVER, UNDER, and BOOST, have all
been removed; the vCPUs are simply ordered by their credit. I traced
the vCPU credits and found that the I/O-VM's vCPU always has more
credit than the CPU-VM's, so the I/O-VM's vCPU is always ordered
ahead of the CPU-VM's vCPU in the runqueue.
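
To make that concrete, here is a minimal, self-contained sketch in C
of the idea (illustrative code, not verbatim Xen source; the struct,
this runq_insert() and the credit values are my own stand-ins):
runnable vCPUs sit in a runqueue sorted by credit, so a vCPU with
more credit, like the I/O-VM's, is picked first.

/* Illustrative sketch, NOT verbatim Xen code: credit2 keeps runnable
 * vCPUs in a runqueue ordered by credit, so the vCPU with the most
 * credit is picked first. */
#include <stdio.h>

struct vcpu {
    const char *name;
    int credit;             /* remaining credit; higher runs earlier */
    struct vcpu *next;
};

/* Insert v into the list at *head, keeping decreasing credit order
 * (the same spirit as runq_insert() in sched_credit2.c). */
static void runq_insert(struct vcpu **head, struct vcpu *v)
{
    while ( *head && (*head)->credit >= v->credit )
        head = &(*head)->next;
    v->next = *head;
    *head = v;
}

int main(void)
{
    struct vcpu cpu_vm = { "CPU-VM vCPU", 1000, NULL };
    struct vcpu io_vm  = { "I/O-VM vCPU", 9000, NULL }; /* mostly idle, keeps credit */
    struct vcpu *runq = NULL;

    runq_insert(&runq, &cpu_vm);
    runq_insert(&runq, &io_vm);

    /* The I/O-VM vCPU ends up at the head of the runqueue. */
    for ( struct vcpu *v = runq; v; v = v->next )
        printf("%s (credit %d)\n", v->name, v->credit);
    return 0;
}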

Next, I traced the time gap between a vCPU wakeup and the vCPU
scheduling function. I found that when the I/O-VM runs alone, the gap
is about 3,000 ns; however, when the I/O-VM co-runs with the CPU-VM
on the same core, the gap grows to about 1,000,000 ns, and this
happens on every vCPU scheduling. That reminded me of the ratelimit
in the Xen credit schedulers; the default ratelimit in Xen is 1000us.
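
A minimal sketch of what that means in practice (illustrative only,
not the actual Xen code; may_preempt() and the time bookkeeping here
are my assumptions): even when a waking vCPU has more credit, the
scheduler holds off preemption until the currently running vCPU has
run for at least ratelimit_us, which with the default of 1000us
matches the ~1,000,000 ns gap I measured.

/* Illustrative sketch, NOT verbatim Xen code: how a ratelimit delays
 * preemption of the currently running vCPU.  Times are in nanoseconds. */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define MICROSECS(us)  ((uint64_t)(us) * 1000ULL)

/* Hypothetical helper: may a waking vCPU preempt the vCPU that is
 * currently running on this pCPU? */
static bool may_preempt(uint64_t now_ns, uint64_t curr_start_ns,
                        unsigned int ratelimit_us)
{
    /* Even if the waking vCPU has more credit, preemption waits until
     * the current vCPU has run for at least ratelimit_us. */
    return (now_ns - curr_start_ns) >= MICROSECS(ratelimit_us);
}

int main(void)
{
    uint64_t start = 0;             /* CPU-VM vCPU started running at t=0 */
    uint64_t wake  = MICROSECS(3);  /* I/O-VM vCPU wakes ~3us later */

    /* With the default 1000us ratelimit the wakeup has to wait ~997us;
     * with the ratelimit disabled it could preempt immediately. */
    printf("ratelimit 1000us: preempt now? %d\n", may_preempt(wake, start, 1000));
    printf("ratelimit    0us: preempt now? %d\n", may_preempt(wake, start, 0));
    return 0;
}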

I then changed the ratelimit to 100us from the terminal:
$ sudo /usr/local/sbin/xl sched-credit2 -s -r 100

The average latency dropped from over 300us to just over 200us, and
the tail latency also decreased.

sockperf: ====> avg-lat=215.092 (std-dev=84.781)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 215.092 usec
sockperf: Total 1998 observations; each percentile contains 19.98 observations
sockperf: ---> <MAX> observation = 1295.318
sockperf: ---> percentile 99.999 = 1295.318
sockperf: ---> percentile 99.990 = 1295.318
sockperf: ---> percentile 99.900 =  356.320
sockperf: ---> percentile 99.000 =  345.511
sockperf: ---> percentile 90.000 =  326.780
sockperf: ---> percentile 75.000 =  290.090
sockperf: ---> percentile 50.000 =  210.714
sockperf: ---> percentile 25.000 =  142.875
sockperf: ---> <MIN> observation =   70.533


However, the minimum allowed value of the ratelimit is 100us, which
means a gap still remains between the co-running case and the
running-alone case. (P.S. the valid range of the ratelimit is 100 to
500000us.) To mitigate the latency, users have to run the I/O VMs on
a dedicated core, but that in turn wastes a lot of CPU resources.
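
For reference, here is a sketch of the kind of range check that
rejects anything below 100us (simplified: the constants mirror the
XEN_SYSCTL_SCHED_RATELIMIT_{MIN,MAX} values as far as I can tell, but
check_ratelimit() itself is only a stand-in for the scheduler's
sysctl handling, not the real code).

/* Illustrative sketch of the 100..500000us range check; the constants
 * mirror XEN_SYSCTL_SCHED_RATELIMIT_{MIN,MAX}, the function is a
 * stand-in, not the real sysctl handler. */
#include <stdio.h>
#include <stdbool.h>

#define XEN_SYSCTL_SCHED_RATELIMIT_MIN  100      /* us */
#define XEN_SYSCTL_SCHED_RATELIMIT_MAX  500000   /* us */

static bool check_ratelimit(unsigned int ratelimit_us)
{
    return ratelimit_us >= XEN_SYSCTL_SCHED_RATELIMIT_MIN &&
           ratelimit_us <= XEN_SYSCTL_SCHED_RATELIMIT_MAX;
}

int main(void)
{
    printf("100us accepted? %d\n", check_ratelimit(100)); /* 1 */
    printf("  0us accepted? %d\n", check_ratelimit(0));   /* 0: rejected today */
    return 0;
}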

As an experiment, I modified the Xen source code so that the
ratelimit can be set to 0 (a sketch of the kind of change I made
follows the results below). Here is the result with the ratelimit set
to 0: both the average latency and the tail latency when co-running
with the CPU-VM are in the same magnitude and range as when the
I/O-VM runs alone.


sockperf: ====> avg-lat= 71.766 (std-dev=1.618)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 71.766 usec
sockperf: Total 1999 observations; each percentile contains 19.99 observations
sockperf: ---> <MAX> observation =  99.257
sockperf: ---> percentile 99.999 =  99.257
sockperf: ---> percentile 99.990 =  99.257
sockperf: ---> percentile 99.900 =   84.155
sockperf: ---> percentile 99.000 =   78.873
sockperf: ---> percentile 90.000 =   73.920
sockperf: ---> percentile 75.000 =   72.546
sockperf: ---> percentile 50.000 =   71.458
sockperf: ---> percentile 25.000 =   70.518
sockperf: ---> <MIN> observation =   63.150
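
The modification mentioned above is essentially a relaxation of that
lower bound so that 0 is accepted and treated as "rate limiting
disabled". A minimal sketch of the idea (illustrative names only, not
the actual patch):

/* Illustrative sketch of the relaxed check: 0 now means "no rate
 * limiting"; any non-zero value must still be in 100..500000us. */
#include <stdio.h>
#include <stdbool.h>

#define XEN_SYSCTL_SCHED_RATELIMIT_MIN  100      /* us */
#define XEN_SYSCTL_SCHED_RATELIMIT_MAX  500000   /* us */

static bool check_ratelimit_relaxed(unsigned int ratelimit_us)
{
    if ( ratelimit_us == 0 )        /* 0 disables the ratelimit entirely */
        return true;
    return ratelimit_us >= XEN_SYSCTL_SCHED_RATELIMIT_MIN &&
           ratelimit_us <= XEN_SYSCTL_SCHED_RATELIMIT_MAX;
}

int main(void)
{
    printf("  0us accepted now? %d\n", check_ratelimit_relaxed(0));  /* 1 */
    printf(" 50us accepted now? %d\n", check_ratelimit_relaxed(50)); /* 0: still rejected */
    return 0;
}

With the ratelimit at 0, the preemption check sketched earlier always
passes, so a waking I/O vCPU with more credit preempts the CPU-VM
immediately, which is consistent with the numbers above.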


A similar problem can also be observed with the credit1 scheduler.

Thanks.
---------------------------------------


-- 

**********************************
> Tony Suo
> Computer Science, University of Texas at Arlington
**********************************

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

