
[Xen-devel] long tail latency caused by rate-limit in Xen credit2



Hi all,

While running latency-sensitive applications in VMs on the latest Xen,
I found that the ratelimit causes long tail latency for VMs that
share a CPU with other VMs.


(1) Problem description

------------Description------------
My test environment is as follows: hypervisor: Xen 4.8.1, scheduler:
credit2, Dom0: Linux 4.10, DomU: Linux 4.10.

Environment setup:
We created two VMs, each with 1 vCPU and 4GB of memory, and pinned
them onto one physical CPU core. One VM (denoted the I/O-VM) ran the
Sockperf server program; the other VM (denoted the CPU-VM) ran a
compute-bound task, e.g., SPEC CPU 2006 or simply a spin loop. A
client on another physical machine sent requests to the I/O-VM.

Here is the result when the I/O-VM runs alone:

sockperf: ====> avg-lat= 62.707 (std-dev=1.370)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 62.707 usec
sockperf: Total 998 observations; each percentile contains 9.98 observations
sockperf: ---> <MAX> observation =   71.267
sockperf: ---> percentile 99.999 =   71.267
sockperf: ---> percentile 99.990 =   71.267
sockperf: ---> percentile 99.900 =   70.965
sockperf: ---> percentile 99.000 =   67.707
sockperf: ---> percentile 90.000 =   64.226
sockperf: ---> percentile 75.000 =   63.308
sockperf: ---> percentile 50.000 =   62.476
sockperf: ---> percentile 25.000 =   61.757
sockperf: ---> <MIN> observation =   60.067


Here is the result when the I/O-VM shares the same CPU with the CPU-VM:

sockperf: ====> avg-lat=315.456 (std-dev=155.568)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 315.456 usec
sockperf: Total 998 observations; each percentile contains 9.98 observations
sockperf: ---> <MAX> observation = 1573.300
sockperf: ---> percentile 99.999 = 1573.300
sockperf: ---> percentile 99.990 = 1573.300
sockperf: ---> percentile 99.900 =  586.523
sockperf: ---> percentile 99.000 =  570.727
sockperf: ---> percentile 90.000 =  523.345
sockperf: ---> percentile 75.000 =  447.037
sockperf: ---> percentile 50.000 =  314.435
sockperf: ---> percentile 25.000 =  182.011
sockperf: ---> <MIN> observation =   59.997

---------------------------------------


(2) Problem analysis

------------Analysis----------------
I read the source code of the Xen credit2 scheduler. The vCPU
priorities used in credit1, such as OVER, UNDER, and BOOST, have all
been removed; the vCPUs are simply ordered by their credit. I traced
the vCPU credits and found that the I/O-VM's vCPU always has more
credit than the CPU-VM's, so the I/O-VM's vCPU is always ordered
ahead of the CPU-VM's vCPU in the runqueue.
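
To make that concrete, here is a minimal, self-contained sketch in C
of the idea (illustrative code, not verbatim Xen source; the struct,
this runq_insert() and the credit values are my own stand-ins):
runnable vCPUs sit in a runqueue sorted by credit, so a vCPU with
more credit, like the I/O-VM's, is picked first.

/* Illustrative sketch, NOT verbatim Xen code: credit2 keeps runnable
 * vCPUs in a runqueue ordered by credit, so the vCPU with the most
 * credit is picked first. */
#include <stdio.h>

struct vcpu {
    const char *name;
    int credit;             /* remaining credit; higher runs earlier */
    struct vcpu *next;
};

/* Insert v into the list at *head, keeping decreasing credit order
 * (the same spirit as runq_insert() in sched_credit2.c). */
static void runq_insert(struct vcpu **head, struct vcpu *v)
{
    while ( *head && (*head)->credit >= v->credit )
        head = &(*head)->next;
    v->next = *head;
    *head = v;
}

int main(void)
{
    struct vcpu cpu_vm = { "CPU-VM vCPU", 1000, NULL };
    struct vcpu io_vm  = { "I/O-VM vCPU", 9000, NULL }; /* mostly idle, keeps credit */
    struct vcpu *runq = NULL;

    runq_insert(&runq, &cpu_vm);
    runq_insert(&runq, &io_vm);

    /* The I/O-VM vCPU ends up at the head of the runqueue. */
    for ( struct vcpu *v = runq; v; v = v->next )
        printf("%s (credit %d)\n", v->name, v->credit);
    return 0;
}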

Next, I traced the time gap between a vCPU wakeup and the vCPU
scheduling function. I found that when the I/O-VM runs alone, the gap
is about 3,000 ns; however, when the I/O-VM co-runs with the CPU-VM
on the same core, the gap grows to about 1,000,000 ns, and this
happens on every vCPU scheduling. That reminded me of the ratelimit
in the Xen credit schedulers; the default ratelimit in Xen is 1000us.
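
A minimal sketch of what that means in practice (illustrative only,
not the actual Xen code; may_preempt() and the time bookkeeping here
are my assumptions): even when a waking vCPU has more credit, the
scheduler holds off preemption until the currently running vCPU has
run for at least ratelimit_us, which with the default of 1000us
matches the ~1,000,000 ns gap I measured.

/* Illustrative sketch, NOT verbatim Xen code: how a ratelimit delays
 * preemption of the currently running vCPU.  Times are in nanoseconds. */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define MICROSECS(us)  ((uint64_t)(us) * 1000ULL)

/* Hypothetical helper: may a waking vCPU preempt the vCPU that is
 * currently running on this pCPU? */
static bool may_preempt(uint64_t now_ns, uint64_t curr_start_ns,
                        unsigned int ratelimit_us)
{
    /* Even if the waking vCPU has more credit, preemption waits until
     * the current vCPU has run for at least ratelimit_us. */
    return (now_ns - curr_start_ns) >= MICROSECS(ratelimit_us);
}

int main(void)
{
    uint64_t start = 0;             /* CPU-VM vCPU started running at t=0 */
    uint64_t wake  = MICROSECS(3);  /* I/O-VM vCPU wakes ~3us later */

    /* With the default 1000us ratelimit the wakeup has to wait ~997us;
     * with the ratelimit disabled it could preempt immediately. */
    printf("ratelimit 1000us: preempt now? %d\n", may_preempt(wake, start, 1000));
    printf("ratelimit    0us: preempt now? %d\n", may_preempt(wake, start, 0));
    return 0;
}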

I then changed the ratelimit to 100us from the terminal:
$ sudo /usr/local/sbin/xl sched-credit2 -s -r 100

The average latency dropped from over 300us to just over 200us, and
the tail latency also decreased.

sockperf: ====> avg-lat=215.092 (std-dev=84.781)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 215.092 usec
sockperf: Total 1998 observations; each percentile contains 19.98 observations
sockperf: ---> <MAX> observation = 1295.318
sockperf: ---> percentile 99.999 = 1295.318
sockperf: ---> percentile 99.990 = 1295.318
sockperf: ---> percentile 99.900 =  356.320
sockperf: ---> percentile 99.000 =  345.511
sockperf: ---> percentile 90.000 =  326.780
sockperf: ---> percentile 75.000 =  290.090
sockperf: ---> percentile 50.000 =  210.714
sockperf: ---> percentile 25.000 =  142.875
sockperf: ---> <MIN> observation =   70.533


However, the minimum allowed value of the ratelimit is 100us, which
means a gap still remains between the co-running case and the
running-alone case. (P.S. the valid range of the ratelimit is 100 to
500000us.) To mitigate the latency, users have to run the I/O VMs on
a dedicated core, but that in turn wastes a lot of CPU resources.
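
For reference, here is a sketch of the kind of range check that
rejects anything below 100us (simplified: the constants mirror the
XEN_SYSCTL_SCHED_RATELIMIT_{MIN,MAX} values as far as I can tell, but
check_ratelimit() itself is only a stand-in for the scheduler's
sysctl handling, not the real code).

/* Illustrative sketch of the 100..500000us range check; the constants
 * mirror XEN_SYSCTL_SCHED_RATELIMIT_{MIN,MAX}, the function is a
 * stand-in, not the real sysctl handler. */
#include <stdio.h>
#include <stdbool.h>

#define XEN_SYSCTL_SCHED_RATELIMIT_MIN  100      /* us */
#define XEN_SYSCTL_SCHED_RATELIMIT_MAX  500000   /* us */

static bool check_ratelimit(unsigned int ratelimit_us)
{
    return ratelimit_us >= XEN_SYSCTL_SCHED_RATELIMIT_MIN &&
           ratelimit_us <= XEN_SYSCTL_SCHED_RATELIMIT_MAX;
}

int main(void)
{
    printf("100us accepted? %d\n", check_ratelimit(100)); /* 1 */
    printf("  0us accepted? %d\n", check_ratelimit(0));   /* 0: rejected today */
    return 0;
}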

As an experiment, I modified the Xen source code so that the
ratelimit can be set to 0 (a sketch of the kind of change I made
follows the results below). Here is the result with the ratelimit set
to 0: both the average latency and the tail latency when co-running
with the CPU-VM are in the same magnitude and range as when the
I/O-VM runs alone.


sockperf: ====> avg-lat= 71.766 (std-dev=1.618)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 71.766 usec
sockperf: Total 1999 observations; each percentile contains 19.99 observations
sockperf: ---> <MAX> observation =  99.257
sockperf: ---> percentile 99.999 =  99.257
sockperf: ---> percentile 99.990 =  99.257
sockperf: ---> percentile 99.900 =   84.155
sockperf: ---> percentile 99.000 =   78.873
sockperf: ---> percentile 90.000 =   73.920
sockperf: ---> percentile 75.000 =   72.546
sockperf: ---> percentile 50.000 =   71.458
sockperf: ---> percentile 25.000 =   70.518
sockperf: ---> <MIN> observation =   63.150
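
The modification mentioned above is essentially a relaxation of that
lower bound so that 0 is accepted and treated as "rate limiting
disabled". A minimal sketch of the idea (illustrative names only, not
the actual patch):

/* Illustrative sketch of the relaxed check: 0 now means "no rate
 * limiting"; any non-zero value must still be in 100..500000us. */
#include <stdio.h>
#include <stdbool.h>

#define XEN_SYSCTL_SCHED_RATELIMIT_MIN  100      /* us */
#define XEN_SYSCTL_SCHED_RATELIMIT_MAX  500000   /* us */

static bool check_ratelimit_relaxed(unsigned int ratelimit_us)
{
    if ( ratelimit_us == 0 )        /* 0 disables the ratelimit entirely */
        return true;
    return ratelimit_us >= XEN_SYSCTL_SCHED_RATELIMIT_MIN &&
           ratelimit_us <= XEN_SYSCTL_SCHED_RATELIMIT_MAX;
}

int main(void)
{
    printf("  0us accepted now? %d\n", check_ratelimit_relaxed(0));  /* 1 */
    printf(" 50us accepted now? %d\n", check_ratelimit_relaxed(50)); /* 0: still rejected */
    return 0;
}

With the ratelimit at 0, the preemption check sketched earlier always
passes, so a waking I/O vCPU with more credit preempts the CPU-VM
immediately, which is consistent with the numbers above.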


A similar problem can also be observed with the credit1 scheduler.

Thanks.
---------------------------------------


-- 

**********************************
> Tony Suo
> Computer Science, University of Texas at Arlington
**********************************

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

