Mirage networking benchmarks and TCP hang
Hi everyone,

I've been using Mirage under Xen for the past few days and I came across an intermittent issue which might have to do with the TCP stack implementation. I've created an iperf client and an iperf server based on the sources I found here: https://github.com/mirage/mirage-net/tree/master/direct/tests/iperf_self

I've decoupled the client from the server, and minimized I/O and per-packet processing between the "Net.Flow.write" and "Net.Flow.read" calls, in order to saturate the network stack's throughput (a rough sketch of the loop structure is included after the configuration details below). The sources of the iperf server/client can be found here (I am using mirari): http://www.cl.cam.ac.uk/~dp463/files/iperf.tar.gz

I have also written custom fail-safe scripts which automatically create, configure, install and uninstall Mirage Xen kernels (for XenServer XAPI), minimizing the time required to experiment with different kernels (PM me if you would like those as well).

The FIRST problem is that the iperf benchmark usually stalls and never completes, especially when I run the Mirage iperf client as a Xen guest (see EXPERIMENTS A, B). When the iperf client is compiled against unix, the benchmark manages to complete (see EXPERIMENTS C, D), but the speed is not impressive.

The SECOND problem/observation is that the data rates I obtain with Mirage are quite low, especially when compared to Linux guests (nearly an order of magnitude difference). It might be the case that the Mirage network stack requires some manual tuning, or that my code is inefficient; if so, feel free to send me your comments. The TCP settings of all Linux guests (including dom0) are here: http://www.cl.cam.ac.uk/~dp463/files/linux.tcp.settings.txt

Also, does anyone know which congestion control algorithm is implemented in the Mirage TCP stack? Ubuntu and Debian Linux use CUBIC by default.

At the bottom of this email you can find the details of my setup and the benchmark methodology, along with benchmark results for various configurations.

Thank you,
Dimos

===============================
*** Configuration Setup ***
===============================

*** System Details ***
- Machine specs: i7 3770, 4 cores (8 threads), 32GB DDR3 1333 RAM

*** OCaml / Mirage ***
- Fetched the latest versions of mirage, mirage-fs, mirage-net, mirage-net-direct, mirage-unix and mirari (all 1.0.0) from "opam-repo-dev"
- Mirage iperf settings:
  Total data volume sent during a benchmark: 8 gigabits (kilo is 1000, in other words 1,000,000,000 bytes, ~640K packets)
  Packet size: 1461 bytes
  Default TCP settings

*** Xen / Linux ***
- Xen Hypervisor 4.1 (Ubuntu 12.10, XCP, openvswitch, Xen API v1.3)
- Xen vCPU allocations:
  Dom0: pinned on cores 0 and 1 (VCPUs-params:mask=0,2)
  Dom1: pinned on core 2 (VCPUs-params:mask=4)
  Dom2: pinned on core 3 (VCPUs-params:mask=6)
- All Linux programs (e.g. mirari run --unix, iperf) run with "nice -n -15"

*** Networking configuration ***
- Using OpenVSwitch v1.4.3 for bridging
- Default Linux TCP settings for Linux running on DomUs and Dom0 (see here: http://www.cl.cam.ac.uk/~dp463/files/linux.tcp.settings.txt)
- No jumbo frames for Xen: xe network-param-get uuid=a0378bc6-a9e4-9054-50ff-59d5ab8d2ee8 param-name=MTU --> 1500
- No jumbo frames for Linux: all interfaces are set to a 1500-byte MTU
- For the Linux2Linux iperf experiments that run only on Dom0, the client and the server are forced to push traffic through xenbr0 (MTU 1500), not via loopback
- For the Mirage instances that are compiled for the unix target, mirari creates a tap interface which I attach to bridge xenbr1. When I run 2 unix instances of Mirage, I force them to communicate over 2 distinct tap interfaces attached to the same bridge (xenbr1). I had to modify mirari a bit, as it had "tap0" and "10.0.0.1" hard-coded.
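For reference, the client and server boil down to tight loops around the two calls mentioned above. The following is only a sketch of that structure, assuming "Net.Flow.write : flow -> Cstruct.t -> unit Lwt.t" and "Net.Flow.read : flow -> Cstruct.t option Lwt.t" as in the iperf_self test; connection setup and buffer handling are omitted, and the function names are illustrative rather than the exact code in the tarball above.

    open Lwt

    (* Client side: push a fixed volume of data as fast as possible,
       with no per-packet processing between writes. *)
    let rec send flow buf remaining =
      if remaining <= 0 then return ()
      else
        Net.Flow.write flow buf >>= fun () ->
        send flow buf (remaining - Cstruct.len buf)

    (* Server side: drain the flow and count the received bytes,
       doing no work per packet beyond the byte count. *)
    let rec drain flow total =
      Net.Flow.read flow >>= function
      | None     -> return total          (* connection closed *)
      | Some buf -> drain flow (total + Cstruct.len buf)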
===============================
*** Mirage2Mirage benchmarks ***
===============================

------------------------------------------------------
Experiment A:
Xen.DomU( Mirage xen Iperf.Server )
Xen.DomU( Mirage xen Iperf.Client )
------------------------------------------------------
Most of the time (around 7 out of 10 runs) the experiment hangs, and it seems to me that the network stack is block-waiting. When the experiment hangs, the vCPU utilisation is ~0% for the DomUs and well over 100% for Dom0. I also get a mixture of error messages, which I have attached right below the next paragraph.

When the benchmark manages to complete, I get huge variability in the throughput. Note that Dom0 and the DomUs are pinned to exclusive vCPUs / physical cores, so the variability in performance cannot be due to noise from other processes. Pushing 8 gigabits across the VMs takes anywhere from 6 seconds (~1.35 Gbps) to 20 seconds (~0.4 Gbps).

Server:
Iperf server: Received connection from 192.168.8.102:18836.
RX error 0
RX error 0
RX error 0
RX error 0
RX error 0

Client:
Iperf client: Made connection to server.
TCP retransmission on timer seq = 1435231548
TCP retransmission on timer seq = 1435231548
TCP retransmission on timer seq = 1435231548

Client:
RX: ack (id = 1523) wakener not found valid ids = [ ]
RX: ack (id = 655) wakener not found valid ids = [ ]
RX: ack (id = 1975) wakener not found valid ids = [ ]

------------------------------------------------------
Experiment B:
Xen.Dom0( Mirage unix Iperf.Server )
Xen.DomU( Mirage xen Iperf.Client )
------------------------------------------------------
Most of the time the benchmark hangs, with the message "TCP retransmission on timer seq = ...." shown at the Client, at the Server, or at both. When the benchmark completes, 8 gigabits are transferred in about 8-9 seconds (0.9-1 Gbps).
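As an aside on the hangs in experiments A and B: one way to surface the stall instead of block-waiting indefinitely is to race each write against a timer. This is only a sketch, assuming "OS.Time.sleep : float -> unit Lwt.t" on the Xen backend and the same Net.Flow.write signature as above; it is not taken from the iperf sources, and the 5-second threshold is arbitrary.

    open Lwt

    (* Race the write against a timer: if the timer fires first, the
       flow is most likely stuck in the retransmission state shown
       in the logs above. *)
    let write_or_timeout flow buf =
      Lwt.pick [
        (Net.Flow.write flow buf >>= fun () -> return `Ok);
        (OS.Time.sleep 5.0 >>= fun () -> return `Timeout);
      ]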
------------------------------------------------------
Experiment C:
Xen.DomU( Mirage xen Iperf.Server )
Xen.Dom0( Mirage unix Iperf.Client )
------------------------------------------------------
The benchmark always completes without any issues. The client pushes 8 gigabits to the server in around 10.2 seconds (~0.8 Gbps, quite consistent).

------------------------------------------------------
Experiment D:
Xen.Dom0( Mirage unix Iperf.Server )
Xen.Dom0( Mirage unix Iperf.Client )
------------------------------------------------------
The benchmark always completes without hanging or errors. It takes 8 to 10 seconds to push 8 gigabits from the client to the server (0.8-1 Gbps, quite consistent).

===============================
*** Linux2Linux benchmarks ***
===============================

------------------------------------------------------
Experiment E:
Xen.DomU( Debian XenGuest Iperf.Server )
Xen.DomU( Debian XenGuest Iperf.Client )
------------------------------------------------------
Reaches around 9-10 Gbps, with vCPU utilisations of ~45% for the client, ~60% for the server and ~90% for Dom0. Dom0 is probably the bottleneck (memory bandwidth, processing horsepower, or both).

------------------------------------------------------
Experiment F:
Xen.Dom0( Ubuntu Iperf.Server )
Xen.DomU( Debian XenGuest Iperf.Client )
------------------------------------------------------
Reaches around 20 Gbps, with vCPU utilisations of ~70% for the DomU (client) and ~80-85% for Dom0 (server).

------------------------------------------------------
Experiment G:
Xen.DomU( Debian XenGuest Iperf.Server )
Xen.Dom0( Ubuntu Iperf.Client )
------------------------------------------------------
Reaches around 22 Gbps, with vCPU utilisations of ~90% for the DomU (server) and ~90% for Dom0 (client).

------------------------------------------------------
Experiment H:
Xen.Dom0( Ubuntu Iperf.Server )
Xen.Dom0( Ubuntu Iperf.Client )
------------------------------------------------------
Reaches around 24.5 Gbps, with vCPU utilisation of ~95% for Dom0 (both server and client).