
Re: [Xen-devel] large packet support in netfront driver and guest network throughput


  • To: Wei Liu <wei.liu2@xxxxxxxxxx>
  • From: Anirban Chakraborty <abchak@xxxxxxxxxxx>
  • Date: Fri, 13 Sep 2013 17:09:48 +0000
  • Accept-language: en-US
  • Cc: "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>
  • Delivery-date: Fri, 13 Sep 2013 17:10:05 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: AQHOr+Dtab7v4C01XU6hPj6YULek5ZnDjVUAgABbAoA=
  • Thread-topic: [Xen-devel] large packet support in netfront driver and guest network throughput

On Sep 13, 2013, at 4:44 AM, Wei Liu <wei.liu2@xxxxxxxxxx> wrote:

> On Thu, Sep 12, 2013 at 05:53:02PM +0000, Anirban Chakraborty wrote:
>> Hi All,
>> 
>> I am sure this has been answered somewhere on the list in the past, but I
>> can't find it. I was wondering if the Linux guest netfront driver has GRO
>> support in it. tcpdump shows packets coming in at 1500 bytes, although eth0
>> in dom0 and the vif corresponding to the Linux guest in dom0 show that they
>> receive large packets:
>> 
>> In dom0:
>> eth0      Link encap:Ethernet  HWaddr 90:E2:BA:3A:B1:A4  
>>          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>> tcpdump -i eth0 -nnvv -s 1500 src 10.84.20.214
>> 17:38:25.155373 IP (tos 0x0, ttl 64, id 54607, offset 0, flags [DF], proto 
>> TCP (6), length 29012)
>>    10.84.20.214.51041 > 10.84.20.213.5001: Flags [.], seq 276592:305552, ack 
>> 1, win 229, options [nop,nop,TS val 65594025 ecr 65569225], length 28960
>> 
>> vif4.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF  
>>          UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>> tcpdump -i vif4.0 -nnvv -s 1500 src 10.84.20.214
>> 17:38:25.156364 IP (tos 0x0, ttl 64, id 54607, offset 0, flags [DF], proto 
>> TCP (6), length 29012)
>>    10.84.20.214.51041 > 10.84.20.213.5001: Flags [.], seq 276592:305552, ack 
>> 1, win 229, options [nop,nop,TS val 65594025 ecr 65569225], length 28960
>> 
>> 
>> In the guest:
>> eth0      Link encap:Ethernet  HWaddr CA:FD:DE:AB:E1:E4  
>>          inet addr:10.84.20.213  Bcast:10.84.20.255  Mask:255.255.255.0
>>          inet6 addr: fe80::c8fd:deff:feab:e1e4/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>> tcpdump -i eth0 -nnvv -s 1500 src 10.84.20.214
>> 10:38:25.071418 IP (tos 0x0, ttl 64, id 15074, offset 0, flags [DF], proto 
>> TCP (6), length 1500)
>>    10.84.20.214.51040 > 10.84.20.213.5001: Flags [.], seq 17400:18848, ack 
>> 1, win 229, options [nop,nop,TS val 65594013 ecr 65569213], length 1448
>> 
>> Is the packet segmented to MTU size on its way from netback to netfront?
>> Is GRO not supported in the guest?
> 
> Here is what I see in the guest, with the iperf server running in the guest
> and the iperf client running in Dom0. tcpdump was run with the rune you
> provided.
> 
> 10.80.238.213.38895 > 10.80.239.197.5001: Flags [.], seq
> 5806480:5818064, ack 1, win 229, options [nop,nop,TS val 21968973 ecr
> 21832969], length 11584
> 
> This is an upstream kernel. The throughput from Dom0 to DomU is ~7.2Gb/s.

Thanks for your reply. The tcpdump was captured on the guest's dom0 [at both
the vif and the physical interface], i.e. on the receive path of the server.
The iperf server was running in the guest (10.84.20.213) and the client was
another guest (on a different server) with IP 10.84.20.214. The traffic was
between two guests, not between dom0 and the guest.

> 
>> 
>> I am seeing extremely low throughput on a 10Gb/s link. Two Linux guests
>> (CentOS 6.4 64-bit, 4 VCPUs and 4GB of memory) are running on two different
>> XenServer 6.1 hosts, and an iperf session between them shows at most 3.2 Gbps.
> 
> XenServer might use a different Dom0 kernel with its own tuning. You could
> also try contacting XenServer support for a better idea.
> 

XenServer 6.1 is running a 2.6.32.43 kernel. Since the issue appears, from the
tcpdump, to be in the netfront driver, I thought I would post it here. Note
that the checksum offloads of the interfaces (virtual and physical) were not
touched at all; the default setting (on) was used.
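
For reference, the offload state of each hop can be checked with ethtool. This
is only a sketch using the interface names from this thread (eth0, vif4.0),
which may differ on other hosts:

# In dom0, on the physical NIC and on the guest's vif:
ethtool -k eth0   | grep -E 'checksum|segmentation-offload'
ethtool -k vif4.0 | grep -E 'checksum|segmentation-offload'

# An individual offload can be toggled explicitly if needed, e.g. TX checksum:
# ethtool -K vif4.0 tx on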

> In general, off-host communication can be affected by various things. It
> would be quite useful to identify the bottleneck first.
> 
> Try to run:
> 1. Dom0 to Dom0 iperf (or your workload)
> 2. Dom0 to DomU iperf
> 3. DomU to Dom0 iperf
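
A minimal sketch of those three tests with plain iperf (the duration, stream
count, and addresses below are illustrative, not taken from this thread):

# On the receiving side (Dom0 or the guest, depending on the test):
iperf -s

# On the sending side, pointed at the receiver's IP:
iperf -c 10.84.20.213 -t 30          # single stream
iperf -c 10.84.20.213 -t 30 -P 4     # four parallel streams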

I tried dom0 to dom0 and got 9.4 Gbps, which is what I expected (with GRO
enabled on the physical interface). However, when I run guest to guest,
throughput falls off. Are large packets not supported in netfront? I thought
otherwise. I looked at the code and I do not see any call to
napi_gro_receive(); rather, it is using netif_receive_skb(). netback seems to
be sending GSO packets to netfront, but they are being segmented to 1500 bytes
(as it appears from the tcpdump).
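
Two quick checks line up with that observation (a sketch; the source path
assumes the guest's kernel source tree is at hand):

# Inside the guest: is GRO advertised/enabled on the netfront device at all?
ethtool -k eth0 | grep generic-receive-offload

# In the guest kernel's source: which receive call does netfront make?
# GRO aggregation only happens when the driver hands packets to
# napi_gro_receive(); netif_receive_skb() delivers them without aggregation.
grep -nE 'napi_gro_receive|netif_receive_skb' drivers/net/xen-netfront.c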

> 
> In order to get line rate, you need to at least get line rate from Dom0
> to Dom0 IMHO. 10Gb/s line rate from guest to guest has not been
> achieved yet…

What is the current number, without VCPU pinning etc., for a 1500-byte MTU? I
am getting 2.2-3.2 Gbps for a 4-VCPU guest with 4GB of memory. It is the only
VM running on that server, with no other traffic.

-Anirban



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

