
Re: [MirageOS-devel] mirage-tcpip and jumbo frames

On 01/19/2017 04:14 AM, Anil Madhavapeddy wrote:

On 19 Jan 2017, at 10:00, David Scott <scott.dj@xxxxxxxxx> wrote:

I'm trying to increase the performance of a program which uses the mirage-tcpip 
stack (specifically vpnkit[1] running on Windows). I noticed the total CPU 
overhead in `top` was higher than I expected so I attempted to reduce the 
overhead per byte by enabling jumbo frames. I bumped the MTU of the ethernet 
link; however, this was not enough -- mirage-tcpip was still sending frames of 
size ~1500 bytes. I tracked the problem down to the 
 -- when I manually bumped this and recompiled, I got larger frames and my TCP 
throughput increased from 500Mbit/sec to 600Mbit/sec (there are other overheads 
that also need addressing).

So my question is: how should this be done properly? Should the TCP layer 
query the maximum IP datagram size (derived from the underlying ethernet MTU)? 
Or is something more complicated needed?

That sounds right -- one missing feature is that we don't have Path MTU 
discovery in the stack, and so can only select on the basis of the immediate 
MTU (which may be larger than some intermediate hop, causing fragmentation on 
the wire).
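
As a rough illustration of deriving the TCP segment size from the immediate link MTU, here is a sketch in Python. It is not mirage-tcpip code: the function name `tcp_mss` is made up for this example, and the header sizes assume IPv4 and TCP with no options.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def tcp_mss(link_mtu):
    """Largest TCP payload that fits in one unfragmented IPv4 datagram."""
    mss = link_mtu - IPV4_HEADER - TCP_HEADER
    if mss <= 0:
        raise ValueError("MTU %d leaves no room for TCP payload" % link_mtu)
    return mss


print(tcp_mss(1500))  # 1460, the usual segment size on standard ethernet
print(tcp_mss(9000))  # 8960, with a 9000-byte jumbo-frame MTU
```

This only considers the first hop; without Path MTU discovery a smaller MTU further along the path would still cause fragmentation.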

I've thought about this a bit recently (since https://github.com/mirage/mirage/issues/622#issuecomment-254513280) but have lacked the time and focus to improve the situation. It's a bit worse than the comment above implies, because we currently have no concept of an MTU at all in the Ethernet implementation used by mirage-tcpip's `direct` stack.

An important first step would be adding a facility for setting (on `connect`, presumably) the MTU in the Ethernet layer, and adding a function for querying that information to the ETHIF module type so higher layers can rely on it. Right now there's no mechanism for discovering that the packet to be sent is larger than our own MTU, let alone one further along the path.
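
The shape of that first step might look roughly like the following Python sketch. Everything here is hypothetical and only models the idea: the `Ethernet` and `Ipv4` classes, the `mtu()` method and the `write` check are invented for illustration and are not the ETHIF signature or any existing MirageOS API.

```python
class Ethernet:
    """Hypothetical link layer that records the MTU given at connect time."""

    def __init__(self, mtu=1500):
        self._mtu = mtu

    def mtu(self):
        # Query function that higher layers can rely on.
        return self._mtu

    def write(self, payload):
        # Refuse payloads larger than our own MTU instead of silently
        # sending (or dropping) them.
        if len(payload) > self._mtu:
            raise ValueError(
                "payload of %d bytes exceeds MTU %d" % (len(payload), self._mtu)
            )
        return len(payload)


class Ipv4:
    """Hypothetical IP layer that derives its limit from the layer below."""

    HEADER = 20  # bytes, assuming no IP options

    def __init__(self, ethernet):
        self._ethernet = ethernet

    def mtu(self):
        # Largest IP payload that fits in one frame on this link.
        return self._ethernet.mtu() - self.HEADER


eth = Ethernet(mtu=9000)
print(Ipv4(eth).mtu())  # 8980
```

A TCP layer built on top could then subtract its own header size from `Ipv4.mtu()` rather than hard-coding a segment size.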

Path MTU discovery beyond the 0th hop would also involve an assembly of modules that is a bit more intelligent with regard to ICMP messages than the current Mirage_stack_lwt.V4, but I don't think anything required for that is missing in the set of code we're assembling for MirageOS 3 - probably the right knobs aren't exposed in the TCP module type, but that's the only thing that comes to mind.

I'd love to see contributions in this area :)

Perhaps that's something (along with IPv6) for the Marrakesh hackathon?

Related to this, if the ethernet link actually has a small MTU, presumably TCP 
will emit large 1460-byte segments -- is this bad?

Yes, that's bad as it'll cause IP-level fragmentation, which we probably want 
to avoid -- I believe IPv4's minimum MTU is 68 bytes.

Currently it results in packets going to /dev/null, which may or may not be preferable :) You might be interested in @polvorin's IPv4 fragmentation PR at https://github.com/mirage/mirage-tcpip/pull/243 , which I got distracted from by release management but which is relevant to your interests.
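
To give a feel for the cost of sending a full-size segment over a small-MTU link, here is a small Python sketch of how IPv4 would have to split it. It is not taken from the PR above: the function name `fragment_sizes` is invented for this example, and it assumes a 20-byte IPv4 header with no options.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options


def fragment_sizes(payload_len, mtu):
    """Payload bytes carried by each IPv4 fragment of a single datagram."""
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the fragment offset field counts in 8-byte units.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU %d is too small to carry any payload" % mtu)
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


# A 1460-byte TCP segment plus a 20-byte TCP header is 1480 bytes of IP payload.
print(fragment_sizes(1480, 1500))      # [1480], fits in one datagram
print(fragment_sizes(1480, 576))       # [552, 552, 376]
print(len(fragment_sizes(1480, 68)))   # 31 fragments at the 68-byte minimum
```

Each fragment carries its own IP header, and losing any one of them loses the whole segment, which is why fragmentation is worth avoiding.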

