
Re: [MirageOS-devel] mirage-tcpip and jumbo frames

On Thu, Jan 19, 2017 at 2:40 PM, Mindy <mindy@xxxxxxxxxxxxxxxxxxx> wrote:
On 01/19/2017 04:14 AM, Anil Madhavapeddy wrote:

On 19 Jan 2017, at 10:00, David Scott <scott.dj@xxxxxxxxx> wrote:

I'm trying to increase the performance of a program which uses the mirage-tcpip stack (specifically vpnkit[1] running on Windows). I noticed the total CPU overhead in `top` was higher than I expected, so I attempted to reduce the per-byte overhead by enabling jumbo frames. I bumped the MTU of the ethernet link; however, this was not enough -- mirage-tcpip was still sending frames of ~1500 bytes. I tracked the problem down to the hardcoded [max_mss](https://github.com/mirage/mirage-tcpip/blob/756db428db2346a7b7461805cf233631b8f61a1e/lib/tcp/window.ml#L62) -- when I manually bumped this and recompiled, I got larger frames and my TCP throughput increased from 500 Mbit/s to 600 Mbit/s (there are other overheads that also need addressing).

So my question is: how should this be done properly? Should the TCP layer query the maximum IP datagram size (derived from the underlying ethernet MTU)? Or is something more complicated needed?
That sounds right -- one missing feature is that we don't have Path MTU discovery in the stack, so we can only select an MSS on the basis of the immediate MTU (which may be larger than the MTU of some intermediate hop, causing fragmentation on the wire).

I've thought about this a bit recently (since https://github.com/mirage/mirage/issues/622#issuecomment-254513280) but have lacked the time and focus to improve the situation.  It's a bit worse than the comment above implies, because we currently have no concept of an MTU at all in the Ethernet implementation used by mirage-tcpip's `direct` stack.

An important first step would be adding a facility for setting the MTU in the Ethernet layer (on `connect`, presumably), and adding a function for querying that information to the ETHIF module type so higher layers can rely on it.  Right now there's no mechanism for discovering that a packet to be sent is larger than our own MTU, let alone larger than the MTU of a hop further along the path.
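To make that concrete, a minimal sketch of what such an interface might look like -- the signature and module names here are illustrative assumptions, not mirage-tcpip's actual API:

```ocaml
(* Hypothetical sketch: an ETHIF-like signature extended with an [mtu]
   accessor, so higher layers can query the link MTU instead of
   hardcoding it. *)
module type ETHIF = sig
  type t
  val mtu : t -> int  (* payload bytes available to the layer above *)
  (* ... write, disconnect, etc. elided ... *)
end

(* A toy implementation whose MTU is fixed at [connect] time. *)
module Ethif : sig
  include ETHIF
  val connect : ?mtu:int -> unit -> t
end = struct
  type t = { mtu : int }
  let connect ?(mtu = 1500) () = { mtu }
  let mtu t = t.mtu
end

let () =
  let e = Ethif.connect ~mtu:9000 () in
  Printf.printf "mtu = %d\n" (Ethif.mtu e)
```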

Ah, I hadn't spotted that an MTU accessor function is missing! I think this probably explains why the MSS value is hardcoded :)

I had a go at adding a simple `mtu : t -> int` accessor to both the ethernet and IPv* layers, and then patched TCP to compute the MSS from the MTU of the layer beneath. As you suggested, I added a `connect` parameter to the ethernet layer:
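Roughly, the MSS derivation amounts to subtracting the minimal IPv4 and TCP header sizes from the link MTU -- a sketch (not the actual patch; real code must also account for IP and TCP options):

```ocaml
(* Derive the TCP MSS from the link MTU instead of hardcoding it.
   20 bytes each for the minimal IPv4 and TCP headers. *)
let ip_header_size = 20
let tcp_header_size = 20

let mss_of_mtu mtu = mtu - ip_header_size - tcp_header_size

let () =
  Printf.printf "standard MTU 1500 -> MSS %d\n" (mss_of_mtu 1500);
  Printf.printf "jumbo    MTU 9000 -> MSS %d\n" (mss_of_mtu 9000)
```

With a standard 1500-byte MTU this reproduces the familiar 1460-byte MSS; with a 9000-byte jumbo MTU it yields 8960.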

Let me know what you think!

(There's no rush on this from my point of view -- I imagine things are really busy with the release. If the shape of the interface is ok then I might proceed and base further speculative work on it in branches)



Path MTU discovery beyond the 0th hop would also require an assembly of modules that is a bit more intelligent about ICMP messages than the current Mirage_stack_lwt.V4. I don't think anything required for that is missing from the code we're assembling for MirageOS 3 -- probably the right knobs aren't exposed in the TCP module type, but that's the only gap that comes to mind.

I'd love to see contributions in this area :)

Perhaps that's something (along with IPv6) for the Marrakesh hackathon?

Related to this, if the ethernet link actually has a small MTU, presumably TCP will still emit large 1460-byte segments -- is this bad?
Yes, that's bad, as it'll cause IP-level fragmentation, which we probably want to avoid -- I believe IPv4's minimum MTU is 68 bytes.
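As an illustration of why clamping the MSS to the link MTU matters, here's a rough sketch of how many IPv4 fragments an oversized segment would need (assumed helper names; 68 bytes is IPv4's minimum MTU per RFC 791, and fragment offsets are expressed in 8-byte units):

```ocaml
(* Count the IPv4 fragments needed to carry [payload] bytes over a
   link with the given [mtu]. *)
let ipv4_min_mtu = 68
let ip_header_size = 20

let fragments_needed ~mtu ~payload =
  if mtu < ipv4_min_mtu then invalid_arg "MTU below IPv4 minimum";
  let per_fragment = mtu - ip_header_size in
  (* all fragments but the last carry a multiple of 8 bytes,
     since fragment offsets are in 8-byte units *)
  let per_fragment = per_fragment - (per_fragment mod 8) in
  (payload + per_fragment - 1) / per_fragment

let () =
  (* a full 1460-byte TCP payload on a 576-byte-MTU link *)
  Printf.printf "fragments: %d\n" (fragments_needed ~mtu:576 ~payload:1460)
```

A 1460-byte segment on a 576-byte-MTU link splits into 3 fragments, each with its own IP header and each a separate chance to be dropped.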

Currently it results in packets going to /dev/null, which may or may not be preferable :) you might be interested in @polvorin's IPv4 fragmentation PR at https://github.com/mirage/mirage-tcpip/pull/243 , which I got distracted from with release managing but is relevant to your interests.


Dave Scott

MirageOS-devel mailing list


