
[Xen-users] Re: [Xen-devel] Ethernet MTU


  • To: "James Harper" <james.harper@xxxxxxxxxxxxxxxx>
  • From: "Molle Bestefich" <molle.bestefich@xxxxxxxxx>
  • Date: Wed, 16 Aug 2006 13:58:58 +0200
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, xen-users@xxxxxxxxxxxxxxxxxxx, Sylvain Coutant <sco@xxxxxxxxxx>
  • Delivery-date: Wed, 16 Aug 2006 04:59:44 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

James Harper wrote:
Consider this broken implementation:

A1----B1:B2-----C1:C2----D1

A and D are hosts
B and C are L2 devices (e.g. Linux bridges or Ethernet switches)

A1 is a normal Ethernet interface with an mtu of 1500
B1 is a normal Ethernet interface with an mtu of 1500
B2 is an 802.1q tagged interface with an mtu of 1496
C1 is an 802.1q tagged interface with an mtu of 1496
C2 is a normal Ethernet interface with an mtu of 1500
D1 is a normal Ethernet interface with an mtu of 1500

B1 is bridged to B2 vlan 4
C1 vlan 4 is bridged to C2

If A sends a 1500-byte packet out on the A1 interface, it is received
by B on B1, but then can't be forwarded via B2 because it won't fit.
There is no concept of fragmentation at layer 2; that's a layer 3 (IP)
thing, so the frame is dropped (and presumably B would log an error of
some sort).
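
To make the failure concrete, here is a minimal sketch of the topology above. It only models the per-port MTU check; the Port class and the forward() helper are made up for illustration and do not correspond to any real bridge code.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str
    mtu: int  # largest layer 3 packet (bytes) this port will carry


def forward(packet_len, path):
    """Walk a packet along the path; return the name of the port
    that drops it, or None if it reaches the far end."""
    for port in path:
        if packet_len > port.mtu:
            return port.name  # no L2 fragmentation, so the frame is dropped here
    return None


# The A1----B1:B2-----C1:C2----D1 path from the example above.
PATH = [
    Port("A1", 1500), Port("B1", 1500), Port("B2", 1496),
    Port("C1", 1496), Port("C2", 1500), Port("D1", 1500),
]

print(forward(1500, PATH))  # dropped at B2
print(forward(1496, PATH))  # None: fits everywhere
```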

Darn, you're absolutely right.

Besides solution A, which you say is already in place (I don't think
that's the case, but OK):
A) Bridge discards frame, logs error message.

Another compelling solution would be:

B) Use another type of switch/bridge, call it "intelligent switch" or
"VLAN switch" or whatever, which can inspect the IP layer of frames
carrying IP packets and fragment them accordingly.

Does that exist?
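
For reference, the IP-level splitting that solution B would need can be sketched as below. This is only the arithmetic, assuming a 20-byte IPv4 header without options and ignoring the Don't Fragment flag; fragment() is a hypothetical helper, not something an existing bridge provides.

```python
IP_HEADER_LEN = 20  # assumption: IPv4 header without options


def fragment(payload_len, mtu):
    """Split an IPv4 payload into (offset, length, more_fragments) tuples
    so that each fragment plus its header fits within mtu."""
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the fragment offset field counts in 8-byte units.
    max_data = (mtu - IP_HEADER_LEN) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload_len:
        length = min(max_data, payload_len - offset)
        more = offset + length < payload_len
        fragments.append((offset, length, more))
        offset += length
    return fragments


# A 1500-byte packet (20 header + 1480 payload) crossing the 1496 link:
print(fragment(1480, 1496))
```

With these numbers the packet would leave B2 as one fragment carrying 1472 bytes and a second carrying the remaining 8.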

A much, much better solution than A (slightly worse than B perhaps),
while still keeping it simple, would be:

C) Fix the bridging code in Linux to reject interfaces whose MTU does
not match that of the interfaces already in the bridge.

Solution C ensures that the administrator knows that there's a problem
beforehand, not when/if it randomly occurs.  Much better than A.
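
A rough sketch of the check that solution C asks for. This is a toy model of the proposed behaviour, not the actual Linux bridging code; Bridge, add_port() and MtuMismatch are names invented here.

```python
class MtuMismatch(Exception):
    """Raised when a port's MTU differs from the ports already enslaved."""


class Bridge:
    def __init__(self, name):
        self.name = name
        self.ports = {}  # port name -> MTU

    def add_port(self, port, mtu):
        # Refuse the port up front instead of dropping frames later.
        for other, other_mtu in self.ports.items():
            if other_mtu != mtu:
                raise MtuMismatch(
                    f"{port} has MTU {mtu}, but {other} in {self.name} "
                    f"has MTU {other_mtu}"
                )
        self.ports[port] = mtu


br = Bridge("br0")
br.add_port("B1", 1500)
try:
    br.add_port("B2", 1496)
except MtuMismatch as err:
    print("rejected:", err)
```

The administrator gets the error at configuration time, which is the whole point of C.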

In the case of Xen, for C to be effective, Xen should automatically
force the MTU of one end of the virtual wire to be the same as the MTU
on the interface at the other end. Otherwise the point is moot, since
the bridge would think that all is OK with its end of the virtual
wire having e.g. MTU 1496, but it could still get oversized frames
because the other end has 1500.

(For real wire situations, solution A would still need to be in place,
since we can't synchronize those automatically..)


For 802.1q, Linux normally does some trickery where the MTU is bumped up
by 4 when a tag is involved, but as far as userspace is concerned the
MTU is always 1500 whether it's a native or tagged packet.

So when configuring a VLAN on an interface, you say that Linux
automatically adds 4 to the MTU of the underlying physical interface?

Since Linux has no way of synchronizing the MTU with the interface at
the other end of the wire, that's really stupid.  It should just lower
the MTU of the virtual interface by 4 bytes instead.
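
The two options differ only in where the 4 tag bytes go. A small worked calculation, using the standard Ethernet field sizes (14-byte header, 4-byte 802.1q tag, 4-byte FCS); frame_len() is just an illustrative helper:

```python
ETH_HEADER = 14  # destination MAC + source MAC + EtherType
VLAN_TAG = 4     # 802.1q tag
FCS = 4          # frame check sequence


def frame_len(mtu, tagged):
    """On-wire frame length (without preamble) for a full-size packet."""
    return ETH_HEADER + (VLAN_TAG if tagged else 0) + mtu + FCS


print(frame_len(1500, tagged=False))  # 1518: classic maximum frame
print(frame_len(1500, tagged=True))   # 1522: MTU kept, frame grows by 4
print(frame_len(1496, tagged=True))   # 1518: MTU lowered, frame size kept
```

Keeping the MTU at 1500 only works if every device on the wire accepts 1522-byte frames; lowering it to 1496 keeps the frame at 1518 bytes but exposes the mismatch discussed above.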


If you ever have to manually fiddle with MTUs then something is
horribly broken.

Something definitely is...
I'd say probably more than one thing...


And yes, dammit, I've just realised too that I've been doing a reply-all
and crossposting. I guess I'm stuck with it now - I haven't subscribed
to xen-users. Sorry.

I contemplated stripping xen-devel, but that seemed wrong too.

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

