
[Xen-devel] netif.h clarifications



Hello,

While trying to solve a FreeBSD netfront bug [0] I came across a couple 
of netif.h dark spots that I think should be documented in the netif.h 
header. I'm willing to make those changes, but I want to make sure my 
understanding is right.

Regarding checksum offloading, I had a hard time figuring out what the 
different flags actually mean:

/* Packet data has been validated against protocol checksum. */
#define _NETRXF_data_validated (0)
#define  NETRXF_data_validated (1U<<_NETRXF_data_validated)

/* Protocol checksum field is blank in the packet (hardware offload)? */
#define _NETRXF_csum_blank     (1)
#define  NETRXF_csum_blank     (1U<<_NETRXF_csum_blank)

(The same applies to the TX flags; I'm not copying them here because they 
are the same.)

First of all, I assume "protocol" here refers to the Layer 3 and Layer 4 
protocols, so that would be IP and TCP/UDP/SCTP checksum offloading? In any 
case this needs clarification and proper wording.

Then, I have some questions regarding the meaning of the flags themselves 
and the content of the checksum field in all the possible scenarios.

On RX path:

 - NETRXF_data_validated only: data has been validated, but what's the state 
   of the checksum field itself? If the data is validated again, would it 
   match against the checksum?
 - NETRXF_csum_blank only: I don't think this makes much sense, data is in 
   unknown state and checksum is not present, so there's no way to validate 
   it. Packet should be dropped?
 - NETRXF_data_validated | NETRXF_csum_blank: this combination is the one 
   that makes the most sense to me: data is valid, but the checksum is not 
   there. This matches what some real NICs already do, which is to provide 
   the result of the checksum check _without_ actually providing the 
   checksum itself on the RX path.
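
To make the four RX cases concrete, here is a minimal C sketch of how a 
frontend might classify them under the reading above. The flag values are 
the ones quoted from netif.h; the enum and the function name are 
hypothetical and not part of the header, and the action chosen for each 
case is only my current understanding, not confirmed semantics.

```c
#include <stdint.h>

/* Flag values as quoted from netif.h above. */
#define _NETRXF_data_validated (0)
#define  NETRXF_data_validated (1U<<_NETRXF_data_validated)
#define _NETRXF_csum_blank     (1)
#define  NETRXF_csum_blank     (1U<<_NETRXF_csum_blank)

/* Hypothetical classification, not part of netif.h. */
enum rx_csum_action {
    RX_CSUM_VERIFY,      /* no flag set: verify the checksum in software */
    RX_CSUM_TRUST,       /* validated only: skip verification; whether the
                            checksum field still matches is the open question */
    RX_CSUM_TRUST_BLANK, /* validated and blank: skip verification, and do
                            not rely on the checksum field */
    RX_CSUM_SUSPECT      /* blank only: nothing to verify against */
};

/* Map the two checksum-related RX flags to one of the cases above.
 * Other bits in 'flags' are ignored. */
static enum rx_csum_action rx_csum_classify(uint16_t flags)
{
    int validated = (flags & NETRXF_data_validated) != 0;
    int blank     = (flags & NETRXF_csum_blank) != 0;

    if (validated && blank)
        return RX_CSUM_TRUST_BLANK;
    if (validated)
        return RX_CSUM_TRUST;
    if (blank)
        return RX_CSUM_SUSPECT;
    return RX_CSUM_VERIFY;
}
```

Whether RX_CSUM_SUSPECT should mean "drop the packet" is exactly the point 
that needs to be settled.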

On TX path:

 - NETTXF_data_validated only: I don't think this makes any sense, data is 
   always valid from the sender's point of view.
 - NETTXF_csum_blank only: checksum calculation offload, it should be 
   performed by the other end.
 - NETTXF_data_validated | NETTXF_csum_blank: again, I don't think it makes 
   much sense, data is always valid from the sender's point of view, or else 
   why bother sending it?
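
For the NETTXF_csum_blank case, "performed by the other end" would mean 
that the receiver of the TX request (or the hardware behind it) computes 
the checksum. As a rough illustration, the following is the standard 
ones'-complement Internet checksum from RFC 1071 over a byte buffer. The 
function name is hypothetical, and the sketch deliberately leaves out the 
TCP/UDP pseudo-header and any assumption about what the "blank" field 
contains, since that is part of what needs documenting.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over 'len' bytes, treating the buffer as a
 * sequence of big-endian 16-bit words (an odd trailing byte is padded
 * with zero). Returns the checksum as a host-order value of that
 * big-endian word. Intended for packet-sized buffers (up to 64 KiB), so
 * the 32-bit accumulator cannot overflow. */
static uint16_t inet_csum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A buffer that already contains a correct checksum sums to 0xffff, so 
running inet_csum() over it returns 0, which is how a receiver would 
validate it.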

So it looks to me like we could get away with just two flags, one on the RX 
side that signals that the packet doesn't have a checksum but that the 
checksum validation has already been performed, and another one on the TX 
side to signal that the packet doesn't have a calculated checksum 
(typical checksum offload).

And then I've also seen some issues with TSO/LRO (GSO in Linux terminology) 
when using packet forwarding inside of a FreeBSD DomU. For example in the 
following scenario:

                                   +
                                   |
   +---------+           +--------------------+           +----------+
   |         |A         B|       router       |C         D|          |
   | Guest 1 +-----------+         +          +-----------+ Guest 2  |
   |         |  bridge0  |         |          |  bridge1  |          |
   +---------+           +--------------------+           +----------+
   172.16.1.67          172.16.1.66|   10.0.1.1           10.0.1.2
                                   |
             +--------------------------------------------->
              ssh 10.0.1.2         |
                                   |
                                   |
                                   |
                                   +

All those VMs are inside of the same host, and one of them acts as a gateway 
between them because they are on two different subnets. In this case I'm 
seeing issues because even though I disable TSO/LRO on the "router" at 
runtime, the backend doesn't watch the xenstore feature flag, and never 
disables it on the vif attached to the Dom0 bridge. This causes LRO packets 
(non-fragmented) to be received at point 'C', and then when the gateway 
tries to inject them into the other NIC it fails because the size is greater 
than the MTU, and the "don't fragment" (DF) bit is set.
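
The failure at the gateway corresponds to the usual IPv4 forwarding check, 
sketched below. This is a simplified illustration and not the FreeBSD 
code: the enum and function names are hypothetical, and only the DF bit 
position (0x4000 in the flags/fragment-offset field, RFC 791) is taken as 
given.

```c
#include <stddef.h>
#include <stdint.h>

/* "Don't fragment" bit in the IPv4 flags/fragment-offset field (RFC 791). */
#define IPV4_DF 0x4000u

/* Hypothetical outcome of the forwarding decision. */
enum fwd_action {
    FWD_SEND,         /* packet fits in the outgoing MTU */
    FWD_FRAGMENT,     /* too big, but fragmentation is allowed */
    FWD_DROP_TOO_BIG  /* too big and DF is set: cannot be forwarded as-is */
};

/* 'ip_hdr' points at an IPv4 header of at least 8 bytes, 'ip_len' is the
 * total length of the IP packet and 'mtu' the MTU of the outgoing
 * interface. */
static enum fwd_action fwd_decide(const uint8_t *ip_hdr, size_t ip_len,
                                  size_t mtu)
{
    /* Bytes 6-7 hold the flags and fragment offset, big-endian. */
    uint16_t frag = (uint16_t)((ip_hdr[6] << 8) | ip_hdr[7]);

    if (ip_len <= mtu)
        return FWD_SEND;
    return (frag & IPV4_DF) ? FWD_DROP_TOO_BIG : FWD_FRAGMENT;
}
```

An LRO-aggregated packet arriving at 'C' with DF set ends up in the 
FWD_DROP_TOO_BIG case as soon as its length exceeds the MTU of the 
interface at 'B'.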

How does Linux deal with this situation? Does it simply ignore the "don't 
fragment" flag and fragment the packet? Or does it inject the packet into 
the other end, ignoring the MTU and propagating the GSO flag?

Roger.

[0] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=188261
