
[Xen-users] pv_ops kernel and network problems (checksum offloading?)



Hi list,

I'm experiencing some very strange network problems when using a domU as a 
masquerading router with pv_ops kernels.
First of all, here is some ASCII art showing my network configuration:

                  +---------------------+
               +--|-eth0   domU2   eth1-|-----+
               |  +---------------------+     |
               |                              |
               |  +---------------------+     |
               |  |        domU1   eth1-|--+  |
               |  +---------------------+  |  |
               |                           |  |
       +-------|---------------------------|--|--------+
       |       | vif2.0             vif1.1 |  | vif2.1 |
 Internet      |                           |  |        |
 <-----|----- brexternal     dom0        brinternal    |
       | eth0                                          |
       +-----------------------------------------------+

domU1 intentionally has no internet connection, and domU2 acts as a 
masquerading router for the internal network. 
The configuration is very basic. On domU2 I've issued the following commands:
# echo 1 > /proc/sys/net/ipv4/ip_forward
# iptables -A POSTROUTING -t nat -s <internal/net> -j MASQUERADE

Now the problems:

1. ICMP
When I try to ping an internet host from domU1, the dom0 kernel logs the 
following message for every ICMP echo request packet domU1 tries to send:
--- cut ---
Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet
--- cut ---
IP protocol 1 is ICMP, so this matches. Using tcpdump I've been able to follow 
the ping packets along their path: domU1-eth1 -> vif1.1 -> brinternal -> 
vif2.1 -> domU2-eth1 -> domU2-eth0
The packet never reaches vif2.0; it gets dropped somewhere in between (judging 
by the message above, I suspect the dom0 kernel is dropping it).
Issuing the same ping command directly on domU2 works without any problems. 

2. TCP
When I try to connect to an internet host via TCP from domU1, I see very odd 
behavior:
The TCP SYN packet leaves dom0 on eth0 as desired and reaches the remote host, 
but the remote host never responds with a SYN/ACK packet. So I took a deeper 
look with tcpdump and Wireshark: the packet *seems* to leave dom0 eth0 with a 
correct TCP checksum but arrives at the remote host with its TCP checksum 
ALWAYS set to 0xeeee, which is of course wrong, so the remote host drops the 
SYN packet. 
But I'm very sure the packet already has the wrong checksum when it leaves 
dom0. 
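
The checksum check described above can be repeated from the command line. 
What follows is only a minimal sketch: it assumes tcpdump's verbose output 
marks bad checksums with the usual "cksum 0x.... (incorrect -> 0x....)" 
wording, and bad_cksum_lines is a made-up helper name. Captures taken on a 
sending host with TX checksum offloading enabled can show checksums that are 
filled in later, so the receiving host is the more reliable place to look.

```shell
#!/bin/sh
# Sketch only: keep the lines of `tcpdump -v` text output that tcpdump
# itself flags as carrying an incorrect checksum.
# Assumption: the output uses the "cksum 0x.... (incorrect" wording.
bad_cksum_lines() {
    grep -E 'cksum 0x[0-9a-f]+ \(incorrect'
}

# Example (as root, on the remote host; -l keeps the output line-buffered):
#   tcpdump -l -n -v -i eth0 'tcp[tcpflags] & tcp-syn != 0' | bad_cksum_lines
```
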
Next I remembered the early Xen 3 days, when we were forced to use ethtool to 
disable checksum offloading everywhere, so I did the same: I ran 
"ethtool -K <interface> tx off" for EVERY interface in the communication path 
(domU1-eth1, vif1.1, brinternal, vif2.1, domU2-eth1, domU2-eth0, vif2.0, 
brexternal and dom0-eth0). The only effect is that I now see the packet 
leaving dom0 on eth0 with a wrong checksum (0xeeee). 
I have no problem connecting to this host directly from domU2. 
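
For anyone wanting to reproduce the ethtool step, here is a minimal sketch of 
running it over a list of interfaces. disable_tx_offload and DRY_RUN are 
made-up names for illustration (DRY_RUN is not an ethtool feature); the 
interface names are the ones from the diagram above.

```shell
#!/bin/sh
# Sketch only: turn off TX checksum offloading on every interface named
# as an argument. With DRY_RUN=1 (hypothetical helper variable) the
# commands are printed instead of executed.
disable_tx_offload() {
    for ifname in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ethtool -K $ifname tx off"
        else
            # Report a failure but continue with the remaining interfaces.
            ethtool -K "$ifname" tx off || echo "failed on $ifname" >&2
        fi
    done
}

# Each domain only sees its own interfaces, so run it once per domain:
#   dom0:  disable_tx_offload vif1.1 brinternal vif2.1 vif2.0 brexternal eth0
#   domU1: disable_tx_offload eth1
#   domU2: disable_tx_offload eth1 eth0
```

Running "ethtool -k <interface>" (lowercase k) afterwards shows the current 
offload settings of an interface.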

My system configuration:
Debian lenny amd64 everywhere
Xen 3.4.2 (Debian unstable, built for lenny)
dom0 kernel: pv_ops from Jeremy's tree (changeset 
8735edb4a976105fd29c97c00c6d14760537e4ee)
domU kernel: pv_ops 2.6.29-2 (from Debian unstable) (I would like to move to a 
newer kernel, but there's that other nasty bug :))

This looks like some sort of checksum offloading bug in the pv_ops kernel tree 
that kicks in when a domU is used to route (and masquerade) other traffic.

Any ideas?

Regards,
Markus

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

