[Xen-users] Package drop between eth0(domU) and vif(dom0)
Hello,

TL;DR: frames of 64..256 bytes entering a virtual ethX never reach vifN.X, at a rate of 1-10%, while forwarding between two virtual interfaces, but not when I use a single device. Is it a known issue? How can I debug it further?

Now the long version.

I use a Xen domU as a network router (OpenWrt) doing firewall, NAT and forwarding a UDP port to another internal machine (an OpenVPN server). Our support team reported that VPN links showed a high packet loss count (1-10%) while pinging, but without a significant effect on user experience. The packet loss happens only for some specific packet sizes and never for others.

In order to isolate VPN problems, I built a simple UDP ping service (socat pipe) and a UDP ping client (also socat), which can detect packet loss and vary the packet size. For reference:

server# socat -v PIPE udp-recvfrom:4000,fork

client# for size in $(seq 1 500); do i=0; for try in $(seq 50); do echo -n "$(date "+%s %c")"; rec=$({ printf "%-${size}s" "$i"; sleep 1; } | socat - udp:my-internet-ip:4000); if [ "$rec" ]; then echo " " $rec "$(date "+%s %c")"; else echo " lost $i (size $size)"; continue 2; fi; sleep 1 ; : $((i++)); done; echo "No loss (size $size)"; done # sorry for the oneliner haters

I ran the UDP ping server on a completely different internal server directly connected to the router (ruling out any problem with the original server, the OpenVPN service, network switching or any external issues). Something like this:

client (socat client) -> (eth3) router:4000/udp (DNAT) (eth0) -> internal-server:4000/udp (socat server)

I could reproduce the problem when the UDP ping payload matched the size of the OpenVPN packet sent while pinging. So I tested it by changing the UDP payload size from 1 to 500 bytes.
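To relate a UDP payload size to the frame size seen in a capture, here is a minimal sketch of the arithmetic. It assumes untagged Ethernet II (14 bytes), IPv4 without options (20 bytes) and UDP (8 bytes), with the FCS not counted; the function names frame_size and payload_size are only illustrative.

```shell
#!/bin/sh
# Header overhead in bytes. These are assumptions: untagged Ethernet II,
# IPv4 without options, UDP, and no FCS counted (as in a typical capture).
ETH_HDR=14
IP_HDR=20
UDP_HDR=8

# frame_size PAYLOAD: print the frame size for a given UDP payload size.
frame_size() {
    echo $(( $1 + ETH_HDR + IP_HDR + UDP_HDR ))
}

# payload_size FRAME: print the UDP payload size for a given frame size.
payload_size() {
    echo $(( $1 - ETH_HDR - IP_HDR - UDP_HDR ))
}

# Print the mapping for a few payload sizes from the 1..500 test range.
for p in 1 21 22 214 215 500; do
    printf 'payload %3d -> frame %3d\n' "$p" "$(frame_size "$p")"
done
```

Under these assumptions the offset is a constant 42 bytes. An ICMP echo header is also 8 bytes, so the same offset should apply to the data size passed to ping -s.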
The packet loss started at some "magic numbers":

udp payload size 1..21, frame size 43..63 bytes: no loss
udp payload size 22..214, frame size 64..256 bytes: 1-5% loss
udp payload size 215..500, frame size 257..542 bytes: no loss

It is consistently reproducible: with 50 UDP pings per size, there was only 1 false negative in the 22..214 range and three false positives in the 43..63 and 257..542 ranges (probably normal network loss).

I sniffed both ethX (domU) and vifN.X (dom0). It seems that the frame got into ethX but never appeared in vifN.X. It happened on both ethX devices: while the router sends to the internal server (eth0) and also while forwarding to the client (eth3).

If I run the UDP ping server on the router itself, the problem does not happen. If I forward the packet in the router using userland (socat udp-recvfrom:4000,fork udp:internal-server:4000) instead of port forwarding, the problem does not appear. It only happens when I use two different interfaces and kernel-mode-only processing (iptables).

Now that I have isolated the problem, I can reproduce it with a normal ping passing through the router whose frame size falls in the problematic range.

The domU that has the problem is using kernel 4.14.63 (OpenWrt 18.06) with no Xen-related patches. The dom0 is SLES12SP4 running 4.12.14-95.6-default on Xen 4.11.1_02-2.3. I'll try changing the domU/dom0/Xen versions in order to isolate the problem further. However, I guess this did not happen with SLES12SP3 (xen 4.9.3_03-3.47).

Is this a known issue already fixed in a newer Xen version or kernel release? I have been using Xen for some years, but I have no experience debugging Xen internals.

Regards,

---
Luiz Angelo Daros de Luca
luizluca@xxxxxxxxx