
[Xen-users] Large Network Traffic brings the server down


  • To: xen-users@xxxxxxxxxxxxxxxxxxx
  • From: Pepe Barbe <elventear@xxxxxxxxx>
  • Date: Wed, 20 Aug 2008 15:41:49 -0500
  • Delivery-date: Wed, 20 Aug 2008 13:43:18 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

Hello everyone,

I am having some serious problems with our Xen setup. I first noticed them while doing an rsync from the dom0 to the WAN. Before explaining anything else, let me post a diagram of my Xen setup topology to make things clearer. The letters before each NIC name mean:

pb: Bridge+Physical IF
v: Virtual IF
p: Physical IF

                        .-----.
                        . LAN .
                        '-----'
                           ^
                           |
                           |
        .----------------. |        .----------------.             .----------------.
        |      dom0      | |        |      domU      |             |      domU      |
        |----------------| |        |----------------|             |----------------|
        |                | |        |     router     |             |      DMZ       |
        |                | |        |                |             |                |
        |             pb:eth0----v:eth0           v:eth1 ------- b:eth0             |
        |                | |        |                |             |                |
        '----------------' |        '-----p:eth2 ----'             '----------------'
                           |                  |
                           |                  |
                           |                  |
        .----------------. |                  v
        |      domU      | |               .-----.
        |----------------| |               . WAN .
        | Local Server   | |               '-----'
        |               v:eth0
        |                |
        |                |
        '----------------'

Basically, what happens is that the link between the dom0 and the domU:router dies. As the transfer starts, the ping delay from the dom0 to the WAN begins to increase until a point of no return is reached, after which traffic stops flowing altogether. If I kill my large transfer before this point is reached, the link recovers and everything goes back to normal.
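For anyone who wants to watch for the same symptom, here is a minimal sketch of a latency monitor. It only assumes the usual Linux `ping` output format (`time=12.3 ms`). The helper names, the window size, and the 3x threshold are arbitrary choices for illustration, not something taken from my setup.

```python
import re
import subprocess
import sys

# Matches the round-trip time in typical Linux ping output, e.g. "time=12.3 ms".
RTT_RE = re.compile(r"time=([\d.]+)\s*ms")


def parse_rtt_ms(line):
    """Return the round-trip time in ms from one line of ping output, or None."""
    match = RTT_RE.search(line)
    return float(match.group(1)) if match else None


def is_climbing(rtts, window=5, factor=3.0):
    """True when the mean of the last `window` samples exceeds `factor` times
    the mean of the first `window` samples (the baseline)."""
    if len(rtts) < 2 * window:
        return False
    baseline = sum(rtts[:window]) / window
    recent = sum(rtts[-window:]) / window
    return recent > factor * baseline


def watch(host, count=60):
    """Ping `host` `count` times and print a warning once latency is climbing."""
    proc = subprocess.Popen(
        ["ping", "-c", str(count), host],
        stdout=subprocess.PIPE,
        text=True,
    )
    rtts = []
    for line in proc.stdout:
        rtt = parse_rtt_ms(line)
        if rtt is None:
            continue  # header, summary, or timeout line
        rtts.append(rtt)
        if is_climbing(rtts):
            print(f"latency climbing: last sample {rtt:.1f} ms")
    proc.wait()
    return rtts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python watch_ping.py <wan-host>
    watch(sys.argv[1])
```

Running it from the dom0 against a WAN host while the transfer is in progress should show roughly when the delay starts to run away, which gives a window in which to stop the transfer before the link dies.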

I had already disabled TCP TX checksumming on all the virtual interfaces, so the problem does not seem to be related to that, although it behaves similarly to what other people have described.
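As a sketch of how the checksumming state could be double-checked on each interface: the snippet below parses the output of `ethtool -k <iface>` and builds the matching `ethtool -K <iface> tx off` command. The function names and the interface list are hypothetical, and the sample output in the usage note is illustrative, not captured from my machines.

```python
import subprocess


def tx_checksum_state(ethtool_output):
    """Return "on" or "off" for the tx-checksumming line of `ethtool -k`
    output, or None if the line is not present."""
    for line in ethtool_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "tx-checksumming":
            parts = value.split()
            return parts[0] if parts else None
    return None


def disable_tx_checksum_cmd(iface):
    """Build the ethtool command that turns TX checksum offload off."""
    return ["ethtool", "-K", iface, "tx", "off"]


def audit(ifaces):
    """Print the TX checksumming state of each interface (needs ethtool)."""
    for iface in ifaces:
        result = subprocess.run(
            ["ethtool", "-k", iface],
            capture_output=True,
            text=True,
            check=True,
        )
        print(iface, tx_checksum_state(result.stdout))
```

For example, `tx_checksum_state("rx-checksumming: on\ntx-checksumming: off\n")` returns `"off"`, and `audit(["eth0", "eth1"])` would need to be run inside each domain (dom0 and every domU) to cover all the virtual interfaces.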

Once the link between the dom0 and the domU:router was effectively dead, I attached to the router's console via Xen and verified that the router was still up and still able to reach the WAN.

I have also noticed that once the dom0 link dies, the dom0 itself starts behaving erratically. I tried to destroy and create the domUs, and some processes went into uninterruptible sleep, making it impossible to do anything with the server.

This server has been running Ubuntu 8.04 since April 2008. It has run without major hiccups since the main linux-xen fixes were officially released by Ubuntu. I did rsyncs not long ago, and the only thing that has changed since then is that our DSL was upgraded from 1.5 Mbps to 7 Mbps; I don't know whether that speed change is big enough to trigger this issue.

So, after the preamble, my questions are: Is this a known issue? Any workarounds? If not, any ideas on what to do to troubleshoot it?

Thanks,
Pepe



 

