
[Xen-users] domU network fails under load - vif breaks



Hello,

I am running Xen 3.2.1 under a Gentoo 2.6.21-xen kernel. I have four domains running under a single dom0: two hvm (Windows XP and Windows 2k3) and two pv (one with a Gentoo 2.6.25 kernel and one with an Ubuntu 2.6.24 kernel). I have also experienced the same behaviour under a 2.6.21-xen Gentoo kernel.

All the domU networks are bridged to a single 1 Gb NIC, and I have also tried an alternative physical NIC. There is very little load on this NIC - this is a test environment.

At a certain point that I have not established exactly, the network load takes out the pv network. For example, if I initiate a bittorrent session in a pv domU, I get a slow build-up of network load, and then connectivity is lost to both of the pv domUs. If I console into them, they cannot ping outside the network, but they can ping their own interfaces. A tcpdump on the dom0 physical interface shows no traffic. However, during all this, the hvm domains are able to use their network connections without issues.

Shutting down the broken domUs doesn't work: they have nfs shares mounted and the shutdown hangs on the nfs unmount, although I suspect that without this they would shut down cleanly. In any case, I have to destroy them. If I attempt to recreate them, I get:

Error: Device 0 (vif) could not be connected. Hotplug scripts not working.

The xend log only shows:
   ...
[2008-08-31 19:14:11 5531] DEBUG (DevController:595) hotplugStatusCallback /local/domain/0/backend/vif/11/0/hotplug-status.
[2008-08-31 19:15:51 5531] DEBUG (XendDomainInfo:1897) XendDomainInfo.destroy: domid=11
   ...
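
In case it helps anyone reading the log, here is a minimal sketch (Python, standard library only) that splits an excerpt like the one above into entries and measures the time between the hotplug status callback and the destroy call. The regular expression is only my assumption about the xend log line format, inferred from these two entries, and `parse_entries` is a hypothetical helper name, not part of xend.

```python
import re
from datetime import datetime

# Assumed entry prefix, inferred from the two entries quoted above:
# [YYYY-MM-DD HH:MM:SS pid] LEVEL (Source:line) message
LOG_ENTRY = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\d+)\] (\w+) \((\w+):(\d+)\) "
)


def parse_entries(text):
    """Return a (timestamp, level, source, message) tuple per log entry."""
    matches = list(LOG_ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs until the next entry prefix (or the end of text),
        # so entries that were run together on one line are still separated.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        message = text[m.end():end].strip()
        timestamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        entries.append((timestamp, m.group(3), m.group(4), message))
    return entries


excerpt = (
    "[2008-08-31 19:14:11 5531] DEBUG (DevController:595) hotplugStatusCallback "
    "/local/domain/0/backend/vif/11/0/hotplug-status. "
    "[2008-08-31 19:15:51 5531] DEBUG (XendDomainInfo:1897) "
    "XendDomainInfo.destroy: domid=11"
)

entries = parse_entries(excerpt)
gap_seconds = (entries[1][0] - entries[0][0]).total_seconds()
print(gap_seconds)
```

For the two entries above, this gives a gap of 100 seconds between the hotplug status callback and the destroy.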

Although the hvm domUs are still working, if I shut them down, they hang on start up, again with the vif problem.

I used the bittorrent example above to show that the failure occurs at a certain traffic load; however, if I do a large cp from the domU to an nfs share, it fails almost instantly.

Restarting xend has no effect. The only thing that fixes the problem is a reboot of the dom0.

Here is the dom0 kernel line:

   module /xen-2.6.21-noreal root=/dev/sda2 max_loop=255

(The "noreal" just refers to the Realtek drivers being removed from the kernel, as I tried using Realtek's own drivers from their website to resolve the problem, but the behaviour is the same.)

Here is the domU cfg:

==========================================
kernel = '/xen/kernels/xen-2.6.25-pae'
ramdisk = '/etc/xen/kernels/initramfs-genkernel-x86-2.6.25-gentoo-r7-ich10'
extra = 'console=hvc0'

memory = '768'

disk = [
               'phy:sda7,hda3,w',
               'file:/xen/domains/zenayonswap.img,hdb,w'
]

name = 'zenayon'

vif = ['bridge=eth1, mac=00:16:3E:11:11:12']
root='/dev/xvda3'
cpu_cap = 100
#sdl=0
#acpi=0
#apic=0
localtime=1
================================

The dmesg of the dom0 and domU don't have any clues that I can see, nor log/messages, nor the xen logs.

I am at a bit of a loss as to how to diagnose this. All the other networking related issues seem to have been resolved in earlier releases and/or are related to routed mode. It seems to be related to the dom0 kernel or xend, as these are the things that haven't changed in my testing. Perhaps I have a setting in my dom0 kernel that is not compatible?

Thanks for any help,

Paul
