
RE: [Xen-users] XEN - Broadcom issue: survey



We have seen similar problems on Xen 2 (based on NetBSD 3.01) and Xen 3.0.3 (based on Fedora 7). It does not appear on Xen 3.0.3 on Debian Etch.

A work-around appears to be to transfer files in smaller fragments or smaller block sizes. For example, we see this repeatedly when NFS mounting from a central NAS server to domUs. With UDP rather than TCP, the problem occurs much less frequently. It appears to be a protocol buffer problem between the bridge and TCP layers on the emulated network. It does not appear on native NetBSD, Fedora 7 or Debian systems.
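
For anyone who wants to try the smaller-block approach outside of NFS, here is a minimal sketch in Python. It is not something we ran on the affected systems. The function name copy_in_blocks and its parameters block_size and stall_seconds are made up for illustration, and the 8192-byte default is only a guess at a reasonable starting point. It copies between two file-like objects in fixed-size blocks and records any single write that takes unusually long, which may help show where a transfer stalls.

```python
import io
import time


def copy_in_blocks(src, dst, block_size=8192, stall_seconds=5.0):
    """Copy src to dst in fixed-size blocks.

    Returns (bytes_copied, stalls), where stalls is a list of
    (offset, seconds) for every write that took at least stall_seconds.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    total = 0
    stalls = []
    while True:
        block = src.read(block_size)
        if not block:
            break  # end of input
        start = time.monotonic()
        dst.write(block)
        elapsed = time.monotonic() - start
        if elapsed >= stall_seconds:
            stalls.append((total, elapsed))
        total += len(block)
    return total, stalls


if __name__ == "__main__":
    # In-memory demonstration; on a real system src and dst would be
    # files opened in binary mode, e.g. one of them on the NFS mount.
    source = io.BytesIO(b"x" * 100000)
    target = io.BytesIO()
    copied, slow_writes = copy_in_blocks(source, target, block_size=4096)
    print(copied, len(slow_writes))
```

Running it with several block_size values against the same file would show whether the stalls depend on block size, as the NFS behaviour suggests.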

-Steve Senator
 sts+xen@xxxxxxxxxxxxxxxxxxxxxx


Quoting Boudreau Luc <luc.boudreau@xxxxxxxxxxxx>:

A bit more information on this issue. We decided to buy another NIC (other than Broadcom). The part number is NC110T from HP. It's an Intel gigabit server NIC. The problem still happens, which rules out the NIC as the cause. The card has the latest firmware and the latest drivers (e1000 ver. 7.6.9.1-1).

The problem still happens when we transfer large files domU->External. It doesn't happen when transferring dom0->External. It is not a simple tcp_timewait issue, since the problem doesn't resolve itself after the TCP timeout.

Is there anything I can test from my new setup that would help investigate?

______________________________________________________

Luc Boudreau
Registrariat, Université de Montréal


-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Pezza
Sent: 15 November 2007 21:44
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] XEN - Broadcom issue: survey


Steven,

thank you for your suggestions.


Steven Smith-9 wrote:

vifX.Y interfaces are only used to send packets to PV network devices
in the guest.  Pure HVM domains (those without any PV drivers) send
all packets over the relevant tapX interface instead.  Errors observed
on the vif interface are therefore completely irrelevant in this case.
If the tap device has nothing strange then you'll have to look
somewhere else.

Ok that's a very good hint.

So far, this is the status:

Steven Smith-9 wrote:

-- Do you see the same problems with dom0<->domU networking?  If so,
it would be a good idea to fix that before worrying about problems
with the NIC.  Packets which don't need to leave the host don't touch
the physical hardware.

Dom0<->DomU is showing the same problem and, yes, you're right: probably
it's not a network card related issue at this point...


Steven Smith-9 wrote:

-- I understand you're seeing connections stall for significant
periods of time, and that this happens across a wide variety of
services, yes?  It would be interesting to know if other connections
to the same VM continue working when this happens.

Yes they do.


Steven Smith-9 wrote:

-- Is there a firewall enabled in the guest?  Turning it off might
help.  The dom0 firewall might also be relevant, although that's less
likely.

I disabled firewalling in Dom0 and in DomU to take it out of the loop.

I tried again with another machine (which is running Xen 3.0.4), and, on the
same network (which is a gigabit network), it works fine. It's slow of
course (no PV), but there's no corruption and it's stable.

I'm willing to try to uninstall Xen 3.1 and try with 3.0.3 (the current Xen
release for CentOS 5), maybe there is something else hidden somewhere in the
background.


M.



_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

