
[Xen-users] Re: xen bonding and network performance dropping to ~ 0.1%



On Sun, Apr 08, 2007 at 09:45:18AM +0300, Igor Chubin wrote:
> On Sa, Apr 07, 2007 at 08:27:06 +0200, Axel Thimm wrote:
> > On Sat, Apr 07, 2007 at 05:45:24PM +0200, Axel Thimm wrote:
> > > after some patching of the xen scripts (to properly `migrate' slaves
> > > over from bond0 to pbond0), I have xen and bonding working. But only
> > > for 20-25 seconds, after that the network throughput suddenly falls
> > > from, for example, 110MB/sec to 70-120KB/sec, i.e. by about a factor
> > > of a thousand. Stopping the network bridge restores the throughput,
> > > but again only after a short delay of 0.5-1 minute.
> > > 
> > > Does that ring a bell? What can be the troublemaker and why does it
> > > appear with such a great delay? There is no hint in the logs on why
> > > the performance drops that dramatically.
> > 
> > I checked where the packets got dropped by checking ICMP traffic on
> > 
> > o eth0,eth1 the two slaves
> > o pbond0 the physical bond of these two
> > o xenbr0
> > o bond0, aka veth0
> > 
> > While the network works well, the ICMP requests/replies can be seen on
> > all interfaces [1]. When the network breaks down to below 1% of its
> > bandwidth I can see the external ICMP requests reaching as far as
> > xenbr0. The virtual interface bond0 does not see the packets anymore.
> > 
> > So it looks like the bridge is leaking the packets, even after the
> > packets have passed into the bridge through the bonded device. This
> > makes it even more mysterious, since if the issue was bonding &
> > bridges I would expect the packets to drop on the incoming side of
> > the bridge.
> > 
> 
> Are you aware of this issue?
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=753
> 
> Maybe your problem is related to this?

Yes, I've seen this report. While the vlan parts and the oops are
not relevant to my case, the explanation in
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=753#c12 seems
to match. However, the suggested solution is to apply two patches that
have already been applied in kernels >= 2.6.17, and I see this on
kernels 2.6.18 (RHEL5) and 2.6.20 (FC6).
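
For reference, the hop-by-hop check from the quoted text (watching ICMP
on each interface and noting where it stops appearing) can be sketched
roughly as follows. This is only an illustration: the helper name
first_silent_hop and the packet counts are made up, and only the
interface names and their order come from this thread.

```python
from typing import Mapping, Optional, Sequence


def first_silent_hop(path: Sequence[str], seen: Mapping[str, int]) -> Optional[str]:
    """Return the first interface on `path` that saw no packets although
    the hop before it did; return None if no such hop exists."""
    for prev, cur in zip(path, path[1:]):
        if seen.get(prev, 0) > 0 and seen.get(cur, 0) == 0:
            return cur
    return None


# Inbound order as described in the thread: slave -> physical bond ->
# bridge -> virtual interface. Only eth0 is listed here; eth1 would be
# checked the same way.
PATH = ["eth0", "pbond0", "xenbr0", "bond0"]

# Hypothetical ICMP request counts per interface (illustration values only).
healthy = {"eth0": 10, "pbond0": 10, "xenbr0": 10, "bond0": 10}
degraded = {"eth0": 10, "pbond0": 10, "xenbr0": 10, "bond0": 0}

print(first_silent_hop(PATH, healthy))
print(first_silent_hop(PATH, degraded))
```

With counts like the degraded ones, the first silent hop is bond0, which
matches the observation that the requests still reach xenbr0 but are no
longer seen on the virtual interface.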
-- 
Axel.Thimm at ATrpms.net

