
RE: [Xen-devel] network hang trigger

  • To: "Bin Ren" <br260@xxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxxx>
  • From: "James Harper" <JamesH@xxxxxxxxxxxxxxxx>
  • Date: Thu, 16 Sep 2004 10:48:19 +1000
  • Delivery-date: Thu, 16 Sep 2004 02:06:57 +0100
  • List-id: List for Xen developers <xen-devel.lists.sourceforge.net>
  • Thread-index: AcSbgCTA5ZH4G8YFQH2wMIUUuQepsAABJJVg
  • Thread-topic: [Xen-devel] network hang trigger

This patch makes no difference on my system. Looking at the line numbers
in your patch, your netfront.c appears to be a different vintage to
mine. I applied the patch manually and then rebuilt the xenU kernel and
then booted a domain with it. I haven't touched xen0. Is this the
correct thing to be doing?

Pinging with a packet larger than the MTU from xenU to xen0 causes the
network to hang immediately. The same ping from xen0 to xenU sometimes
causes the network to hang, but not always.
'ping -i 0.1 -s 6000 <xenU ip>' will usually cause the hang in under 30
iterations.
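
For a sense of how many back-to-back packets one such ping produces, here
is a rough sketch of the fragment arithmetic. It assumes IPv4 with a
20-byte header and no options, an 8-byte ICMP echo header, and a 1500-byte
MTU; none of those figures are stated in this thread, and fragment_count()
is just an illustrative helper, not anything from the Xen tree.

```c
/* Assumed sizes: IPv4 header without options, ICMP echo header. */
enum { IP_HEADER_LEN = 20, ICMP_HEADER_LEN = 8 };

/*
 * Number of IP packets on the wire for `ping -s icmp_payload` over a
 * link with the given MTU. Returns 0 if the MTU is too small to carry
 * any fragment data.
 */
static unsigned fragment_count(unsigned icmp_payload, unsigned mtu)
{
    unsigned ip_payload = icmp_payload + ICMP_HEADER_LEN;
    unsigned per_fragment;

    if (mtu <= IP_HEADER_LEN)
        return 0;

    /* Every fragment except the last carries a multiple of 8 bytes. */
    per_fragment = ((mtu - IP_HEADER_LEN) / 8) * 8;
    if (per_fragment == 0)
        return 0;

    /* Round up: a partial final fragment still counts as a packet. */
    return (ip_payload + per_fragment - 1) / per_fragment;
}
```

Under those assumptions fragment_count(6000, 1500) is 5, so each
'ping -s 6000' request is five packets sent back to back, and the reply is
another five in the opposite direction.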

The recovery time appears to be on the order of 60 seconds, with a
partial recovery and then a relapse at about 30 seconds.

When I was thinking about this problem, I was imagining a deadlock
condition where rapid back-to-back packets (e.g. a fragmented ICMP packet
from ping or a fragmented UDP packet from NFS) were causing a hang until
part of the deadlock timed out and the packets started flowing again. I
couldn't see one packet causing buffer exhaustion unless it got itself
into a loop where it kept retrying to send the second fragment and didn't
free the buffer each time, but even then the buffer bug would be a side
effect.

The deadlock would have to be caused in the transmit path from xenU to
xen0, and whatever differs between sending a ping and responding to a
ping would account for one always causing a lockup and the other only
sometimes causing one.

Maybe we're seeing different manifestations of the same problem?


> -----Original Message-----
> From: xen-devel-admin@xxxxxxxxxxxxxxxxxxxxx [mailto:xen-devel-
> admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of Bin Ren
> Sent: Thursday, 16 September 2004 09:54
> To: xen-devel@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] network hang trigger
>
> I've just modified netfront.c (haven't touched netback.c yet). I've
> made two modifications: (1) free sk_buff properly on the transmit path;
> (2) in the netif_poll(...) function, packets **should not** be passed to
> netif_rx(); instead, use: int netif_receive_skb(struct sk_buff *skb).
> With these two modifications, under 'ping -s 6000', the network only
> occasionally loses a few packets but *very soon* recovers. It's much
> more stable than before. I'll take a closer look at netfront.c and
> tomorrow.
>
> Here is the patch. Please try it out. With your results, the changes
> get pushed into the repository.
