
[Xen-users] kernel oops/IRQ exception when networking between many domUs



Hi,

I am trying to build experimental networks with Xen and have run into
the same problem that Mark Doll described quite well in his posting
"xen_net: Failed to connect all virtual interfaces: err=-100"
here:

http://lists.xensource.com/archives/html/xen-users/2005-04/msg00447.html

As it was still present in 2.0.6, I tried 3.0-devel and found NR_PIRQS
and NR_DYNIRQS had been adjusted there - so I hoped for the best.  I was
then able to fire up my virtual test network and get it running with 20
nodes and approx. 120 interfaces, without problems at first. The vifs
are wired to ~60 bridge interfaces, 2 vifs each, and I can access all
domU nodes via the console etc.  The kernel is 2.6.11, as shipped with
xen-unstable as of May 31.
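
For readers who want to picture the wiring, here is a minimal sketch of how one such bridge with its two vifs could be set up by hand. It is not the attached autobrctl script; the bridge name and vif names are made-up examples, and with DRY_RUN=1 (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: one bridge carrying exactly two vifs, as in the setup above.
# The names passed in (e.g. vnbr0, vif1.0, vif2.0) are hypothetical.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# wire_pair BRIDGE VIF_A VIF_B
# Create the bridge, attach both vifs, and bring the bridge up.
wire_pair() {
    run brctl addbr "$1"
    run brctl addif "$1" "$2"
    run brctl addif "$1" "$3"
    run ip link set "$1" up
}
```

Calling `wire_pair vnbr0 vif1.0 vif2.0` once per link, about 60 times with different names, would give the roughly 120 attached interfaces mentioned above.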

The problem: after allowing free packet delivery within the network by
issuing

  sysctl -w net.bridge.bridge-nf-call-iptables=0

(which was until then set to 1, with my iptables rules blocking all
traffic), the whole machine froze after a very short time (anywhere from
immediately to 2-3 seconds), apparently when the first packet was
traveling through the network.  No output, no kernel oops, nothing to
see, and magic sysrq was gone as well(!).  This behaviour was
deterministic.  I had quite some difficulty getting more information -
what I finally did was to set the sysctl *before* starting the domUs.
Funnily, nothing happened after starting the first 10-12 nodes, but after
"xm create"ing one or two more nodes, the system oopsed, this time with
at least some output, though sysrq was gone again.  So I wrote it down
on a piece of paper ;-) , hopefully someone can make sense of it:


Stack: 
 00000000 d06cea20 2f001020 c8b04780 c0403f1c c028cbfa 0002f001 0000000d
 ffffffff 08b78020 00000052 00000001 00000028 0000005e 00008b85 d21fe000
 00000006 c0457824 0000011d c0453240 00283d58 e01c3a6e c0403cec da6bccd0

Call Trace:
 [<c0109c51>] show_stack+0x80/0x96
 [<c0100de1>] show_registers+0x15a/0x1d1
 [<c010a001>] die+0x106/0x1c4
 [<c010a4aa>] do_invalid_op+0xb5/0xbf
 [<c010985b>] error_code+0x2b/0x30
 [<c028cbfa>] net_rx_action+0x484/0x4df
 [<c01239a9>] tasklet_action+0x7b/0xe0
 [<c0123533>] __do_softirq+0x6f/0xef
 [<c0123632>] do_softirq+0x7f/0x97
 [<c0123706>] irq_exit+0x3a/0x3c
 [<c010d819>] do_IRQ+0x25/0x2c
 [<c0105efe>] evtchn_do_upcall+0x62/0x82
 [<c010988c>] hypervisor_callback+0x2c/0x34
 [<c0107673>] cpu_idle+0x33/0x41
 [<c04047a9>] start_kernel+0x196/0x1e8
 [<c010006c>] 0xc010006c

Code:  08 a8 75 30 83 c4 5b 5e 5f 5d c3 bb 01 00 00 00 31 f6 b8 0c 00 00
00 bf  f0 7f 00 00 8d 4d 08 89 da cd 82 83 e8 01 2e 74 8e <0f> 0b 66 00
2c 7a 35 c0 eb 84 e8 f8 b1 09 00 eb c9 e8 f6 98 e7

<0>Kernel panic - not syncing: Fatal exception in interrupt
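
The order of steps that finally produced the oops above (sysctl first, then the domUs one at a time) can be sketched as follows. This is an illustration under assumptions, not the attached xvn script: the config file names are hypothetical, and with DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: reproduce the ordering described in this mail.
# Config file names passed to start_nodes are hypothetical.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop passing bridged traffic through iptables *before* any domU exists.
disable_bridge_nf() {
    run sysctl -w net.bridge.bridge-nf-call-iptables=0
}

# Start the given domU configs one at a time, so that the last node
# started before a crash is known.
start_nodes() {
    for cfg in "$@"; do
        run xm create "$cfg"
    done
}
```

Running `disable_bridge_nf` followed by `start_nodes node01.cfg node02.cfg ...` mirrors what I did; here the crash came somewhere after the first 10-12 nodes.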



Any suggestions?


Regards,

Birger

PS.: I attach the scripts starting the virtual network for the
interested user.  Beware, they have no decent design but are mere hacks.
The root filesystem used is available here:

  http://www.iem.uni-due.de/~birger/downloads/root_fs

Attachment: autobrctl
Description: application/shellscript

Attachment: vnconfig.xml
Description: Text Data

Attachment: xvn
Description: application/shellscript

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users

 

