Re: [Xen-users] domU network has sleeping sickness
Steven Timm wrote:
> I've seen the same problem with my Xen 3.1.0 setup. What
> the Xen gurus are telling us is that this is a symptom of Xen dom0
> being busy and not servicing the network interrupts of the domUs
> promptly. Their advice to us was to shift an application that
> had been running on dom0 to another Xen instance to see if that
> would help. We are in the process of implementing that solution now.
>

There is nothing running on my dom0s. Their only purpose is managing the domUs. One of the problematic Xen hosts does have load on its three domUs, which serve continuous build systems. But another sleepy Xen host with five domUs is more or less in a pre-production state and idling.

> By the way my system (Dell PowerEdge 2950) has got Broadcom
> inbuilt network cards, not Intel E1000 so it is unlikely that
> it is a network driver specific issue.
>
> During these episodes of non-network connectivity, by the way,
> it was not unusual to see the following kernel dump in dom0
>

I don't find anything helpful or suspicious in any log, but maybe I'm missing it. In dom0 I'm looking at dmesg, messages, warn, xend-debug.log, xend.log and xen-hotplug.log, and in the domU at dmesg, messages and warn. After the boot process, more or less nothing important is logged.
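For anyone who wants to check their own logs for the soft-lockup signature quoted in this thread, here is a minimal Python sketch. It only looks for the two strings that appear in the quoted traces ("soft lockup detected" and "softlockup_tick"); other kernel versions may word the message differently, so treat the pattern as an assumption. The function name `find_soft_lockups` is made up for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: these two strings are taken from the traces quoted in this
# thread; other kernels may log soft lockups with different wording.
SOFT_LOCKUP = re.compile(r"soft lockup detected|softlockup_tick", re.IGNORECASE)


def find_soft_lockups(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (1-based line number, line without trailing newline) for each match."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SOFT_LOCKUP.search(line)
    ]
```

To use it, open a log file (for example the dom0 messages file, wherever your distribution keeps it) and pass the file object to `find_soft_lockups`. An empty result only means these two strings were not logged, not that dom0 was never stalled.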
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: Call Trace:
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: <IRQ> [<ffffffff80258269>] softlockup_tick+0xcc/0xde
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8020e84d>] timer_interrupt+0x3a3/0x401
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff80258898>] handle_IRQ_event+0x4b/0x93
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8025897e>] __do_IRQ+0x9e/0x100
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8020cc97>] do_IRQ+0x63/0x71
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8034b347>] evtchn_do_upcall+0xee/0x165
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8020abca>] do_hypervisor_callback+0x1e/0x2c
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: <EOI>
>
> or
>
> Feb 25 10:32:39 fermigrid6 kernel: BUG: soft lockup detected on CPU#0!
> Feb 25 10:32:39 fermigrid6 kernel:
> Feb 25 10:32:39 fermigrid6 kernel: Call Trace:
> Feb 25 10:32:39 fermigrid6 kernel: <IRQ> [<ffffffff80258269>] softlockup_tick+0xcc/0xde
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020e84d>] timer_interrupt+0x3a3/0x401
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff80258898>] handle_IRQ_event+0x4b/0x93
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8025897e>] __do_IRQ+0x9e/0x100
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020cc97>] do_IRQ+0x63/0x71
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8034b347>] evtchn_do_upcall+0xee/0x165
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020abca>] do_hypervisor_callback+0x1e/0x2c
> Feb 25 10:32:39 fermigrid6 kernel: <EOI> [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8034b258>] force_evtchn_callback+0xa/0xb
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff803f2272>] thread_return+0xdf/0x119
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff80228a25>] __cond_resched+0x1c/0x44
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff803f25df>] cond_resched+0x37/0x42
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff802343c4>] ksoftirqd+0x0/0xbf
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff80234432>] ksoftirqd+0x6e/0xbf
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff802422d7>] kthread+0xc8/0xf1
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020ae1c>] child_rip+0xa/0x12
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8024220f>] kthread+0x0/0xf1
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020ae12>] child_rip+0x0/0x12
>
> ----------------
>
> One of our dom0's was running an LVS server, the other one on
> identical hardware was not. We moved the LVS server from one to the
> other and the network problems and kernel panics followed it.
>
> Steve Timm
>
> On Mon, 3 Mar 2008, Marc Teichgraeber wrote:
>
>> Hi all,
>>
>> I have a strange network problem with some domUs on three Xen hosts.
>> They are losing their network connectivity. I do bridged networking.
>> * It happens randomly and could happen right after bootup of the domU
>>   or anytime later.
>> * The domU is not reachable from another host on the LAN.
>> * The domU is always reachable from the dom0 (ssh, ping).
>> * I can 'repair' the connection by attaching to the console and
>>   pinging out from the domU. At first nothing happens, then the machine
>>   gets its network back. (That is also my current workaround: pinging
>>   all the time from the console.)
>> * Pinging from another host at the same time helps too.
>> * It can happen that I can ping continuously from one host while another
>>   host gets only every 10th packet or so back.
>> * The interfaces can come back from their sleep by themselves.
>> * When the network has fallen asleep, ssh to the domU from another
>>   host hangs; it does not come back with "no route to host" or something.
>>
>> I'm suspicious about the network controllers, which are the same on all
>> hosts: "Intel Corporation 80003ES2LAN Gigabit Ethernet Controller
>> (Copper)" (lspci), some kind of "Intel® PRO/1000 EB Network Connection
>> with I/O Acceleration" (Intel website). I've tried the latest e1000
>> driver from Intel but it didn't help.
>> I've checked all MAC addresses, they are unique, and so are the IP addresses.
>>
>> Any ideas are welcome :)
>>
>> -------------------------------------------------------------------------
>>
>> "xm info" from host1, openSUSE 10.2 (X86-64):
>>
>> release                : 2.6.18.8-0.9-xen
>> version                : #1 SMP Sun Feb 10 22:48:05 UTC 2008
>> machine                : x86_64
>> nr_cpus                : 4
>> nr_nodes               : 1
>> sockets_per_node       : 2
>> cores_per_socket       : 2
>> threads_per_core       : 1
>> cpu_mhz                : 2327
>> hw_caps                : bfebfbff:20100800:00000000:00000140:0004e3bd:00000000:00000001
>> total_memory           : 32766
>> free_memory            : 21607
>> max_free_memory        : 21607
>> max_para_memory        : 21603
>> max_hvm_memory         : 21544
>> xen_major              : 3
>> xen_minor              : 0
>> xen_extra              : .3_11774-23
>> xen_caps               : xen-3.0-x86_64
>> xen_pagesize           : 4096
>> platform_params        : virt_start=0xffff800000000000
>> xen_changeset          : 11774
>> cc_compiler            : gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)
>> cc_compile_by          : abuild
>> cc_compile_domain      : suse.de
>> cc_compile_date        : Thu Jan 10 21:22:54 UTC 2008
>> xend_config_format     : 2
>> -------------------------------------------------------------------------
>>
>> "xm info" output on host2, openSUSE 10.3 (X86-64)
>>
>> release                : 2.6.22.13-0.3-xen
>> version                : #1 SMP 2007/11/19 15:02:58 UTC
>> machine                : x86_64
>> nr_cpus                : 8
>> nr_nodes               : 1
>> sockets_per_node       : 2
>> cores_per_socket       : 4
>> threads_per_core       : 1
>> cpu_mhz                : 3000
>> hw_caps                : bfebfbff:20100800:00000000:00000140:0004e3bd:00000000:00000001
>> total_memory           : 16382
>> free_memory            : 591
>> max_free_memory        : 591
>> max_para_memory        : 587
>> max_hvm_memory         : 577
>> xen_major              : 3
>> xen_minor              : 1
>> xen_extra              : .0_15042-51
>> xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p
>> xen_scheduler          : credit
>> xen_pagesize           : 4096
>> platform_params        : virt_start=0xffff800000000000
>> xen_changeset          : 15042
>> cc_compiler            : gcc version 4.2.1 (SUSE Linux)
>> cc_compile_by          : abuild
>> cc_compile_domain      : suse.de
>> cc_compile_date        : Tue Sep 25 21:16:06 UTC 2007
>> xend_config_format     : 4
>>
>>
>

--
--------------------------------
Marc Teichgraeber
System Administrator
System Administration

neofonie GmbH
Robert-Koch-Platz 4
10115 Berlin

phone: +49.30 24627 185
fax: +49.30 24627 120

marc.teichgraeber@xxxxxxxxxxx
http://www.neofonie.de

Commercial register Berlin-Charlottenburg: HRB 67460

Management
Helmut Hoffer von Ankershoffen
Nurhan Yildirim
--------------------------------

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
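
The symptoms in the original post (a domU that stops answering from the LAN and later wakes up again) are easier to correlate with dom0 logs if each change of state is timestamped. Here is a rough Python sketch of such a watcher. It assumes a Linux `ping` from iputils, where `-c` sets the packet count and `-W` the reply timeout in seconds. The function names and the host name in the usage note are hypothetical and not taken from the thread.

```python
import subprocess
import time
from typing import Callable, List, Optional, Tuple


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo via the system ping (assumes iputils flags -c and -W)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(
    host: str,
    probes: int,
    interval_s: float = 1.0,
    probe: Callable[[str], bool] = ping_once,
    clock: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[float, bool]]:
    """Probe `host` `probes` times and record (timestamp, reachable) on each state change."""
    transitions: List[Tuple[float, bool]] = []
    last: Optional[bool] = None
    for i in range(probes):
        up = probe(host)
        if up != last:
            # Only state changes are stored, so the result stays short
            # even over a long observation window.
            transitions.append((clock(), up))
            last = up
        if i < probes - 1:
            sleep(interval_s)
    return transitions
```

Run from another LAN host, for example `watch("domu1.example", probes=3600)`, this gives a list of times at which the domU went away or came back, which can then be compared with the dom0 log timestamps. Run from inside the domU against an outside address, the same loop also generates the steady outbound traffic that the ping workaround above relies on, though that treats the symptom rather than the cause.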