Re: [Xen-users] Nasty kernel panic
A couple of people have pointed at the e1000 driver as a possible culprit and given good reasons why that should be the case. My only question is: why did I also get the same kernel panic on the new PowerEdge 2950, which doesn't have Intel e1000 but Broadcom drivers and NICs?

By the way, all the systems in question have now been up for 18 hours and are functioning fine. Once we got the rsyncing done first, and then the squid servers all re-initialized correctly, we have been OK since then.

I am away from the office, but I will follow up on the thread and post the kernel config later.

Steve Timm

------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.

On Fri, 29 Aug 2008, Tim Post wrote:

> Hi Steve,
>
> On Thu, 2008-08-28 at 16:52 -0500, Steven Timm wrote:
> > I have seen the following kernel panic 5 times today on three
> > different machines, two of which had been stable for months and
> > one of which is a brand new install.
> >
> > [snip]
> >
> > <Aug/28 12:21 pm> [<ffffffff88107a79>] :e1000:e1000_clean_rx_irq+0x430/0x4d5
> > <Aug/28 12:21 pm> [<ffffffff881074ec>] :e1000:e1000_clean+0x82/0x160
> > <Aug/28 12:21 pm> [<ffffffff80395f51>] net_rx_action+0xe7/0x254
> > <Aug/28 12:21 pm> [<ffffffff80233d97>] __do_softirq+0x7b/0x10d
> > <Aug/28 12:21 pm> [<ffffffff8020b094>] call_softirq+0x1c/0x28
> > <Aug/28 12:21 pm> [<ffffffff8020cdfd>] do_softirq+0x62/0xd9
> > <Aug/28 12:21 pm> [<ffffffff8020cc9c>] do_IRQ+0x68/0x71
> > <Aug/28 12:21 pm> [<ffffffff8034b347>] evtchn_do_upcall+0xee/0x165
> > <Aug/28 12:21 pm> [<ffffffff8020abca>] do_hypervisor_callback+0x1e/0x2c
> > <Aug/28 12:21 pm> <EOI>
> > <Aug/28 12:21 pm>Code: 41 8b 85 f4 00 00 00 4d 85 ed 4d 89 ec 89 44 24 0c 0f 84 36
> > <Aug/28 12:21 pm>RIP [<ffffffff88256375>] :ipv6:rt6_select+0x38/0x1f4
> > <Aug/28 12:21 pm> RSP <ffffffff80526b00>
> > <Aug/28 12:21 pm>CR2: 00000000000000f4
> > <Aug/28 12:21 pm> <0>Kernel panic - not syncing: Aiee, killing interrupt handler
>
> It looks like e1000 might be getting spit out. From what I gather in
> your message, the only thing that changed was that you are now putting
> a much higher I/O demand on the drives (rsyncing everything), which by
> extension increases the demand on the NIC.
>
> If the e1000 NIC is the one enslaved to the bridge, it could be cleanup
> that's making it freak out when a guest stops. If it's ejected
> uncleanly, the PID next in line with pending I/O for the device will
> likely be identified as the culprit. I had a very similar problem with
> a buggy Areca driver on dom-0 a couple of years ago.
>
> Can you post a link to your kernel's .config, or perhaps try the latest
> stable version of that module from:
>
> http://sourceforge.net/project/showfiles.php?group_id=42302
>
> As for ipv6, if it's being set up you'll see it in /etc/sysconfig or
> /etc/network (depending on the distro) pretty clearly. However, that
> shouldn't make a difference; it should work either way.
>
> Hope this helps :)
>
> Cheers!
> --Tim
>
> --
> Monkey + Typewriter = Echoreply ( http://echoreply.us )

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
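
For anyone comparing panics across several machines, a rough way to see which modules show up in each trace is to pull the module and symbol names out of the log lines. The snippet below is only a sketch: the regular expression is an assumption based on the line layout of the trace quoted in this thread, and `parse_trace` and `TRACE` are made-up names for illustration, not part of any kernel or Xen tool. It separates the call-stack frames from the `RIP` line, so you can see where the fault was reported versus which drivers were merely on the stack.

```python
import re

# Assumed frame layout, taken from the quoted trace:
#   [<address>] :module:symbol+0xoff/0xsize   (module part is optional)
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?::(\w+):)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)
# The faulting instruction pointer is logged on a line containing "RIP [<".
RIP_LINE = re.compile(r"\bRIP\s+\[<")

# A few lines copied from the trace in this thread.
TRACE = """\
<Aug/28 12:21 pm> [<ffffffff88107a79>] :e1000:e1000_clean_rx_irq+0x430/0x4d5
<Aug/28 12:21 pm> [<ffffffff881074ec>] :e1000:e1000_clean+0x82/0x160
<Aug/28 12:21 pm> [<ffffffff80395f51>] net_rx_action+0xe7/0x254
<Aug/28 12:21 pm> [<ffffffff80233d97>] __do_softirq+0x7b/0x10d
<Aug/28 12:21 pm>RIP [<ffffffff88256375>] :ipv6:rt6_select+0x38/0x1f4
"""


def parse_trace(text):
    """Return (stack, fault): stack is a list of (module, symbol) frames,
    fault is the (module, symbol) from the RIP line, or None if absent.
    Frames without a module prefix are labelled "kernel"."""
    stack, fault = [], None
    for line in text.splitlines():
        match = FRAME.search(line)
        if not match:
            continue
        entry = (match.group(1) or "kernel", match.group(2))
        if RIP_LINE.search(line):
            fault = entry
        else:
            stack.append(entry)
    return stack, fault


if __name__ == "__main__":
    stack, fault = parse_trace(TRACE)
    print("fault reported in:", fault)
    print("modules on the stack:", sorted({module for module, _ in stack}))
```

Running the same thing over the trace from the Broadcom box would show whether the `RIP` line points at the same symbol there, even though the NIC driver frames on the stack differ.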