
Re: [Xen-users] Nasty kernel panic


  • To: "Steven Timm" <timm@xxxxxxxx>
  • From: Asim <linkasim@xxxxxxxxx>
  • Date: Thu, 28 Aug 2008 19:35:01 -0500
  • Cc: xen-users@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 28 Aug 2008 17:35:39 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

Hi,

I'm also using e1000 on my domUs. As part of my project I have been
tracing the e1000 driver's internal function call sequences, and I do
not see any ipv6-related calls, because ipv6 is disabled on my systems.
I would encourage you to double-check whether ipv6 is actually disabled
on yours. I don't remember the exact steps, but disabling it completely
was a little difficult.
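
One rough way to double-check, offered as a sketch rather than a
definitive test: on Linux, /proc/modules lists the loaded modules and
/proc/net/if_inet6 lists any IPv6 addresses configured on interfaces.
The oops quoted below shows ipv6 under "Modules linked in", so the
module was loaded on that dom0 when it crashed. The helper names
(loaded_modules, ipv6_status, read_or_empty) are made up for
illustration, and the check assumes ipv6 is built as a module; a kernel
with IPv6 compiled in would not show it in /proc/modules.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Each non-empty line starts with the module name as its first field.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def ipv6_status(proc_modules_text, if_inet6_text):
    """Summarise whether IPv6 looks active (hypothetical helper).

    module_loaded:        'ipv6' appears among the loaded modules.
    addresses_configured: /proc/net/if_inet6 has at least one entry.
    """
    return {
        "module_loaded": "ipv6" in loaded_modules(proc_modules_text),
        "addresses_configured": bool(if_inet6_text.strip()),
    }


def read_or_empty(path):
    """Read a file, returning '' if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    status = ipv6_status(
        read_or_empty("/proc/modules"),
        read_or_empty("/proc/net/if_inet6"),
    )
    print(status)
```

If module_loaded comes back True on the dom0, something is still
loading ipv6 there, whatever the configuration files say.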

Regards,
Asim
On 8/28/08, Steven Timm <timm@xxxxxxxx> wrote:
>
> I have seen the following kernel panic 5 times today on
> three different machines, two of which had been stable
> for months and one of which is a brand new install.
>
> We are running the x86_64 Xen kernel and userland tools that came in the
> Xen 3.1.0 tarball from xen.org, on top of Scientific Linux (a Red Hat
> clone) 5.1 or 5.2.
>
>
> <Aug/28 12:21 pm>Unable to handle kernel NULL pointer dereference at 00000000000000f4 RIP:
> <Aug/28 12:21 pm> [<ffffffff88256375>] :ipv6:rt6_select+0x38/0x1f4
> <Aug/28 12:21 pm>PGD 70010067 PUD 715bf067 PMD 0
> <Aug/28 12:21 pm>Oops: 0000 [1] SMP
> <Aug/28 12:21 pm>CPU 0
> <Aug/28 12:21 pm>Modules linked in: dell_rbu firmware_class ipmi_devintf ipmi_si ipmi_msghandler mptctl mptbase nls_utf8 nfs lockd nfs_acl xt_physdev iptable_filter ip_tables x_tables bridge ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc binfmt_misc dm_multipath video thermal sbs processor i2c_ec i2c_core fan container button battery asus_acpi ac parport_pc lp parport floppy ide_cd cdrom ide_floppy intel_rng joydev tsdev usbkbd usbmouse piix e752x_edac edac_mc sg e1000 usbhid pcspkr serio_raw siimage dm_snapshot dm_zero dm_mirror dm_mod ide_disk ata_piix libata megaraid_mbox sd_mod scsi_mod megaraid_mm ext3 jbd ehci_hcd ohci_hcd uhci_hcd usbcore
> <Aug/28 12:21 pm>Pid: 3075, comm: avahi-daemon Tainted: GF     2.6.18-xen #1
> <Aug/28 12:21 pm>RIP: e030:[<ffffffff88256375>]  [<ffffffff88256375>] :ipv6:rt6_select+0x38/0x1f4
> <Aug/28 12:21 pm>RSP: e02b:ffffffff80526b00  EFLAGS: 00010286
> <Aug/28 12:21 pm>RAX: ffff88006cbd6000 RBX: ffffffff88283580 RCX: 000000000000000d
> <Aug/28 12:21 pm>RDX: 0000000000000001 RSI: 000000000000000d RDI: ffff880070a3d4e0
> <Aug/28 12:21 pm>RBP: ffff880070a3d4c0 R08: ffffffff8824f148 R09: ffffffff80526b60
> <Aug/28 12:21 pm>R10: ffffffff88293906 R11: ffff880061730180 R12: ffff880053e99780
> <Aug/28 12:21 pm>R13: 0000000000000000 R14: 0000000000000003 R15: 0000000000000001
> <Aug/28 12:21 pm>FS:  00002b34a5da6370(0000) GS:ffffffff804d3000(0000) knlGS:0000000000000000
> <Aug/28 12:21 pm>CS:  e033 DS: 0000 ES: 0000
> <Aug/28 12:21 pm>Process avahi-daemon (pid: 3075, threadinfo ffff88006fc8a000, task ffff880000b0c860)
> <Aug/28 12:21 pm>Stack:  0000000080526bb8 00000000ffffffff 0000000000000000 0000000000000000
> <Aug/28 12:21 pm> 0000000d00000001 ffff880070a3d4e0 ffffffff8824f148 ffffffff88283580
> <Aug/28 12:21 pm> ffff880070a3d4c0 ffff880053e99780 0000000000000000 0000000000000003
> <Aug/28 12:21 pm>Call Trace:
> <Aug/28 12:21 pm> <IRQ> [<ffffffff8824f148>] :ipv6:ip6_rcv_finish+0x0/0x28
> <Aug/28 12:21 pm> [<ffffffff882568e7>] :ipv6:ip6_route_input+0x70/0x1cf
> <Aug/28 12:21 pm> [<ffffffff8824f3c5>] :ipv6:ipv6_rcv+0x255/0x2ba
> <Aug/28 12:21 pm> [<ffffffff80395cbc>] netif_receive_skb+0x2d3/0x2f3
> <Aug/28 12:21 pm> [<ffffffff8828f9b4>] :bridge:br_pass_frame_up+0x64/0x66
> <Aug/28 12:21 pm> [<ffffffff8828fa7a>] :bridge:br_handle_frame_finish+0xc4/0xf6
> <Aug/28 12:21 pm> [<ffffffff88292e57>] :bridge:br_nf_pre_routing_finish_ipv6+0xdf/0xe3
> <Aug/28 12:21 pm> [<ffffffff882935e6>] :bridge:br_nf_pre_routing+0x39b/0x667
> <Aug/28 12:21 pm> [<ffffffff803ad73c>] nf_iterate+0x52/0x79
> <Aug/28 12:21 pm> [<ffffffff8828f9b6>] :bridge:br_handle_frame_finish+0x0/0xf6
> <Aug/28 12:21 pm> [<ffffffff803ad7d6>] nf_hook_slow+0x73/0xea
> <Aug/28 12:21 pm> [<ffffffff8828f9b6>] :bridge:br_handle_frame_finish+0x0/0xf6
> <Aug/28 12:21 pm> [<ffffffff8828fc43>] :bridge:br_handle_frame+0x167/0x190
> <Aug/28 12:21 pm> [<ffffffff80395c14>] netif_receive_skb+0x22b/0x2f3
> <Aug/28 12:21 pm> [<ffffffff88107a79>] :e1000:e1000_clean_rx_irq+0x430/0x4d5
> <Aug/28 12:21 pm> [<ffffffff881074ec>] :e1000:e1000_clean+0x82/0x160
> <Aug/28 12:21 pm> [<ffffffff80395f51>] net_rx_action+0xe7/0x254
> <Aug/28 12:21 pm> [<ffffffff80233d97>] __do_softirq+0x7b/0x10d
> <Aug/28 12:21 pm> [<ffffffff8020b094>] call_softirq+0x1c/0x28
> <Aug/28 12:21 pm> [<ffffffff8020cdfd>] do_softirq+0x62/0xd9
> <Aug/28 12:21 pm> [<ffffffff8020cc9c>] do_IRQ+0x68/0x71
> <Aug/28 12:21 pm> [<ffffffff8034b347>] evtchn_do_upcall+0xee/0x165
> <Aug/28 12:21 pm> [<ffffffff8020abca>] do_hypervisor_callback+0x1e/0x2c
> <Aug/28 12:21 pm> <EOI>
>
> <Aug/28 12:21 pm>Code: 41 8b 85 f4 00 00 00 4d 85 ed 4d 89 ec 89 44 24 0c 0f 84 36
> <Aug/28 12:21 pm>RIP  [<ffffffff88256375>] :ipv6:rt6_select+0x38/0x1f4
> <Aug/28 12:21 pm> RSP <ffffffff80526b00>
> <Aug/28 12:21 pm>CR2: 00000000000000f4
> <Aug/28 12:21 pm> <0>Kernel panic - not syncing: Aiee, killing interrupt handler
>
>
> ------------------------------------------------
>
> Different processes and PIDs show up as the triggering process,
> but the underlying error is the same.  A couple of times it was
> triggered by the swapper.
>
> What is puzzling is the references to ipv6, which I was pretty sure I
> had disabled everywhere.  To be clear, these crashes
> are from the dom0; when it happens the dom0 hangs and does
> not auto-reboot, so it requires a reset.
>
> Any ideas?  This config has been pretty stable for us on 7
> different machines including these ones.  A couple of times it happened
> just about the time we were shutting down a xen domU, a couple
> other times today it happened on a machine that I wasn't even working on.
>
> Steve Timm
>
>
>
> --
> ------------------------------------------------------------------
> Steven C. Timm, Ph.D  (630) 840-8525
> timm@xxxxxxxx  http://home.fnal.gov/~timm/
> Fermilab Computing Division, Scientific Computing Facilities,
> Grid Facilities Department, FermiGrid Services Group, Assistant Group
> Leader.
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
>



 

