
[Xen-users] segfaulting tapdisk2 process leads to kernel oops



Hi there,

I just found a segfaulting tapdisk2 process which led to a kernel oops.

[1527071.169682] tapdisk2[26548]: segfault at 7fffd324cfe8 ip 000000000040837f 
sp 00007fffd324cff0 error 6 in tapdisk2[400000+38000]
[1527071.220104] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000048
[1527071.220170] IP: [<ffffffff810ce73c>] apply_to_page_range+0x47/0x2f3
[1527071.220210] PGD 1e9a9067 PUD 1e9a8067 PMD 0
[1527071.220250] Oops: 0000 [#1] SMP
[1527071.220282] last sysfs file: /sys/devices/virtual/blktap2/blktap0/remove
[1527071.220315] CPU 0
[1527071.220340] Modules linked in: xt_state xt_physdev tun ip6table_filter 
ip6_tables nfs
lockd fscache nfs_acl auth_rpcgss sunrpc bridge stp ipt_REJECT xt_tcpudp 
iptable_nat
nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables 
x_tables
blktap xen_blkfront xenfs xen_evtchn quota_v2 quota_tree psmouse evdev 
serio_raw snd_pcsp
i2c_i801 rng_core snd_pcm snd_timer i2c_core iTCO_wdt iTCO_vendor_support snd 
soundcore
snd_page_alloc i5000_edac edac_core i5k_amb button processor acpi_processor 
shpchp
pci_hotplug ext3 jbd sd_mod crc_t10dif uhci_hcd ehci_hcd mptsas usbcore 
nls_base mptscsih
mptbase scsi_transport_sas tg3 libphy thermal fan thermal_sys dm_snapshot 
dm_mirror
dm_region_hash dm_log dm_mod
[1527071.220854] Pid: 26548, comm: tapdisk2 Not tainted 2.6.32-ucs48-xen-amd64 
#1 PRIMERGY
BX620 S4
[1527071.220904] RIP: e030:[<ffffffff810ce73c>]  [<ffffffff810ce73c>]
apply_to_page_range+0x47/0x2f3
[1527071.220958] RSP: e02b:ffff880000d11b58  EFLAGS: 00010202
[1527071.224008] RAX: 0000000000000880 RBX: ffff88001e12c000 RCX: 
ffff88001e12d000
[1527071.224008] RDX: 0000000000000000 RSI: ffff88001e12c000 RDI: 
0000000000000000
[1527071.224008] RBP: ffff88001f31ad30 R08: 0000000000000000 R09: 
ffff88001f907840
[1527071.224008] R10: ffffffff81324b0a R11: 0000000000000000 R12: 
0000000000000000
[1527071.224008] R13: ffff88001f31ad30 R14: ffff88001d82c800 R15: 
0000000000000000
[1527071.224008] FS:  00007fa53151a730(0000) GS:ffff8800035cb000(0000)
knlGS:0000000000000000
[1527071.224008] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[1527071.224008] CR2: 0000000000000048 CR3: 000000001e9ae000 CR4: 
0000000000002660
[1527071.224008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[1527071.224008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
[1527071.224008] Process tapdisk2 (pid: 26548, threadinfo ffff880000d10000, 
task
ffff88001dd41c40)
[1527071.224008] Stack:
[1527071.224008]  0000000000000040 ffff88001b189180 0000000000000000 
0000000000000000
[1527071.224008] <0> ffffffffa02a7ee8 0000000000000000 ffffffff8100ece2 
ffff88001eb0f240
[1527071.224008] <0> ffff88001e12d000 0000000000000000 0000000000000000 
ffff88001eb0f240
[1527071.224008] Call Trace:
[1527071.224008]  [<ffffffffa02a7ee8>] ? blktap_umap_uaddr_fn+0x0/0x59 
[blktap]
[1527071.224008]  [<ffffffff8100ece2>] ? check_events+0x12/0x20
[1527071.224008]  [<ffffffffa02a92a5>] ? blktap_device_end_request+0xbd/0x145 
[blktap]
[1527071.224008]  [<ffffffffa02a7743>] ? blktap_ring_vm_close+0x60/0xd1 
[blktap]
[1527071.224008]  [<ffffffff810d1394>] ? remove_vma+0x2c/0x72
[1527071.224008]  [<ffffffff810d1503>] ? exit_mmap+0x129/0x148
[1527071.224008]  [<ffffffff8104cc75>] ? mmput+0x3c/0xdf
[1527071.224008]  [<ffffffff8105087a>] ? exit_mm+0x102/0x10d
[1527071.224008]  [<ffffffff8132448a>] ? _spin_lock_irq+0x7/0x22
[1527071.224008]  [<ffffffff810522a3>] ? do_exit+0x1f8/0x6c6
[1527071.224008]  [<ffffffff8105d5bb>] ? __dequeue_signal+0xfb/0x124
[1527071.224008]  [<ffffffff8100eccf>] ? xen_restore_fl_direct_end+0x0/0x1
[1527071.224008]  [<ffffffff810e7ebd>] ? kmem_cache_free+0x72/0xa3
[1527071.224008]  [<ffffffff810527e7>] ? do_group_exit+0x76/0x9d
[1527071.224008]  [<ffffffff8105f0c5>] ? get_signal_to_deliver+0x318/0x343
[1527071.224008]  [<ffffffff8101104f>] ? do_notify_resume+0x87/0x73f
[1527071.224008]  [<ffffffff810d157d>] ? expand_downwards+0x5b/0x15b
[1527071.224008]  [<ffffffff8132694b>] ? do_page_fault+0x1f5/0x2fc
[1527071.224008]  [<ffffffff810125dc>] ? retint_signal+0x48/0x8c
[1527071.224008] Code: 48 89 4c 24 20 4c 89 44 24 18 48 89 54 24 40 72 04 0f 
0b eb fe 48
8b 54 24 28 48 89 f0 48 8b 4c 24 40 48 c1 e8 24 25 f8 0f 00 00 <48> 8b 52 48 
48 ff c9 48
89 0c 24 48 01 d0 48 89 44 24 30 48 b8
[1527071.224008] RIP  [<ffffffff810ce73c>] apply_to_page_range+0x47/0x2f3
[1527071.224008]  RSP <ffff880000d11b58>
[1527071.224008] CR2: 0000000000000048
[1527071.224008] ---[ end trace 7b79961eab7bea21 ]---
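
As a reading aid for the segfault line at the top of the log, here is a
minimal sketch that decodes the "error 6" value using the standard x86
page-fault error code bits. The function name is hypothetical and only for
illustration; the addresses are copied from the log above.

```python
# Decode the x86 page-fault error code printed in a
# "segfault at ... error N" kernel message.
def decode_page_fault_error(code):
    return {
        "protection_violation": bool(code & 0x01),  # False: page not present
        "write": bool(code & 0x02),                 # False: read access
        "user_mode": bool(code & 0x04),             # False: kernel mode
        "reserved_bit": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
    }

# Values taken from the segfault line in the log above.
fault_addr = 0x7fffd324cfe8
stack_ptr = 0x00007fffd324cff0

info = decode_page_fault_error(6)
print(info)
print("fault address is", stack_ptr - fault_addr, "bytes below sp")
```

Read this way, error 6 is a user-mode write to a page that was not present,
and the faulting address lies just below the stack pointer. Together with
expand_downwards in the call trace this might hint at a failed stack growth
in tapdisk2, but that is only a guess.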

Kernel: current 2.6.32-ucs48-xen-amd64 from Debian-based UCS 2.4-4

Unfortunately, I don't have any further information about that process.
I neither know which image the tapdisk process was handling, nor can I find
any correlation with DomUs shutting down or starting.

In kern.log I can find several lines with I/O errors before the segfault:
[1526966.852693] end_request: I/O error, dev tda, sector 62916088
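
To check whether such I/O errors regularly precede a tapdisk2 segfault, the
log could be scanned with something like the following. This is a minimal
sketch, assuming the kernel timestamp format shown above; the function name
and the 600-second window are arbitrary choices for illustration.

```python
import re

# Matches the "[seconds.microseconds] message" part of a kernel log line,
# with or without a leading syslog prefix.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)$")

def io_errors_before_segfault(lines, process="tapdisk2", window=600.0):
    """Return (segfault_time, [io_error_entries]) for each segfault found.

    Only I/O errors logged at most `window` seconds before the segfault
    are attached to it.
    """
    io_errors = []
    hits = []
    for line in lines:
        m = LINE_RE.search(line.strip())
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        if "end_request: I/O error" in msg:
            io_errors.append((ts, msg))
        elif msg.startswith(process + "[") and "segfault" in msg:
            recent = [e for e in io_errors if ts - e[0] <= window]
            hits.append((ts, recent))
    return hits

# Two lines copied from the log excerpts in this mail.
sample = [
    "[1526966.852693] end_request: I/O error, dev tda, sector 62916088",
    "[1527071.169682] tapdisk2[26548]: segfault at 7fffd324cfe8 "
    "ip 000000000040837f",
]
for ts, errors in io_errors_before_segfault(sample):
    print(ts, len(errors), "I/O error(s) in the window before the segfault")
```

For a real log file, the lines could be passed in with
io_errors_before_segfault(open("/var/log/kern.log")); the path is an
assumption and depends on the syslog configuration. In the excerpts above,
the I/O error was logged roughly 104 seconds before the segfault.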

I can reproduce such messages by removing an image while the DomU is running,
but so far I have not been able to trigger the segfault or the kernel oops
again. I also tried shutting the DomU down and migrating it (with the image
missing).

One more detail that may be important: in that environment, the images are
located on a NetApp storage system accessed over NFS.

Has anybody else seen this oops, or does anyone have further information for me?

Kind regards,
Tim

-- 
Tim Petersen
Support Engineer

Univention GmbH
Linux for your business
Mary-Somerville-Str.1
28359 Bremen
Tel. : +49 421 22232-0
Fax : +49 421 22232-99

petersen@xxxxxxxxxxxxx
http://www.univention.de

Geschäftsführer: Peter H. Ganten
HRB 20755 Amtsgericht Bremen
Steuer-Nr.: 71-597-02876
