
Re: [Xen-devel] Tapdisk failures / kernel general protection fault at xen 4.0.2rc3 / kernel pvops 2.6.32.36



On Thu, 2011-04-14 at 09:15 -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Apr 13, 2011 at 06:02:13PM -0300, Gerd Jakobovitsch wrote:
> > I'm trying to run several VMs (Linux HVM, with tapdisk:aio disks on
> > NFS-backed storage) on a CentOS system, using the up-to-date version
> > of xen 4.0 / kernel pvops 2.6.32.x stable. With a configuration
> > that has most debug options disabled, I can start several instances -
> > I'm running 7 of them - but shortly afterwards the system stops
> > responding. I can't find any information on this.
> 
> This is the first time I've seen this.
> > 
> > Activating several debug configuration items, among them
> > DEBUG_PAGEALLOC, I get an exception as soon as I try to start up a
> > VM. The system reboots.
> 
> Oooh, and is the log below from that situation?
> 
> Daniel, any thoughts?

---
          Unmap pages from the kernel linear mapping after free_pages().
          This results in a large slowdown, but helps to find certain types
          of memory corruption.

Stunning. Our I/O page allocator is a sort of twisted mempool. Unless
the allocation is explicitly modified in sysfs/, everything should stay
pinned. We might just be tripping over the debug code itself, but I
haven't figured it out yet.

Daniel
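
A note on reading the register dump in the quoted log below: RAX holds
6b6b6b6b6b6b6b6b, which matches the byte the kernel's slab debugging
writes into freed objects (POISON_FREE, 0x6b, in include/linux/poison.h).
The faulting instruction bytes marked in the Code: line (48 8b b8 90 04
00 00) decode to a load from 0x490(%rax), and that address is
non-canonical on x86-64, which would explain a general protection fault
rather than a page fault. This is only an inference from the dump, not
something confirmed in this thread. The sketch below is a hypothetical
helper (the function names and the SAMPLE string are illustrative, not
part of any Xen or kernel tool) that flags poison patterns in an oops
register dump, assuming the log lines have been unwrapped and the quote
prefixes removed.

```python
import re

# Slab poison bytes as defined in include/linux/poison.h.
POISON_BYTES = {0x6B: "POISON_FREE", 0x5A: "POISON_INUSE", 0xA5: "POISON_END"}

# Matches general-purpose registers such as "RAX: 6b6b6b6b6b6b6b6b".
# RIP/RSP lines carry a segment prefix (e.g. "e030:") and are skipped.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b")


def parse_registers(oops_text):
    """Return {register name: integer value} for an unwrapped oops dump."""
    return {name: int(value, 16) for name, value in REG_RE.findall(oops_text)}


def poison_name(value):
    """Name the poison pattern if all 8 bytes are one known poison byte."""
    raw = value.to_bytes(8, "little")
    if len(set(raw)) == 1:
        return POISON_BYTES.get(raw[0])
    return None


def is_canonical(addr):
    """True if bits 63..47 are all equal (48-bit x86-64 addressing)."""
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> 47
    return top == 0 or top == 0x1FFFF


# Register lines copied from the log below, with line wrapping undone.
SAMPLE = """
RAX: 6b6b6b6b6b6b6b6b RBX: ffff88006a6fc000 RCX: ffff88006a7f7c38
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88006a5c3500
"""

# Displacement taken from the faulting instruction bytes in the Code: line.
FAULT_OFFSET = 0x490

if __name__ == "__main__":
    for name, value in sorted(parse_registers(SAMPLE).items()):
        label = poison_name(value)
        if label:
            target = value + FAULT_OFFSET
            print(f"{name} = {value:#018x} looks like {label}; "
                  f"{target:#018x} canonical: {is_canonical(target)}")
```

If RAX really is slab poison, the request (or something reachable from
it) was probably freed before blktap_device_end_request finished using
it, which would be independent of DEBUG_PAGEALLOC itself.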

> > 
> > Below is the log from /var/log/messages:
> > 
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: Created
> > /dev/xen/blktap-2/control device
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: Created
> > /dev/xen/blktap-2/blktap0 device
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: Created
> > /dev/xen/blktap-2/tapdev0 device
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: new interface: ring:
> > 251, device: 253, minor: 0
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: I/O queue driver: lio
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: block-aio 
> > open('/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/hda')
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: 
> > open(/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/hda)
> > with O_DIRECT
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: Image size:       pre
> > sector_shift  [134217728]   post sector_shift [262144]
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: opened image
> > /storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/hda (1
> > users, state: 0x00000001, type: 0)
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: VBD CHAIN:
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]:
> > /storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/hda: 0
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.158549] block tda:
> > sector-size: 512 capacity: 262144
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.200514] general
> > protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.200703] last sysfs
> > file: /sys/block/tda/removable
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.200761] CPU 0
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.200847] Modules linked
> > in: bridge stp bonding bnx2i libiscsi scsi_transport_iscsi cnic uio
> > bnx2 megaraid_sas
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201363] Pid: 4988,
> > comm: tapdisk2 Not tainted 2.6.32.36 #3 PowerEdge M610
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201424] RIP:
> > e030:[<ffffffff812b9c24>]  [<ffffffff812b9c24>]
> > blktap_device_end_request+0x49/0x5e
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201543] RSP:
> > e02b:ffff88006a7f7cd8  EFLAGS: 00010046
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201600] RAX:
> > 6b6b6b6b6b6b6b6b RBX: ffff88006a6fc000 RCX: ffff88006a7f7c38
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201662] RDX:
> > 0000000000000000 RSI: 0000000000000000 RDI: ffff88006a5c3500
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201723] RBP:
> > ffff88006a7f7cf8 R08: ffffffff818383c0 R09: ffff88006a7f7c38
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201784] R10:
> > 0000000000000000 R11: ffff88007b697b18 R12: ffff88007b697b18
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201845] R13:
> > ffff88006a5c3360 R14: 0000000000000000 R15: ffff88006a5c3370
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201910] FS:
> > 00007f50a9445730(0000) GS:ffff8800280c7000(0000)
> > knlGS:0000000000000000
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201974] CS:  e033 DS:
> > 0000 ES: 0000 CR0: 000000008005003b
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.202032] CR2:
> > 00007fb35d12e6e8 CR3: 000000006a4ce000 CR4: 0000000000002660
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.202093] DR0:
> > 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.202154] DR3:
> > 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.202436] Process
> > tapdisk2 (pid: 4988, threadinfo ffff88006a7f6000, task
> > ffff88006b5a0000)
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.202941] Stack:
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.203206]
> > ffff88006b5a0000 0000000000000000 0000000000000000 0000000000000000
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.203609] <0>
> > ffff88006a7f7e88 ffffffff812b9416 ffff88006a6c80f8 0000000100000000
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.204310] <0>
> > 00000000ffffffff ffff88006a5c3360 000000017edd7ab0 0000000000000000
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.205284] Call Trace:
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.205553]
> > [<ffffffff812b9416>] blktap_ring_ioctl+0x183/0x2d8
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.205838]
> > [<ffffffff81209a64>] ? inode_has_perm+0xa1/0xb3
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.206120]
> > [<ffffffff8157641f>] ? _spin_unlock+0x26/0x2a
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.206400]
> > [<ffffffff81126ff9>] ? aio_read_evt+0x56/0xe0
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.206678]
> > [<ffffffff81127071>] ? aio_read_evt+0xce/0xe0
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.206957]
> > [<ffffffff8124f5c1>] ? _raw_spin_lock+0x77/0x12d
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.207236]
> > [<ffffffff81209bf8>] ? file_has_perm+0xb4/0xc6
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.207516]
> > [<ffffffff8110464e>] vfs_ioctl+0x5e/0x77
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.207793]
> > [<ffffffff81104b63>] do_vfs_ioctl+0x484/0x4d5
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.208069]
> > [<ffffffff81104c0b>] sys_ioctl+0x57/0x7a
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.208346]
> > [<ffffffff81012cc2>] system_call_fastpath+0x16/0x1b
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.208621] Code: 89 de 4c
> > 89 ef e8 60 f4 ff ff 49 8b 44 24 40 48 8b b8 90 04 00 00 e8 41 c9 2b
> > 00 44 89 f6 4c 89 e7 e8 39 fc ff ff 49 8b 44 24 40 <48> 8b b8 90 04
> > 00 00 e8 66 c7 2b 00 5b 41 5c 41 5d 41 5e c9 c3
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.211986] RIP
> > [<ffffffff812b9c24>] blktap_device_end_request+0x49/0x5e
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.212306]  RSP <ffff88006a7f7cd8>
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.212579] ---[ end trace
> > b97070122f44735d ]---
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: Created
> > /dev/xen/blktap-2/blktap1 device
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: Created
> > /dev/xen/blktap-2/tapdev1 device
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: new interface: ring:
> > 251, device: 253, minor: 1
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: I/O queue driver: lio
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: block-aio 
> > open('/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/xvda')
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: 
> > open(/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/xvda)
> > with O_DIRECT
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: Image size:       pre
> > sector_shift  [10737418240]         post sector_shift [20971520]
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: opened image
> > /storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/xvda (1
> > users, state: 0x00000001, type: 0)
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: VBD CHAIN:
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]:
> > /storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/xvda: 0
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.317931] block tdb:
> > sector-size: 512 capacity: 20971520
> > 
> > 
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-devel





 

