
Re: [Xen-devel] BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM


  • To: Gerd Jakobovitsch <gerd@xxxxxxxxxxx>
  • From: Teck Choon Giam <giamteckchoon@xxxxxxxxx>
  • Date: Wed, 13 Apr 2011 00:41:24 +0800
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
  • Delivery-date: Tue, 12 Apr 2011 09:43:12 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On Tue, Apr 12, 2011 at 7:19 PM, Gerd Jakobovitsch <gerd@xxxxxxxxxxx> wrote:
> Yes. It is the same scenario as in the previous bug.
>
>
> On 04/12/2011 07:59 AM, Konrad Rzeszutek Wilk wrote:
>>
>> On Fri, Apr 08, 2011 at 06:38:38PM -0300, Gerd Jakobovitsch wrote:
>>>
>>> One more follow-up:
>>>
>>> Another kernel bug report, with no kernel debug activated:
>>
>> This is just with the guest starting, right? Not after running for a long time?

Since you are using NFS-based storage, and the latest
xen/stable-2.6.32 tree has such a fix backported
(http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=ae333e97552c81ab10395ad1ffc6d6daaadb144a),
are you able to run your test with that tree?
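
For what it's worth, the Code: bytes in the blktap_device_end_request
oops quoted below decode to a load of a pointer from offset 0x40 of the
structure in %r12 followed by a dereference at offset 0x2f0 of that
pointer: with RAX = 0 the faulting address matches CR2 = 0x2f0, and in
the older traces the equivalent load trips a general protection fault
with RAX = 0x6b6b6b6b6b6b6b6b, the slab poison pattern.  That reads
like the completion path touching a queue/device structure that has
already been freed or torn down, rather than an NFS problem by itself.
A rough userspace sketch of that failure pattern and a defensive guard
follows (hypothetical names only, not the real blktap2 source):

/*
 * Minimal sketch of the suspected pattern: the end-request path reloads
 * an inner pointer and dereferences a field ~0x2f0 bytes into it.  If
 * the device was torn down first, that pointer is NULL or freed memory.
 */
#include <stdio.h>
#include <stdlib.h>

struct fake_queue {
	char pad[0x2f0];             /* bytes before the field of interest */
	void (*complete)(int error); /* stand-in for the field at offset 0x2f0 */
};

struct fake_tap_device {
	struct fake_queue *queue;    /* torn down asynchronously elsewhere */
};

static void complete_request(int error)
{
	printf("request completed, error=%d\n", error);
}

/* hypothetical stand-in for blktap_device_end_request() */
static void device_end_request(struct fake_tap_device *dev, int error)
{
	/*
	 * Defensive guard.  In the real driver a NULL check alone would
	 * only paper over the race; teardown has to synchronise with (or
	 * hold a reference across) in-flight completions.
	 */
	if (!dev->queue) {
		fprintf(stderr, "device already gone, dropping completion\n");
		return;
	}
	dev->queue->complete(error);
}

int main(void)
{
	struct fake_tap_device dev;

	dev.queue = calloc(1, sizeof(*dev.queue));
	if (!dev.queue)
		return 1;
	dev.queue->complete = complete_request;

	device_end_request(&dev, 0);   /* normal completion */

	free(dev.queue);               /* device torn down... */
	dev.queue = NULL;              /* ...while a completion is still pending */
	device_end_request(&dev, -5);  /* guarded: no dereference of NULL + 0x2f0 */

	return 0;
}

Again, this is only an illustration of the suspected race, not a
proposed patch; the real fix would have to live in the blktap2
device/ring teardown path.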

>>
>>> r2b16ch2x28p2 kernel: [ 3243.777796] CR2: 00000000000002f0
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761622] BUG: unable to
>>> handle kernel NULL pointer dereference at 00000000000002f0
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761781] IP:
>>> [<ffffffff8121e7f9>] blktap_device_end_request+0x4e/0x6c
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761892] PGD 710a3067
>>> PUD 724c6067 PMD 0
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762076] Oops: 0000 [#1] SMP
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762212] last sysfs
>>> file: /sys/devices/vbd-6-51712/statistics/wr_sect
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762271] CPU 5
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762363] Modules linked
>>> in: bnx2 xt_mac bridge stp nfs fscache nfs_acl auth_rpcgss
>>> arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport
>>> lockd sunrpc bonding dm_multipath megaraid_sas [last unloaded: bnx2]
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763350] Pid: 7781,
>>> comm: tapdisk2 Not tainted 2.6.32.36 #5 PowerEdge M610
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763410] RIP:
>>> e030:[<ffffffff8121e7f9>]  [<ffffffff8121e7f9>]
>>> blktap_device_end_request+0x4e/0x6c
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763519] RSP:
>>> e02b:ffff88006ed49cf8  EFLAGS: 00010046
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763574] RAX:
>>> 0000000000000000 RBX: ffff88005e6fc3e0 RCX: ffffffff811a3de6
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763635] RDX:
>>> ffff88005e6fc3e0 RSI: ffffffff8149bed6 RDI: ffff880070f42178
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763694] RBP:
>>> ffff880070f42010 R08: ffffffff81661840 R09: 00000001002c9435
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763971] R10:
>>> 0000000000000000 R11: ffff88005e6378f0 R12: ffff88005e6378f0
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.764244] R13:
>>> ffff880070f42000 R14: 0000000000000000 R15: 0000000000000001
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.764522] FS:
>>> 00007fcfd34b0730(0000) GS:ffff880015fe7000(0000)
>>> knlGS:0000000000000000
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765011] CS:  e033 DS:
>>> 0000 ES: 0000 CR0: 000000008005003b
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765280] CR2:
>>> 00000000000002f0 CR3: 000000006ed46000 CR4: 0000000000002660
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765554] DR0:
>>> 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765829] DR3:
>>> 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766105] Process
>>> tapdisk2 (pid: 7781, threadinfo ffff88006ed48000, task
>>> ffff880079486270)
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766602] Stack:
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766868]
>>> ffff88007a81dee0 ffff880079486270 0000000000000000 0000000000000001
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.767266]<0>
>>> 0000000000000001 ffffffff8121e0b9 0000000000000003 0000000000000000
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.767967]<0>
>>> ffff880070f42000 0000bda50000bda7 ffff88005e6fc3e0 fffffffd00000000
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.768921] Call Trace:
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769193]
>>> [<ffffffff8121e0b9>] ? blktap_ring_ioctl+0x159/0x290
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769474]
>>> [<ffffffff8149c50a>] ? error_exit+0x2a/0x60
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769750]
>>> [<ffffffff810133dd>] ? retint_restore_args+0x5/0x6
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770030]
>>> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770308]
>>> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770592]
>>> [<ffffffff811846b6>] ? selinux_file_ioctl+0x0/0x3d
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770874]
>>> [<ffffffff8100eca5>] ? xen_force_evtchn_callback+0x9/0xa
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771157]
>>> [<ffffffff8100f342>] ? check_events+0x12/0x20
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771435]
>>> [<ffffffff811846b6>] ? selinux_file_ioctl+0x0/0x3d
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771715]
>>> [<ffffffff8100f32f>] ? xen_restore_fl_direct_end+0x0/0x1
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771993]
>>> [<ffffffff8100fcea>] ? xen_spin_lock_slow+0xb7/0xf8
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772275]
>>> [<ffffffff810c825a>] ? vfs_ioctl+0x55/0x6b
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772554]
>>> [<ffffffff810c8775>] ? do_vfs_ioctl+0x492/0x4e5
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772834]
>>> [<ffffffff8100f32f>] ? xen_restore_fl_direct_end+0x0/0x1
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773116]
>>> [<ffffffff810c8819>] ? sys_ioctl+0x51/0x70
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773392]
>>> [<ffffffff81012a02>] ? system_call_fastpath+0x16/0x1b
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773671] Code: e8 39 f6
>>> ff ff 49 8b 44 24 40 48 8b b8 f0 02 00 00 e8 e6 d6 27 00 4c 89 e7 41
>>> 8b 54 24 60 44 89 f6 e8 b9 59 f8 ff 49 8b 44 24 40<48>  8b b8 f0 02
>>> 00 00 e8 02 14 df ff 66 90 ff 14 25 78 8d 65 81
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777198] RIP
>>> [<ffffffff8121e7f9>] blktap_device_end_request+0x4e/0x6c
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777524]
>>>  RSP<ffff88006ed49cf8>
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777796] CR2:
>>> 00000000000002f0
>>> Apr  8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.778067] ---[ end trace
>>> a71b80c14de09da1 ]---
>>>
>>> On 04/08/2011 12:44 PM, Teck Choon Giam wrote:
>>>>
>>>> On Fri, Apr 8, 2011 at 11:06 PM, Gerd Jakobovitsch
>>>> <gerd@xxxxxxxxxxx> wrote:
>>>>
>>>>    On 03/30/2011 11:44 PM, Teck Choon Giam wrote:
>>>>>
>>>>>    On Wed, Mar 30, 2011 at 10:14 PM, Gerd
>>>>> Jakobovitsch <gerd@xxxxxxxxxxx> wrote:
>>>>>>
>>>>>>    Hello all,
>>>>>>
>>>>>>    I used to run xen 4.0 with kernel 2.6.32.24 over CentOS 5.5 with
>>>>>> relative success, but the bug at mmu.c appeared once in a while.
>>>>>> Therefore, I'm looking for a more stable option.
>>>>>>    I compiled and ran the newly released xen 4.1, with PVOPS kernel
>>>>>> 2.6.32.35 over CentOS 5.5. When trying to start a VM, the following bugs
>>>>>> appeared in dmesg. After that, the xl and xm commands no longer respond:
>>>>>>
>>>>>>    [  145.749573]   alloc irq_desc for 2209 on node -1
>>>>>>    [  145.749581]   alloc kstat_irqs on node -1
>>>>>>    [  145.883515] block tda: sector-size: 512 capacity: 262144
>>>>>>    [  145.889952] general protection fault: 0000 [#1] SMP
>>>>>>    [  145.890109] last sysfs file: /sys/block/tda/removable
>>>>>>    [  145.890164] CPU 7
>>>>>>    [  145.890252] Modules linked in: bridge stp nfs fscache nfs_acl
>>>>>> auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev 
>>>>>> xt_multiport
>>>>>> lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
>>>>>>    [  145.891125] Pid: 5179, comm: tapdisk2 Not tainted 2.6.32.35 #1
>>>>>> PowerEdge M610
>>>>>>    [  145.891184] RIP: e030:[<ffffffff81281e79>]  [<ffffffff81281e79>]
>>>>>> blktap_device_end_request+0x4e/0x63
>>>>>>    [  145.891296] RSP: e02b:ffff880064061cd8  EFLAGS: 00010046
>>>>>>    [  145.891351] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007d264690 RCX:
>>>>>> 0000000000000028
>>>>>>    [  145.891410] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
>>>>>> 0000000000000000
>>>>>>    [  145.891469] RBP: ffff880064061cf8 R08: 0000000064061c98 R09:
>>>>>> ffff88007da42948
>>>>>>    [  145.891528] R10: ffffea0000000008 R11: 0000000001f60080 R12:
>>>>>> ffff88007da427f8
>>>>>>    [  145.891587] R13: ffff88007c75f398 R14: 0000000000000000 R15:
>>>>>> ffff88007c75f3a8
>>>>>>    [  145.891651] FS:  00007ff33d9a4730(0000)
>>>>>> GS:ffff8800189e5000(0000) knlGS:0000000000000000
>>>>>>    [  145.891714] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>>    [  145.891771] CR2: 0000000002594cc8 CR3: 000000007be61000 CR4:
>>>>>> 0000000000002660
>>>>>>    [  145.891830] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>>>> 0000000000000000
>>>>>>    [  145.891890] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>>>>>> 0000000000000400
>>>>>>    [  145.892171] Process tapdisk2 (pid: 5179, threadinfo
>>>>>> ffff880064060000, task ffff88007c272d60)
>>>>>>    [  145.892669] Stack:
>>>>>>    [  145.892934]  ffff88007c272d60 0000000000000000 0000000000000000
>>>>>> 0000000000000000
>>>>>>    [  145.893334]<0>   ffff880064061e88 ffffffff812815ae
>>>>>> ffff880064061e58 ffffffff811d234f
>>>>>>    [  145.894035]<0>   ffff88007e9bbfc0 ffff88007c75f398
>>>>>> 00000001ffffffff 0000000000000000
>>>>>>    [  145.895015] Call Trace:
>>>>>>    [  145.895286]  [<ffffffff812815ae>] blktap_ring_ioctl+0x183/0x2d8
>>>>>>    [  145.895566]  [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
>>>>>>    [  145.895844]  [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
>>>>>>    [  145.896124]  [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
>>>>>>    [  145.896403]  [<ffffffff81219d28>] ? _raw_spin_unlock+0xab/0xb2
>>>>>>    [  145.896682]  [<ffffffff81529311>] ? _spin_unlock+0x9/0xb
>>>>>>    [  145.896958]  [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
>>>>>>    [  145.897234]  [<ffffffff811d2415>] ? file_has_perm+0xb4/0xc6
>>>>>>    [  145.897513]  [<ffffffff810fe868>] vfs_ioctl+0x5e/0x77
>>>>>>    [  145.897786]  [<ffffffff810fed7d>] do_vfs_ioctl+0x484/0x4d5
>>>>>>    [  145.898060]  [<ffffffff810fee25>] sys_ioctl+0x57/0x7a
>>>>>>    [  145.898338]  [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>>>>>    [  145.898614] Code: e8 5f f4 ff ff 49 8b 44 24 40 48 8b b8 80 03
>>>>>> 00 00 e8 64 75 2a 00 41 8b 54 24 60 44 89 f6 4c 89 e7 e8 b5 89 f7 ff 49 
>>>>>> 8b
>>>>>> 44 24 40<48>   8b b8 80 03 00 00 e8 23 74 2a 00 5b 41 5c 41 5d 41 5e c9 
>>>>>> c3
>>>>>>    [  145.902008] RIP  [<ffffffff81281e79>]
>>>>>> blktap_device_end_request+0x4e/0x63
>>>>>>    [  145.902321]  RSP<ffff880064061cd8>
>>>>>>    [  145.902585] ---[ end trace 2800cfa5aa85ca0a ]---
>>>>>>    [  262.100689] BUG: spinlock lockup on CPU#4, vol_id/5181,
>>>>>> ffff88007c75f520
>>>>>>    [  262.100965] Pid: 5181, comm: vol_id Tainted: G      D
>>>>>>  2.6.32.35 #1
>>>>>>    [  262.101232] Call Trace:
>>>>>>    [  262.101497]  [<ffffffff81219eae>] _raw_spin_lock+0x101/0x12f
>>>>>>    [  262.101762]  [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
>>>>>>    [  262.102028]  [<ffffffff811fcc14>] __make_request+0x5e/0x402
>>>>>>    [  262.102294]  [<ffffffff8101019f>] ?
>>>>>> xen_restore_fl_direct_end+0x0/0x1
>>>>>>    [  262.102563]  [<ffffffff811fa171>]
>>>>>> generic_make_request+0x258/0x2f4
>>>>>>    [  262.102832]  [<ffffffff811156d8>] ? bio_init+0x18/0x32
>>>>>>    [  262.103099]  [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
>>>>>>    [  262.103366]  [<ffffffff81111574>] submit_bh+0xf7/0x11a
>>>>>>    [  262.103631]  [<ffffffff8111448f>]
>>>>>> block_read_full_page+0x246/0x264
>>>>>>    [  262.103898]  [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
>>>>>>    [  262.104165]  [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
>>>>>>    [  262.104433]  [<ffffffff810ba73d>] ?
>>>>>> add_to_page_cache_locked+0xa0/0xca
>>>>>>    [  262.104702]  [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
>>>>>>    [  262.104972]  [<ffffffff810c1d36>]
>>>>>> __do_page_cache_readahead+0x144/0x177
>>>>>>    [  262.105240]  [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
>>>>>>    [  262.105507]  [<ffffffff810c20d7>]
>>>>>> page_cache_sync_readahead+0x38/0x3a
>>>>>>    [  262.105778]  [<ffffffff810bb833>]
>>>>>> generic_file_aio_read+0x24c/0x5c1
>>>>>>    [  262.106045]  [<ffffffff810f1808>] do_sync_read+0xe2/0x126
>>>>>>    [  262.106315]  [<ffffffff81068a02>] ?
>>>>>> autoremove_wake_function+0x0/0x38
>>>>>>    [  262.106584]  [<ffffffff811d701c>] ?
>>>>>> selinux_file_permission+0x5c/0x10e
>>>>>>    [  262.106854]  [<ffffffff811ce9c4>] ?
>>>>>> security_file_permission+0x11/0x13
>>>>>>    [  262.107120]  [<ffffffff810f1f7b>] vfs_read+0xab/0x167
>>>>>>    [  262.107385]  [<ffffffff810f2374>] sys_read+0x47/0x70
>>>>>>    [  262.107652]  [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>>>>>    [  262.107918] sending NMI to all CPUs:
>>>>>>    [  262.108189] BUG: unable to handle kernel paging request at
>>>>>> ffffffffff5fb310
>>>>>>    [  262.108526] IP: [<ffffffff8102c7d1>]
>>>>>> flat_send_IPI_mask+0x6a/0xc0
>>>>>>    [  262.108832] PGD 1003067 PUD 1004067 PMD 18b7067 PTE 0
>>>>>>    [  262.109235] Oops: 0002 [#2] SMP
>>>>>>    [  262.109565] last sysfs file: /sys/class/blktap2/blktap1/dev
>>>>>>    [  262.109830] CPU 4
>>>>>>    [  262.110121] Modules linked in: bridge stp nfs fscache nfs_acl
>>>>>> auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev 
>>>>>> xt_multiport
>>>>>> lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
>>>>>>    [  262.111520] Pid: 5181, comm: vol_id Tainted: G      D
>>>>>>  2.6.32.35 #1 PowerEdge M610
>>>>>>    [  262.112008] RIP: e030:[<ffffffff8102c7d1>]  [<ffffffff8102c7d1>]
>>>>>> flat_send_IPI_mask+0x6a/0xc0
>>>>>>    [  262.112535] RSP: e02b:ffff88006778f968  EFLAGS: 00010086
>>>>>>    [  262.112800] RAX: 00000000ff000000 RBX: ffffffff81790060 RCX:
>>>>>> 00000000000160a0
>>>>>>    [  262.113068] RDX: ffff88001898e000 RSI: 0000000000000002 RDI:
>>>>>> ffffffff81816020
>>>>>>    [  262.113337] RBP: ffff88006778f988 R08: 0000000000000000 R09:
>>>>>> 0000000000000004
>>>>>>    [  262.113605] R10: 0000000000000002 R11: 0000000000000004 R12:
>>>>>> 0000000000000002
>>>>>>    [  262.113877] R13: 0000000000000800 R14: 00000000000000ff R15:
>>>>>> 0000000000000000
>>>>>>    [  262.114149] FS:  00007fa78bcc5710(0063)
>>>>>> GS:ffff88001898e000(0000) knlGS:0000000000000000
>>>>>>    [  262.114636] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>>    [  262.114902] CR2: ffffffffff5fb310 CR3: 00000000641b4000 CR4:
>>>>>> 0000000000002660
>>>>>>    [  262.115171] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>>>> 0000000000000000
>>>>>>    [  262.115438] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>>>>>> 0000000000000400
>>>>>>    [  262.115707] Process vol_id (pid: 5181, threadinfo
>>>>>> ffff88006778e000, task ffff88007db86250)
>>>>>>    [  262.116194] Stack:
>>>>>>    [  262.116451]  0000000000000000 0000000076e9ecd0 0000000000000000
>>>>>> 0000000076e9ecd0
>>>>>>    [  262.116825]<0>   ffff88006778f998 ffffffff8102c841
>>>>>> ffff88006778f9b8 ffffffff81029f0d
>>>>>>    [  262.117485]<0>   ffff88007c75f520 ffff88007c75f520
>>>>>> ffff88006778f9f8 ffffffff81219eb3
>>>>>>    [  262.118396] Call Trace:
>>>>>>    [  262.118657]  [<ffffffff8102c841>] flat_send_IPI_all+0x1a/0x56
>>>>>>    [  262.118925]  [<ffffffff81029f0d>]
>>>>>> arch_trigger_all_cpu_backtrace+0x45/0x66
>>>>>>    [  262.119195]  [<ffffffff81219eb3>] _raw_spin_lock+0x106/0x12f
>>>>>>    [  262.119463]  [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
>>>>>>    [  262.119730]  [<ffffffff811fcc14>] __make_request+0x5e/0x402
>>>>>>    [  262.119996]  [<ffffffff8101019f>] ?
>>>>>> xen_restore_fl_direct_end+0x0/0x1
>>>>>>    [  262.120264]  [<ffffffff811fa171>]
>>>>>> generic_make_request+0x258/0x2f4
>>>>>>    [  262.120532]  [<ffffffff811156d8>] ? bio_init+0x18/0x32
>>>>>>    [  262.120799]  [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
>>>>>>    [  262.121066]  [<ffffffff81111574>] submit_bh+0xf7/0x11a
>>>>>>    [  262.121333]  [<ffffffff8111448f>]
>>>>>> block_read_full_page+0x246/0x264
>>>>>>    [  262.121602]  [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
>>>>>>    [  262.121870]  [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
>>>>>>    [  262.122137]  [<ffffffff810ba73d>] ?
>>>>>> add_to_page_cache_locked+0xa0/0xca
>>>>>>    [  262.127766]  [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
>>>>>>    [  262.128025]  [<ffffffff810c1d36>]
>>>>>> __do_page_cache_readahead+0x144/0x177
>>>>>>    [  262.128288]  [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
>>>>>>    [  262.128548]  [<ffffffff810c20d7>]
>>>>>> page_cache_sync_readahead+0x38/0x3a
>>>>>>    [  262.128810]  [<ffffffff810bb833>]
>>>>>> generic_file_aio_read+0x24c/0x5c1
>>>>>>    [  262.129070]  [<ffffffff810f1808>] do_sync_read+0xe2/0x126
>>>>>>    [  262.129329]  [<ffffffff81068a02>] ?
>>>>>> autoremove_wake_function+0x0/0x38
>>>>>>    [  262.129590]  [<ffffffff811d701c>] ?
>>>>>> selinux_file_permission+0x5c/0x10e
>>>>>>    [  262.129851]  [<ffffffff811ce9c4>] ?
>>>>>> security_file_permission+0x11/0x13
>>>>>>    [  262.130110]  [<ffffffff810f1f7b>] vfs_read+0xab/0x167
>>>>>>    [  262.130368]  [<ffffffff810f2374>] sys_read+0x47/0x70
>>>>>>    [  262.130624]  [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>>>>>    [  262.130883] Code: 8b 05 b4 95 7e 00 83 fe 02 44 8b 68 34 75 0a
>>>>>> ff 90 58 01 00 00 eb 0e f3 90 8b 04 25 00 b3 5f ff f6 c4 10 75 f2 44 89 
>>>>>> f0
>>>>>> c1 e0 18<89>   04 25 10 b3 5f ff 41 83 fc 02 74 08 44 89 e0 44 09 e8 eb 
>>>>>> 06
>>>>>>    [  262.133866] RIP  [<ffffffff8102c7d1>]
>>>>>> flat_send_IPI_mask+0x6a/0xc0
>>>>>>    [  262.134164]  RSP<ffff88006778f968>
>>>>>>    [  262.134419] CR2: ffffffffff5fb310
>>>>>>    [  262.134673] ---[ end trace 2800cfa5aa85ca0b ]---
>>>>>>
>>>>>    Can you try to recompile your PVOPS kernel with
>>>>> CONFIG_DEBUG_PAGEALLOC=y?
>>>>>
>>>>>    You can read more about this BUG at
>>>>>
>>>>>  http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01756.html
>>>>>
>>>>>    I initially hit this BUG sometime in Dec 2010:
>>>>>
>>>>>  http://lists.xensource.com/archives/html/xen-devel/2010-12/msg01501.html
>>>>>
>>>>>    Thanks.
>>>>>
>>>>>    Kindest regards,
>>>>>    Giam Teck Choon
>>>>
>>>>    Sorry for the delayed answer. The problem I'm facing now is not
>>>>    related to the mmu bug - that one I am still seeing on systems
>>>>    with xen 4.0.2 / kernel 2.6.32.24. Newer kernels have new bugs
>>>>    that are much more troublesome, since I cannot run a single VM
>>>>    instance.
>>>>
>>>>    With DEBUG_PAGEALLOC added, the main difference is that the system
>>>>    reboots shortly after trying to start up a VM:
>>>>
>>>>
>>>> Ok.  Sorry, I didn't read your log message carefully earlier.
>>>>
>>>>
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: I/O queue driver: lio
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'attach'
>>>>    message (uuid = 0)
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'attach
>>>>    response' message (uuid = 0)
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'open'
>>>>    message (uuid = 0)
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: block-aio
>>>>    open('/storage5/linux-centos-5-64b-base-7253/hda')
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
>>>>    open(/storage5/linux-centos-5-64b-base-7253/hda) with O_DIRECT
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: Image size:
>>>> pre sector_shift  [134217728]     post sector_shift [262144]
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: opened image
>>>>    /storage5/linux-centos-5-64b-base-rip/hda (1 users, state:
>>>>    0x00000001, type: 0)
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: VBD CHAIN:
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
>>>>    /storage5/linux-centos-5-64b-base/hda: 0
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'open
>>>>    response' message (uuid = 0)
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.644887] block tda:
>>>>    sector-size: 512 capacity: 262144
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657328] general
>>>>    protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657379] last sysfs
>>>>    file: /sys/block/tda/removable
>>>>
>>>>
>>>> Just curious... what type of storage are you using for your VMs?
>>>>
>>>>
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657400] CPU 0
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657421] Modules
>>>>    linked in: nfs fscache nfs_acl auth_rpcgss bridge stp ocfs2
>>>>    ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager
>>>>    ocfs2_stackglue configfs arptable_filter arp_tables xt_esp ipt_ah
>>>>    xt_physdev xt_multiport dm_round_robin lockd sunrpc crc32c bonding
>>>>    iscsi_tcp libiscsi_tcp bnx2i libiscsi scsi_transport_iscsi cnic
>>>>    uio dm_multipath bnx2 megaraid_sas
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657736] Pid: 15566,
>>>>    comm: tapdisk2 Not tainted 2.6.32.36 #4 PowerEdge M610
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657763] RIP:
>>>>    e030:[<ffffffff8129fb89>]  [<ffffffff8129fb89>]
>>>>    blktap_device_end_request+0x4e/0x63
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657808] RSP:
>>>>    e02b:ffff88006da5dcd8  EFLAGS: 00010046
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657833] RAX:
>>>>    6b6b6b6b6b6b6b6b RBX: ffff88006566c000 RCX: 0000000000000000
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657860] RDX:
>>>>    0000000000000000 RSI: 0000000000000000 RDI: ffff88006d8c7980
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657887] RBP:
>>>>    ffff88006da5dcf8 R08: ffffffff817e66a0 R09: ffff88007d775790
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658136] R10:
>>>>    ffffffff810ccfe4 R11: ffff8800280d8f60 R12: ffff88006720e7f8
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658384] R13:
>>>>    ffff88006d8c77e0 R14: 0000000000000000 R15: ffff88006d8c77f0
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658635] FS:
>>>> 00007f86d75a3730(0000) GS:ffff8800280c7000(0000)
>>>>    knlGS:0000000000000000
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659107] CS:  e033 DS:
>>>>    0000 ES: 0000 CR0: 000000008005003b
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659351] CR2:
>>>>    0000000045614ed8 CR3: 000000006d911000 CR4: 0000000000002660
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659600] DR0:
>>>>    0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659847] DR3:
>>>>    0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660100] Process
>>>>    tapdisk2 (pid: 15566, threadinfo ffff88006da5c000, task
>>>>    ffff88007d7750f0)
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660574] Stack:
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660807]
>>>> ffff88007d7750f0 0000000000000000 0000000000000000
>>>> 0000000000000000
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661074]<0>
>>>>    ffff88006da5de88 ffffffff8129f2c0 ffffffff8100fedd 000000016da5ddc8
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661574]<0>
>>>>    00000000ffffffff ffff88006d8c77e0 00000001816fab27 0000000000000000
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662287] Call Trace:
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662522]
>>>> [<ffffffff8129f2c0>] blktap_ring_ioctl+0x183/0x2d8
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662767]
>>>> [<ffffffff8100fedd>] ? xen_force_evtchn_callback+0xd/0xf
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663012]
>>>> [<ffffffff811ebe9b>] ? inode_has_perm+0xa1/0xb3
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663260]
>>>> [<ffffffff8101064f>] ? xen_restore_fl_direct_end+0x0/0x1
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663509]
>>>> [<ffffffff81081d84>] ? lock_release+0x1b8/0x1c3
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663756]
>>>> [<ffffffff81233d2c>] ? _raw_spin_unlock+0xab/0xb2
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663999]
>>>> [<ffffffff8155618d>] ? _spin_unlock+0x26/0x2a
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664248]
>>>> [<ffffffff81134aa1>] ? aio_read_evt+0x87/0x13a
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664493]
>>>> [<ffffffff81134b36>] ? aio_read_evt+0x11c/0x13a
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664740]
>>>> [<ffffffff81233ecc>] ? _raw_spin_lock+0x77/0x12f
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664986]
>>>> [<ffffffff811ec054>] ? file_has_perm+0xb4/0xc6
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665237]
>>>> [<ffffffff81112444>] vfs_ioctl+0x5e/0x77
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665480]
>>>> [<ffffffff81112959>] do_vfs_ioctl+0x484/0x4d5
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665726]
>>>> [<ffffffff81112a01>] sys_ioctl+0x57/0x7a
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665970]
>>>> [<ffffffff81013d42>] system_call_fastpath+0x16/0x1b
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.666214] Code: e8 61
>>>>    f4 ff ff 49 8b 44 24 40 48 8b b8 70 04 00 00 e8 79 67 2b 00 41 8b
>>>>    54 24 60 44 89 f6 4c 89 e7 e8 76 3b f7 ff 49 8b 44 24 40<48>  8b
>>>>    b8 70 04 00 00 e8 1d 65 2b 00 5b 41 5c 41 5d 41 5e c9 c3
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667320] RIP
>>>> [<ffffffff8129fb89>] blktap_device_end_request+0x4e/0x63
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667577]  RSP
>>>>    <ffff88006da5dcd8>
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.668069] ---[ end
>>>>    trace da218b929afc63f7 ]---
>>>>    Apr  8 12:00:43 r2b16ch2x28p2 tapdisk2[15584]: I/O queue driver: lio
>>>>    Apr  8 12:00:48 r2b16ch2x28p2 tap-ctl:
>>>>    tap-err:tap_ctl_read_message: failure reading message
>>>>    Apr  8 12:00:48 r2b16ch2x28p2 tap-ctl:
>>>>    tap-err:tap_ctl_send_and_receive: failed to receive 'unknown' message
>>>>
>>>>
>>>> Looks like it is a blktap/blktap2 driver issue to me... so you are
>>>> right... this is a different BUG from the one I encountered, as the
>>>> logs are different.  Sorry, I didn't read your log carefully before
>>>> replying previously.  Have you tried using plain LVM for your VM to
>>>> reproduce the BUG, since I guess you are using a different kind of
>>>> storage?
>>>>
>>>> Thanks.
>>>>
>>>> Kindest regards,
>>>> Giam Teck Choon
>>>
>>> --
>>>
>>> Gerd Jakobovitsch
>>> Product Engineer
>>> ---------------------------------------------------------
>>> ALOG Data Centers do Brasil
>>> Excellence in Hosting Projects
>>> Rua Dr. Miguel Couto, 58 -- 01008-010 -- São Paulo - SP
>>> Phone: (11) 3524-4970 / (11) 7152-0815
>>> http://www.alog.com.br
>>>
>>>
>>> *"Como estão nossos serviços? Clique aqui
>>> <https://www.surveymonkey.com/s/faleconosco>  e nos conte. Queremos
>>> escutar a sua opinião!"*
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>> http://lists.xensource.com/xen-devel
>
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

