
Re: [Xen-devel] Re: 2.6.37-rc1 mainline domU - BUG: unable to handle kernel paging request


  • To: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>, Bruce Edge <bruce.edge@xxxxxxxxx>
  • From: Boris Derzhavets <bderzhavets@xxxxxxxxx>
  • Date: Mon, 15 Nov 2010 00:06:16 -0800 (PST)
  • Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
  • Delivery-date: Mon, 15 Nov 2010 00:07:25 -0800
  • Domainkey-signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; b=EtCkyTn6cgPARmEIl52mdRtS+58eXocZWvRHlF2O2CvfEK2Zr7w+ysPFZNyYy/axbFCOVbItt6v0wreJSYFoAVNhefaL/cXaYcu23Y5D3u1Rqo5OwtjhwmgbfBYeP3WgVU/sVuyAw5sxqi5UlfwDYLOMZAlaqxR1uyfdtCBmrCg=;
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Stack trace on an F14 DomU while working with an NFS mount:

[  218.984818] ------------[ cut here ]------------
[  218.984834] kernel BUG at mm/mmap.c:2399!
[  218.984844] invalid opcode: 0000 [#1] SMP
[  218.984857] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[  218.984872] CPU 1
[  218.984879] Modules linked in: nfs fscache deflate zlib_deflate ctr camellia cast5 rmd160 crypto_null ccm serpent blowfish twofish_generic twofish_x86_64 twofish_common ecb xcbc cbc sha256_generic sha512_generic des_generic cryptd aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 uinput xen_netfront microcode xen_blkfront [last unloaded: scsi_wait_scan]
[  218.985011]
[  218.985011] Pid: 1566, comm: ls Not tainted 2.6.37-0.1.rc1.git8.xendom0.fc14.x86_64 #1 /
[  218.985011] RIP: e030:[<ffffffff8110ada1>]  [<ffffffff8110ada1>] exit_mmap+0x10c/0x119
[  218.985011] RSP: e02b:ffff8800774a9e18  EFLAGS: 00010202
[  218.985011] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0020000000000000
[  218.985011] RDX: 0000000000100004 RSI: ffff8800770ea1b8 RDI: ffffea0001a00230
[  218.985011] RBP: ffff8800774a9e48 R08: ffff88007d045108 R09: 000000000000005a
[  218.985011] R10: ffffffff8100750f R11: ffffea000182b7b0 R12: ffff880077dc6300
[  218.985011] R13: ffff88007fa1b1e0 R14: ffff880077dc6368 R15: 0000000000000001
[  218.985011] FS:  00007f4a38dd17c0(0000) GS:ffff88007fa0d000(0000) knlGS:0000000000000000
[  218.985011] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  218.985011] CR2: 00007f4a380a1940 CR3: 0000000001a03000 CR4: 0000000000002660
[  218.985011] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  218.985011] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  218.985011] Process ls (pid: 1566, threadinfo ffff8800774a8000, task ffff880003ca47c0)
[  218.985011] Stack:
[  218.985011]  000000000000006b ffff88007fa1b1e0 ffff8800774a9e38 ffff880077dc6300
[  218.985011]  ffff880077dc6440 ffff880003ca4db0 ffff8800774a9e68 ffffffff810505fc
[  218.985011]  ffff880003ca47c0 ffff880077dc6300 ffff8800774a9eb8 ffffffff81056747
[  218.985011] Call Trace:
[  218.985011]  [<ffffffff810505fc>] mmput+0x65/0xd8
[  218.985011]  [<ffffffff81056747>] exit_mm+0x13e/0x14b
[  218.985011]  [<ffffffff81056976>] do_exit+0x222/0x7c6
[  218.985011]  [<ffffffff8100750f>] ? xen_restore_fl_direct_end+0x0/0x1
[  218.985011]  [<ffffffff8107ea7c>] ? arch_local_irq_restore+0xb/0xd
[  218.985011]  [<ffffffff814b3949>] ? lockdep_sys_exit_thunk+0x35/0x67
[  218.985011]  [<ffffffff810571b0>] do_group_exit+0x88/0xb6
[  218.985011]  [<ffffffff810571f5>] sys_exit_group+0x17/0x1b
[  218.985011]  [<ffffffff8100acf2>] system_call_fastpath+0x16/0x1b
[  218.985011] Code: 8d 7d 18 e8 c3 8a 00 00 41 c7 45 08 00 00 00 00 48 89 df e8 0d e9 ff ff 48 85 c0 48 89 c3 75 f0 49 83 bc 24 98 01 00 00 00 74 02 <0f> 0b 48 83 c4 18 5b 41 5c 41 5d c9 c3 55 48 89 e5 41 54 53 48
[  218.985011] RIP  [<ffffffff8110ada1>] exit_mmap+0x10c/0x119
[  218.985011]  RSP <ffff8800774a9e18>
[  218.985011] ---[ end trace 99b09fa378e85262 ]---
[  218.985011] Fixing recursive fault but reboot is needed!

Message from syslogd@fedora14 at Nov 15 11:03:20 ...
 kernel:[  218.984818] ------------[ cut here ]------------

Message from syslogd@fedora14 at Nov 15 11:03:20 ...
 kernel:[  218.984844] invalid opcode: 0000 [#1] SMP

Message from syslogd@fedora14 at Nov 15 11:03:20 ...
 kernel:[  218.984857] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map

Message from syslogd@fedora14 at Nov 15 11:03:20 ...
 kernel:[  218.985011] Stack:

Message from syslogd@fedora14 at Nov 15 11:03:20 ...
 kernel:[  218.985011] Call Trace:

Message from syslogd@fedora14 at Nov 15 11:03:20 ...
 kernel:[  218.985011] Code: 8d 7d 18 e8 c3 8a 00 00 41 c7 45 08 00 00 00 00 48 89 df e8 0d e9 ff ff 48 85 c0 48 89 c3 75 f0 49 83 bc 24 98 01 00 00 00 74 02 <0f> 0b 48 83 c4 18 5b 41 5c 41 5d c9 c3 55 48 89 e5 41 54 53 48
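
For reference, mm/mmap.c:2399 in 2.6.37-rc1 appears to be the page-table
accounting check at the very end of exit_mmap(). Quoting the tail of that
function from what I believe is the matching upstream source (the Fedora
tree may differ slightly, so treat this as a sketch):

	/* tail of exit_mmap() in mm/mmap.c */
	/*
	 * Walk the list again, actually closing and freeing it,
	 * with preemption enabled, without holding any MM locks.
	 */
	while (vma)
		vma = remove_vma(vma);

	/*
	 * The BUG the trace above hits: every page table should have been
	 * freed by this point, so mm->nr_ptes must be zero (on x86_64
	 * FIRST_USER_ADDRESS is 0, so the right-hand side evaluates to 0).
	 */
	BUG_ON(mm->nr_ptes > (FIRST_USER_ADDRESS+PMD_SIZE-1)>>PMD_SHIFT);
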
[  259.093423] BUG: unable to handle kernel paging request at ffff880077d352a8
[  259.093441] IP: [<ffffffff81037648>] ptep_set_access_flags+0x2b/0x51
[  259.093456] PGD 1a04067 PUD 59c9067 PMD 5b88067 PTE 8010000077d35065
[  259.093472] Oops: 0003 [#2] SMP
[  259.093481] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[  259.093493] CPU 1
[  259.093498] Modules linked in: nfs fscache deflate zlib_deflate ctr camellia cast5 rmd160 crypto_null ccm serpent blowfish twofish_generic twofish_x86_64 twofish_common ecb xcbc cbc sha256_generic sha512_generic des_generic cryptd aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 uinput xen_netfront microcode xen_blkfront [last unloaded: scsi_wait_scan]
[  259.093652]
[  259.093658] Pid: 1567, comm: abrtd Tainted: G      D     2.6.37-0.1.rc1.git8.xendom0.fc14.x86_64 #1 /
[  259.093669] RIP: e030:[<ffffffff81037648>]  [<ffffffff81037648>] ptep_set_access_flags+0x2b/0x51
[  259.093683] RSP: e02b:ffff8800770e7bf8  EFLAGS: 00010202
[  259.093690] RAX: 80000001bf75f101 RBX: ffff880077521400 RCX: 80000001bf75f167
[  259.093699] RDX: ffff880077d352a8 RSI: 00007fb9b9255ad0 RDI: ffff880077521400
[  259.093708] RBP: ffff8800770e7c28 R08: 0000000000000001 R09: 1580000000000000
[  259.093717] R10: ffffffff8100750f R11: ffff880077dc5800 R12: 00007fb9b9255ad0
[  259.093726] R13: 0000000000000001 R14: ffff880003f2f9f8 R15: ffff880077d352a8
[  259.093737] FS:  00007fb9b9255800(0000) GS:ffff88007fa0d000(0000) knlGS:0000000000000000
[  259.093747] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  259.093755] CR2: ffff880077d352a8 CR3: 00000000043c8000 CR4: 0000000000002660
[  259.093764] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  259.093773] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  259.093783] Process abrtd (pid: 1567, threadinfo ffff8800770e6000, task ffff880003d2c7c0)
[  259.093800] Stack:
[  259.093807]  ffffea00018382b0 0000000000000000 0000000000000034 0000000000000000
[  259.093829]  ffff880077521400 0000000000000000 ffff8800770e7cb8 ffffffff81104a57
[  259.093851]  ffffffff810050a3 ffffffff00000001 ffff880004307e48 ffff8800770e7ca8
[  259.093873] Call Trace:
[  259.093885]  [<ffffffff81104a57>] do_wp_page+0x241/0x53d
[  259.093899]  [<ffffffff810050a3>] ? xen_pte_val+0x6a/0x6c
[  259.093911]  [<ffffffff81004635>] ? __raw_callee_save_xen_pte_val+0x11/0x1e
[  259.093926]  [<ffffffff8100750f>] ? xen_restore_fl_direct_end+0x0/0x1
[  259.093941]  [<ffffffff81106491>] ? handle_mm_fault+0x6ea/0x7af
[  259.093954]  [<ffffffff811064e2>] handle_mm_fault+0x73b/0x7af
[  259.093969]  [<ffffffff81073597>] ? down_read_trylock+0x44/0x4e
[  259.093983]  [<ffffffff814b7aa4>] do_page_fault+0x363/0x385
[  259.093996]  [<ffffffff81006f59>] ? xen_force_evtchn_callback+0xd/0xf
[  259.094011]  [<ffffffff81007522>] ? check_events+0x12/0x20
[  259.094025]  [<ffffffff814b3912>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[  259.094039]  [<ffffffff814b4ad5>] page_fault+0x25/0x30
[  259.094053]  [<ffffffff8125403d>] ? __put_user_4+0x1d/0x30
[  259.094066]  [<ffffffff8104bf66>] ? schedule_tail+0x61/0x65
[  259.094079]  [<ffffffff8100abf3>] ret_from_fork+0x13/0x80
[  259.094089] Code: 55 48 89 e5 41 55 41 54 53 48 83 ec 18 0f 1f 44 00 00 48 39 0a 48 89 fb 49 89 f4 0f 95 c0 45 85 c0 44 0f b6 e8 74 1c 84 c0 74 18 <48> 89 0a 48 8b 3f 0f 1f 80 00 00 00 00 4c 89 e6 48 89 df e8 bb
[  259.094149] RIP  [<ffffffff81037648>] ptep_set_access_flags+0x2b/0x51
[  259.094149]  RSP <ffff8800770e7bf8>
[  259.094149] CR2: ffff880077d352a8
[  259.094149] ---[ end trace 99b09fa378e85263 ]---

Message from syslogd@fedora14 at Nov 15 11:04:00 ...
 kernel:[  259.093472] Oops: 0003 [#2] SMP

Message from syslogd@fedora14 at Nov 15 11:04:00 ...
 kernel:[  259.093481] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map

Message from syslogd@fedora14 at Nov 15 11:04:00 ...
 kernel:[  259.093800] Stack:

Message from syslogd@fedora14 at Nov 15 11:04:00 ...
 kernel:[  259.093873] Call Trace:

Message from syslogd@fedora14 at Nov 15 11:04:00 ...
 kernel:[  259.094089] Code: 55 48 89 e5 41 55 41 54 53 48 83 ec 18 0f 1f 44 00 00 48 39 0a 48 89 fb 49 89 f4 0f 95 c0 45 85 c0 44 0f b6 e8 74 1c 84 c0 74 18 <48> 89 0a 48 8b 3f 0f 1f 80 00 00 00 00 4c 89 e6 48 89 df e8 bb

Message from syslogd@fedora14 at Nov 15 11:04:00 ...
 kernel:[  259.094149] CR2: ffff880077d352a8
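
The second oops faults inside ptep_set_access_flags(): the write address in
CR2 (ffff880077d352a8) matches RDX/R15, i.e. the ptep argument, and the
faulting instruction in the Code: dump (48 89 0a, mov %rcx,(%rdx)) looks like
the direct store to the PTE. For reference, the generic x86 version in
arch/x86/mm/pgtable.c around 2.6.37 reads roughly as follows (quoted from the
upstream source as I remember it, so treat it as a sketch):

	int ptep_set_access_flags(struct vm_area_struct *vma,
				  unsigned long address, pte_t *ptep,
				  pte_t entry, int dirty)
	{
		int changed = !pte_same(*ptep, entry);

		if (changed && dirty) {
			/* direct write to the PTE -- the store that faults
			 * in the trace above */
			*ptep = entry;
			pte_update_defer(vma->vm_mm, address, ptep);
			flush_tlb_page(vma, address);
		}

		return changed;
	}
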


--- On Sun, 11/14/10, Bruce Edge <bruce.edge@xxxxxxxxx> wrote:

From: Bruce Edge <bruce.edge@xxxxxxxxx>
Subject: Re: [Xen-devel] Re: 2.6.37-rc1 mainline domU - BUG: unable to handle kernel paging request
To: "Sander Eikelenboom" <linux@xxxxxxxxxxxxxx>
Cc: "Boris Derzhavets" <bderzhavets@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, "Jeremy Fitzhardinge" <jeremy@xxxxxxxx>, "Konrad Rzeszutek Wilk" <konrad.wilk@xxxxxxxxxx>
Date: Sunday, November 14, 2010, 4:35 PM

On Sun, Nov 14, 2010 at 8:56 AM, Sander Eikelenboom
<linux@xxxxxxxxxxxxxx> wrote:
> Hmmm, have you tried to do a lot of I/O with something other than NFS?
> That would perhaps pinpoint it to NFS doing something not completely compatible with Xen.

I have my own suspicions regarding the more recent NFS clients. Post-10.04
Ubuntu variants do not tolerate large NFS transfers even without Xen. Anything
more than a few hundred MB and you start getting "task blocked for more than
120 seconds..." messages, along with stack traces showing part of the NFS
call stack.
Perhaps a parallel effort could be to test the 2.6.37-rc1 kernel with
something other than NFS for remote filesystems. I'll see if I get the
same problems with glusterfs.

-Bruce

>
> I'm not using NFS (I still use file:-based guests, and I use glusterfs (a FUSE-based userspace cluster FS) to share disk space to domUs via Ethernet).
> I tried NFS in the past, but had some trouble setting it up, and even more problems with disconnects.
>
> I haven't seen any "unable to handle kernel paging request" problems with my mix of guest kernels, which includes some 2.6.37-rc1 kernels.
>
> --
>
> Sander
>
>
>
>
>
> Sunday, November 14, 2010, 5:37:59 PM, you wrote:
>
>> I've tested an F14 DomU (kernel vmlinuz-2.6.37-0.1.rc1.git8.xendom0.fc14.x86_64) as an
>> NFS client and a Xen 4.0.1 F14 Dom0 (kernel vmlinuz-2.6.32.25-172.xendom0.fc14.x86_64)
>> as the NFS server. I copied 700 MB ISO images from an NFS folder on Dom0 to the DomU
>> and scp'ed them back to Dom0. For about 30-40 minutes the DomU ran pretty stably; a
>> kernel crash ("unable to handle kernel paging request") was reported once by the F14
>> DomU, but it didn't actually bring the DomU down. The same exercise with F14 replaced
>> by Ubuntu 10.04 Server results in a DomU crash within several minutes. Both Dom0
>> instances dual-boot on the same development box (Q9500, ASUS P5Q3, 8 GB).
>
>> Boris.
>
>> --- On Fri, 11/12/10, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
>
>> From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
>> Subject: Re: [Xen-devel] Re: 2.6.37-rc1 mainline domU - BUG: unable to handle kernel paging request
>> To: "Sander Eikelenboom" <linux@xxxxxxxxxxxxxx>
>> Cc: "Boris Derzhavets" <bderzhavets@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, "Bruce Edge" <bruce.edge@xxxxxxxxx>, "Jeremy Fitzhardinge" <jeremy@xxxxxxxx>
>> Date: Friday, November 12, 2010, 12:01 PM
>
>> On Fri, Nov 12, 2010 at 05:27:43PM +0100, Sander Eikelenboom wrote:
>>> Hi Bruce,
>>>
>>> Perhaps handpick some kernels before and after the pulls of the Xen patches (pv-on-hvm etc.) to begin with?
>>> When you let git choose, especially with rc1 kernels, you will end up with kernels in the middle of a patch series, resulting in panics.
>
>> Well, just the bare-bone boot of PV guests with nothing fancy ought to work.
>
>> But that is the theory and ..
>>> > The git bisecting is slow going. I've never tried that before and I'm a git
>>> > rookie.
>>> > I picked 2.6.36 - 2.6.37-rc1 as the bisect range and my first 2 bisects all
>>> > panic at boot so I'm obviously doing something wrong.
>>> > I'll RTFM a bit more and keep at it.
>
>> .. as Bruce's experience shows, this is not the case. Hmm..
>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>> http://lists.xensource.com/xen-devel
>
>
>
>>
>
>
>
> --
> Best regards,
>  Sander                            mailto:linux@xxxxxxxxxxxxxx
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
