Re: [Xen-devel] Re: 2.6.37-rc1 mainline domU - BUG: unable to handle kernel paging request
- To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, Bruce Edge <bruce.edge@xxxxxxxxx>, Jeremy Fitzhardinge <jeremy@xxxxxxxx>
- From: Boris Derzhavets <bderzhavets@xxxxxxxxx>
- Date: Fri, 19 Nov 2010 06:32:26 -0800 (PST)
- Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
- Delivery-date: Fri, 19 Nov 2010 06:33:32 -0800
- List-id: Xen developer discussion <xen-devel.lists.xensource.com>
I've also noticed that if I change a file, say under /mnt/nfs/fedora:
1. `ls` runs fine
2. `ls -l` generates a page fault
But this doesn't crash the DomU, regardless of the stack trace printed in that particular terminal session.
I can close the crashed terminal and open a second one. There,
# cd /mnt/nfs/fedora
# ls -l
will succeed until I make some new changes to file descriptors, e.g. edit some file. Then the second terminal session crashes, and a third one has to be opened to be able to work with file descriptors (`ls -l`). When the number of page faults reaches some critical value (more than 5, but in general unpredictable), the DomU crashes, so I can no longer open a new terminal session. This is a stable and obvious regression in 2.6.37-rc2 vs 2.6.36 as a PV DomU kernel.
Boris
--- On Thu, 11/18/10, Boris Derzhavets <bderzhavets@xxxxxxxxx> wrote:
From: Boris Derzhavets <bderzhavets@xxxxxxxxx>
Subject: Re: [Xen-devel] Re: 2.6.37-rc1 mainline domU - BUG: unable to handle kernel paging request
To: "Bruce Edge" <bruce.edge@xxxxxxxxx>
Cc: "Jeremy Fitzhardinge" <jeremy@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, "Konrad Rzeszutek Wilk" <konrad.wilk@xxxxxxxxxx>
Date: Thursday, November 18, 2010, 12:05 PM
Bruce,
You should be able to apply the patches to mainline 2.6.37-rc2 cleanly. These patches are taken from MY's kernel-2.6.37-rc2.git0.fc15.src.rpm. I have already applied them on Ubuntu 10.10 to the uncompressed mainline rc2.
I should also note that
# mount IP-Dom0:/home/user1 /mnt/nfs
# cd /mnt/nfs
# ls -l
crashes the DomU immediately in text mode. In graphics mode it doesn't necessarily happen every time. The DomU might survive this "hack" and crash one hour later for another reason.
Boris.
--- On Thu, 11/18/10, Bruce Edge <bruce.edge@xxxxxxxxx> wrote:
From: Bruce Edge <bruce.edge@xxxxxxxxx>
Subject: Re: [Xen-devel] Re: 2.6.37-rc1 mainline domU - BUG: unable to handle kernel paging request
To: "Boris Derzhavets" <bderzhavets@xxxxxxxxx>
Cc: "Konrad Rzeszutek Wilk" <konrad.wilk@xxxxxxxxxx>, "Jeremy Fitzhardinge" <jeremy@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx
Date: Thursday, November 18, 2010, 11:40 AM
On Thu, Nov 18, 2010 at 2:34 AM, Boris Derzhavets <bderzhavets@xxxxxxxxx> wrote:
Could you apply the two attached patches on top of 2.6.37-rc2 and see whether they give some improvement or not (with an active NFS client in the DomU)?
Boris
Hi Boris,
Are you using the mainline kernel or a pvops branch with these patches? Maybe I'm doing something wrong, but they don't apply cleanly with 2.6.37-rc2:
%> patch --dry-run <../patches.2.6.37/xen.next-2.6.37.patch
patching file pgtable.h
Hunk #1 FAILED at 399.
1 out of 1 hunk FAILED -- saving rejects to file pgtable.h.rej
patching file pgtable.c
Hunk #1 FAILED at 15.
1 out of 1 hunk FAILED -- saving rejects to file pgtable.c.rej
patching file ttm_bo_vm.c
Hunk #1 FAILED at 273.
Hunk #2 FAILED at 288.
2 out of 2 hunks FAILED -- saving rejects to file ttm_bo_vm.c.rej
......
%> patch --dry-run <../patches.2.6.37/xen.pcifront.fixes.patch
patching file enlighten.c
Hunk #1 FAILED at 1090.
Hunk #2 FAILED at 1202.
2 out of 2 hunks FAILED -- saving rejects to file enlighten.c.rej
patching file setup.c
Hunk #1 FAILED at 337.
Hunk #2 FAILED at 356.
2 out of 2 hunks FAILED -- saving rejects to file setup.c.rej
Same result for linux-2.6.37-xen-next branch.
-Bruce
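A possible explanation for the failed hunks, not confirmed anywhere in this thread: the output above names bare files (pgtable.h, enlighten.c), which is what GNU patch does when no -p option is given and it strips all leading directories. Kernel patches are normally written relative to the top of the source tree with an a/ or b/ prefix, so they are usually applied from the tree root with -p1. The sketch below shows that on a throwaway tree; every path in it is made up for illustration and has nothing to do with the real kernel sources or the attached patches.

```shell
#!/bin/sh
# Sketch on a throwaway tree (all paths here are hypothetical, not kernel paths).
work=$(mktemp -d)
mkdir -p "$work/a/sub" "$work/b/sub" "$work/tree/sub"
printf 'one\ntwo\n' > "$work/a/sub/file.txt"         # "old" version
printf 'one\n2\n'   > "$work/b/sub/file.txt"         # "new" version
cp "$work/a/sub/file.txt" "$work/tree/sub/file.txt"  # the tree to be patched

if command -v patch >/dev/null 2>&1 && command -v diff >/dev/null 2>&1; then
    # diff exits with status 1 when the files differ; that is expected here.
    (cd "$work" && diff -u a/sub/file.txt b/sub/file.txt > demo.patch) || true
    # -p1 strips the leading "a/" or "b/", leaving sub/file.txt,
    # which exists relative to the top of the tree.
    if (cd "$work/tree" && patch -p1 --dry-run < ../demo.patch >/dev/null 2>&1); then
        p1_result=applies
    else
        p1_result=fails
    fi
else
    p1_result=skipped   # patch or diff is not installed on this machine
fi
echo "patch -p1 --dry-run: $p1_result"
rm -rf "$work"
```

If this is what happened here, running the same dry run from the top of the 2.6.37-rc2 tree with -p1 added would be the first thing to try.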
--- On Wed, 11/17/10, Bruce Edge <bruce.edge@xxxxxxxxx> wrote:
From: Bruce Edge <bruce.edge@xxxxxxxxx>
Subject: Re: [Xen-devel] Re: 2.6.37-rc1 mainline domU - BUG: unable to handle kernel paging request
To: "Boris Derzhavets" <bderzhavets@xxxxxxxxx>
Cc: "Konrad Rzeszutek Wilk" <konrad.wilk@xxxxxxxxxx>, "Jeremy Fitzhardinge" <jeremy@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx
Date: Wednesday, November 17, 2010, 4:28 PM
On Tue, Nov 16, 2010 at 1:49 PM, Boris Derzhavets <bderzhavets@xxxxxxxxx> wrote:
Yes, here we are
[ 186.975228] ------------[ cut here ]------------
[ 186.975245] kernel BUG at mm/mmap.c:2399!
[ 186.975254] invalid opcode: 0000 [#1] SMP
[ 186.975269] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[ 186.975284] CPU 0
[ 186.975290] Modules linked in: nfs fscache deflate zlib_deflate ctr camellia cast5 rmd160 crypto_null ccm serpent blowfish twofish_generic twofish_x86_64 twofish_common ecb xcbc cbc sha256_generic sha512_generic des_generic cryptd aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc ipv6 uinput xen_netfront microcode xen_blkfront [last unloaded: scsi_wait_scan]
[ 186.975507]
[ 186.975515] Pid: 1562, comm: ls Not tainted 2.6.37-0.1.rc1.git8.xendom0.fc14.x86_64 #1 /
[ 186.975529] RIP: e030:[<ffffffff8110ada1>] [<ffffffff8110ada1>] exit_mmap+0x10c/0x119
[ 186.975550] RSP: e02b:ffff8800781bde18 EFLAGS: 00010202
[ 186.975560] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 186.975573] RDX: 00000000914a9149 RSI: 0000000000000001 RDI: ffffea00000c0280
[ 186.975585] RBP: ffff8800781bde48 R08: ffffea00000c0280 R09: 0000000000000001
[ 186.975598] R10: ffffffff8100750f R11: ffffea0000967778 R12: ffff880076c68b00
[ 186.975610] R13: ffff88007f83f1e0 R14: ffff880076c68b68 R15: 0000000000000001
[ 186.975625] FS: 00007f8e471d97c0(0000) GS:ffff88007f831000(0000) knlGS:0000000000000000
[ 186.975639] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 186.975650] CR2: 00007f8e464a9940 CR3: 0000000001a03000 CR4: 0000000000002660
[ 186.975663] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 186.976012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 186.976012] Process ls (pid: 1562, threadinfo ffff8800781bc000, task ffff8800788223e0)
[ 186.976012] Stack:
[ 186.976012] 000000000000006b ffff88007f83f1e0 ffff8800781bde38 ffff880076c68b00
[ 186.976012] ffff880076c68c40 ffff8800788229d0 ffff8800781bde68 ffffffff810505fc
[ 186.976012] ffff8800788223e0 ffff880076c68b00 ffff8800781bdeb8 ffffffff81056747
[ 186.976012] Call Trace:
[ 186.976012] [<ffffffff810505fc>] mmput+0x65/0xd8
[ 186.976012] [<ffffffff81056747>] exit_mm+0x13e/0x14b
[ 186.976012] [<ffffffff81056976>] do_exit+0x222/0x7c6
[ 186.976012] [<ffffffff8100750f>] ? xen_restore_fl_direct_end+0x0/0x1
[ 186.976012] [<ffffffff8107ea7c>] ? arch_local_irq_restore+0xb/0xd
[ 186.976012] [<ffffffff814b3949>] ? lockdep_sys_exit_thunk+0x35/0x67
[ 186.976012] [<ffffffff810571b0>] do_group_exit+0x88/0xb6
[ 186.976012] [<ffffffff810571f5>] sys_exit_group+0x17/0x1b
[ 186.976012] [<ffffffff8100acf2>] system_call_fastpath+0x16/0x1b
[ 186.976012] Code: 8d 7d 18 e8 c3 8a 00 00 41 c7 45 08 00 00 00 00 48 89 df e8 0d e9 ff ff 48 85 c0 48 89 c3 75 f0 49 83 bc 24 98 01 00 00 00 74 02 <0f> 0b 48 83 c4 18 5b 41 5c 41 5d c9 c3 55 48 89 e5 41 54 53 48
[ 186.976012] RIP [<ffffffff8110ada1>] exit_mmap+0x10c/0x119
[ 186.976012] RSP <ffff8800781bde18>
[ 186.976012] ---[ end trace c0f4eff4054a67e4 ]---
[ 186.976012] Fixing recursive fault but reboot is needed!
Message from syslogd@fedora14 at Nov 17 00:47:40 ...
kernel:[ 186.975228] ------------[ cut here ]------------
Message from syslogd@fedora14 at Nov 17 00:47:40 ...
kernel:[ 186.975254] invalid opcode: 0000 [#1] SMP
Message from syslogd@fedora14 at Nov 17 00:47:40 ...
kernel:[ 186.975269] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
Message from syslogd@fedora14 at Nov 17 00:47:40 ...
kernel:[ 186.976012] Stack:
Message from syslogd@fedora14 at Nov 17 00:47:40 ...
kernel:[ 186.976012] Call Trace:
Message from syslogd@fedora14 at Nov 17 00:47:40 ...
kernel:[ 186.976012] Code: 8d 7d 18 e8 c3 8a 00 00 41 c7 45 08 00 00 00 00 48 89 df e8 0d e9 ff ff 48 85 c0 48 89 c3 75 f0 49 83 bc 24 98 01 00 00 00 74 02 <0f> 0b 48 83 c4 18 5b 41 5c 41 5d c9 c3 55 48 89 e5 41 54 53 48
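When comparing this oops with the rc1 one, it can help to reduce the Call Trace to its bare list of function names. The following is a minimal sketch, assuming the "symbol+0xoffset/0xlength" entry format seen in the trace above. The helper name trace_symbols is made up for illustration and is not part of any kernel or Xen tool.

```shell
#!/bin/sh
# Hypothetical helper: print the function name of every
# "symbol+0xoff/0xlen" entry found on stdin, one per line.
trace_symbols() {
    grep -oE '[A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+/0x[0-9a-f]+' | sed 's/+.*//'
}

# Three Call Trace lines copied from the oops above.
trace='[ 186.976012] [<ffffffff810505fc>] mmput+0x65/0xd8
[ 186.976012] [<ffffffff81056747>] exit_mm+0x13e/0x14b
[ 186.976012] [<ffffffff81056976>] do_exit+0x222/0x7c6'

symbols=$(printf '%s\n' "$trace" | trace_symbols)
printf '%s\n' "$symbols"
```

For the three sample lines this prints mmput, exit_mm and do_exit, one per line. Feeding two saved traces through the same function and running diff on the results is a quick way to see whether the call paths match.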
On Tue, Nov 16, 2010 at 12:43:28PM -0800, Boris Derzhavets wrote:
> > Huh. I .. what? I am confused. I thought we established that the issue
> > was not related to Xen PCI front? You also seem to uncomment the
> > upstream.core.patches and the xen.pvhvm.patch - why?
>
> I cannot uncomment upstream.core.patches and the xen.pvhvm.patch
> it gives failed HUNKs

Uhh.. I am even more confused.

> > Ok, they are.. v2.6.37-rc2 which came out today has the fixes
>
> I am pretty sure rc2 doesn't contain everything from xen.next-2.6.37.patch,
> gntdev's stuff for sure. I've built 2.6.37-rc2 kernel rpms and loaded
> kernel-2.6.27-rc2.git0.xendom0.x86_64 under Xen 4.0.1.
> Device /dev/xen/gntdev has not been created. I understand that it's
> unrelated to DomU ( related to Dom0) , but once again with rc2 in DomU i cannot
> get 3.2 GB copied over to DomU from NFS share at Dom0.

So what I think you are saying is that you keep on getting the bug in DomU? Is the stack-trace the same as in rc1?
I haven't had much time to look into the broken/working version issues here, but I did confirm a couple of points:
1) 2.6.37-rc2 still has the same problem
2) The problem goes away if one is not using NFS.
Not staggeringly helpful, I know, but it's one small data point.
-Bruce
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel