
Re: [Xen-devel] [Xen-users] Xen 4.3.1 / Linux 3.12 panic



On Wed, Nov 06, 2013 at 02:25:28PM +0100, Wouter de Geus wrote:
> * Ian Campbell <Ian.Campbell@xxxxxxxxxx> [2013-11-06 10:51:07 +0000]:
> 
> > > If this turns out to be stable I'll try again with cpufreq=dom0 to see if
> > > that's also stable. I'll report my findings if you care.
> > 
> > Please do.
> 
> With cpufreq=none I've been able to run through a windows 2008 installation
> and some kernel compiles without problems.  After that I rebooted with
> cpufreq=dom0, and within 5 minutes ran into the first oops again:

Is there a particular reason you tried the 'cpufreq' option in the
first place? Sorry if that was already answered earlier.
> 
> [  428.105061] BUG: unable to handle kernel paging request at ffffea0000dd8a48
> [  428.105103] IP: [<ffffffff8115c126>] unmap_single_vma+0x426/0x820
> [  428.105115] PGD 1281d6067 PUD 1281d5067 PMD 1281ce067 PTE 801000097bf53068
> [  428.105123] Oops: 0000 [#1] SMP 
> [  428.105127] Modules linked in:
> [  428.105133] CPU: 3 PID: 1786 Comm: sh Not tainted 3.12.0-Desman #32
> [  428.105138] Hardware name: Supermicro H8DG6/H8DGi/H8DG6/H8DGi, BIOS 3.0 09/10/2012
> [  428.105142] task: ffff88011dbb1590 ti: ffff8800d5088000 task.ti: ffff8800d5088000
> [  428.105147] RIP: e030:[<ffffffff8115c126>]  [<ffffffff8115c126>] unmap_single_vma+0x426/0x820
> [  428.105154] RSP: e02b:ffff8800d5089d30  EFLAGS: 00010246
> [  428.105157] RAX: 80000008002db165 RBX: ffff8800d2ad0d60 RCX: 0000000000dd8a40
> [  428.105161] RDX: 80000008002db165 RSI: 0000000001fac000 RDI: 80000008002db165
> [  428.105165] RBP: ffffea0000dd8a40 R08: ffff8800d2b52cf0 R09: 00000000fffffffa
> [  428.105169] R10: 0000000000000a6f R11: 00000063ad0a7abc R12: 0000000001fe5000
> [  428.105173] R13: ffffc00000000fff R14: 0000000001fac000 R15: ffff8800d5089e40
> [  428.105181] FS:  00002b839c48c600(0000) GS:ffff880122a60000(0000) knlGS:0000000000000000
> [  428.105186] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  428.105215] CR2: ffffea0000dd8a48 CR3: 00000000021de000 CR4: 0000000000040660
> [  428.105220] Stack:
> [  428.105222]  ffff8800d6961c00 0000000000000000 ffff8800d2b52cf0 0000000000000000
> [  428.105229]  ffffea00034ab430 80000008002db165 ffff8800c331c078 0000000001fe5000
> [  428.105236]  ffff880000000000 00003ffffffff000 ffff88011dbb1590 0000000001fe4fff
> [  428.105242] Call Trace:
> [  428.105248]  [<ffffffff8115d4c1>] ? unmap_vmas+0x41/0x90
> [  428.105254]  [<ffffffff81165e1a>] ? exit_mmap+0x8a/0x150
> [  428.105261]  [<ffffffff810abc19>] ? mmput+0x49/0x100
> [  428.105267]  [<ffffffff810afb53>] ? do_exit+0x273/0xa30
> [  428.105273]  [<ffffffff810dc045>] ? vtime_account_user+0x45/0x60
> [  428.105278]  [<ffffffff810b10d4>] ? do_group_exit+0x34/0xa0
> [  428.105284]  [<ffffffff810b114b>] ? SyS_exit_group+0xb/0x10
> [  428.105290]  [<ffffffff81d4fd8f>] ? tracesys+0xe1/0xe6
> [  428.105294] Code: 48 8b 3c 24 4c 89 f6 48 89 da 66 66 66 90 66 66 90 41 80 4f 18 01 48 85 ed 0f 84 7a ff ff ff 48 83 7c 24 18 00 0f 85 02 03 00 00 <f6> 45 08 01 0f 84 70 01 00 00 48 89 ef ff 8c 24 98 00 00 00 e8
> [  428.105347] RIP  [<ffffffff8115c126>] unmap_single_vma+0x426/0x820
> [  428.105353]  RSP <ffff8800d5089d30>
> [  428.105356] CR2: ffffea0000dd8a48
> [  428.105360] ---[ end trace 81935aa1c6524ae3 ]---
> 
> > I suspect it shouldn't be necessary to use command lines to override
> > these things, but I've no idea how to diagnose this further.
> 
> Removing the entire cpufreq part from my dom0 kernel might help :)
> But then again, if that's a problem I would like the hypervisor to detect
> and avoid this problem if that's possible.

So cpufreq=dom0 is kind of a no-op, as the Linux kernel will disable
the native cpufreq machinery. This is done because it does not make sense
for a Linux dom0 to control the CPU frequency when it has no idea of the
workloads (the hypervisor does).

But with the 'cpufreq=dom0' you are getting faults.

So the other question is - does anything happen if you disable ACPI power
states in the BIOS?
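
As a side note on the oops quoted above, here is a minimal sketch (not a
definitive tool) for sanity-checking where the faulting address points. The
faulting bytes in the Code: line, `<f6> 45 08 01`, decode to
`testb $0x1,0x8(%rbp)`, so CR2 should equal RBP + 8. The helper names
`parse_oops_registers` and `in_vmemmap` are made up for illustration, and the
vmemmap range is an assumption based on the usual x86_64 layout with 4-level
paging; it may differ on other kernel configurations.

```python
import re

# Assumption: standard x86_64 4-level-paging layout, where the virtual
# memory map (the array of struct page) lives in this range.
VMEMMAP_START = 0xFFFFEA0000000000
VMEMMAP_END = 0xFFFFEAFFFFFFFFFF


def parse_oops_registers(text):
    """Return {name: value} for RBP and CR2 found in an oops dump.

    Quote markers ("> ") are stripped and all lines are joined with
    spaces so that the dump can be searched as a single string.
    """
    flat = " ".join(line.lstrip("> ").strip() for line in text.splitlines())
    regs = {}
    for name in ("RBP", "CR2"):
        m = re.search(rf"\b{name}:\s*([0-9a-f]{{16}})\b", flat)
        if m:
            regs[name] = int(m.group(1), 16)
    return regs


def in_vmemmap(addr):
    """True if addr falls inside the assumed vmemmap range."""
    return VMEMMAP_START <= addr <= VMEMMAP_END


# Two register lines from the oops, as wrapped by the mail client.
OOPS = """\
> [  428.105165] RBP: ffffea0000dd8a40 R08: ffff8800d2b52cf0 R09:
> 00000000fffffffa
> [  428.105215] CR2: ffffea0000dd8a48 CR3: 00000000021de000 CR4:
> 0000000000040660
"""

regs = parse_oops_registers(OOPS)
offset = regs["CR2"] - regs["RBP"]
print(hex(offset), in_vmemmap(regs["CR2"]))  # prints: 0x8 True
```

If the layout assumption holds, the fault is a read of a struct page entry
(at offset 8 from the pointer in RBP) whose vmemmap address is not mapped,
which suggests the page-table entry being torn down referred to a bogus
frame. That is only a reading of the register dump, not a diagnosis of the
root cause.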

> 
> > Once you have the findings if you could post a summary to xen-devel and
> > CC jbeulich@xxxxxxxx & insong.liu@xxxxxxxxx (cpufreq/power mgmt
> > maintainers) perhaps they can advise.
> 
> Summary:
> --------
> The issue: Xen 4.3.1 with my Linux 3.12 build (with cpufreq) panics (kernel
> paging request faults, GPFs, bad page state), usually within a few minutes.
> When Xen is booted with cpufreq=none the problem seems to disappear, with
> cpufreq=dom0 the problem is still there.
> The machine I run this on is a dual opteron 6212 with 64GB ECC RAM on a
> Supermicro H8DGi board.
> 
> Regards,
> 
> Wouter.
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
