Re: [Xen-devel] dom0 show call trace and failed to boot on HSW-EX platform

To: "Li, Liang Z" <liang.z.li@xxxxxxxxx>
From: Daniel Kiper <daniel.kiper@xxxxxxxxxx>
Date: Tue, 2 Feb 2016 20:56:16 +0100
Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, David Vrabel <david.vrabel@xxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 02 Feb 2016 19:56:45 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Tue, Feb 02, 2016 at 01:15:13PM +0000, Li, Liang Z wrote:
> > >> We found that dom0 crashes when booting on an HSW-EX server; the dom0
> > >> kernel version is v4.4. By debugging, I found that your patch
> > >> 'x86/xen: discard RAM regions above the maximum reservation', whose
> > >> commit ID is f5775e0b6116b7e2425ccf535243b21, caused the regression.
> > >> The debug message is listed below:
> > >>
> > ==========================================================
> > >>  (XEN) mm.c:884:d0v14 pg_owner 0 l1e_owner 0, but real_pg_owner -1
> > >>  (XEN) mm.c:955:d0v14 Error getting mfn 1080000 (pfn ffffffffffffffff) from L1
> > >>  (XEN) mm.c:1269:d0v14 Failure in alloc_l1_table: entry 0
> > >>  (XEN) mm.c:2175:d0v14 Error while validating mfn 188d903 (pfn 17a7cc) for type
> > >>  (XEN) mm.c:3101:d0v14 Error -16 while pinning mfn 188d903
> > >>  [   33.768792] ------------[ cut here ]------------
> > >> WARNING: CPU: 14 PID: 1 at arch/x86/xen/multicalls.c:129 xen_mc_
> > >>  [   33.783809] Modules linked in:
> > >>  [   33.787304] CPU: 14 PID: 1 Comm: swapper/0 Not tainted 4.4.0 #1
> > >>  [   33.793991] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS
> > >>  [   33.805624]  0000000000000081 ffff88017d2537c8 ffffffff812ff954 000000000000
> > >>  [   33.813961]  0000000000000000 0000000000000081 0000000000000000 ffff88017d25
> > >>  [   33.822300]  ffffffff810ca120 ffffffff81cb7f00 ffff8801879ca280 000000000000
> > >>  [   33.830639] Call Trace:
> > >>  [   33.833457]  [<ffffffff812ff954>] dump_stack+0x48/0x64
> > >>  [   33.839277]  [<ffffffff810ca120>] warn_slowpath_common+0x90/0xd0
> > >>  [   33.846058]  [<ffffffff810ca175>] warn_slowpath_null+0x15/0x20
> > >>  [   33.852659]  [<ffffffff81060133>] xen_mc_flush+0x1c3/0x1d0
> > >>  [   33.858858]  [<ffffffff8106449f>] xen_alloc_pte+0x20f/0x300
> > >>  [   33.865158]  [<ffffffff810beef5>] ? update_page_count+0x45/0x60
> > >>  [   33.871855]  [<ffffffff817a1194>] ? phys_pte_init+0x170/0x183
> > >>  [   33.878345]  [<ffffffff817a148d>] phys_pmd_init+0x2e6/0x389
> > >>  [   33.884649]  [<ffffffff817a17dd>] phys_pud_init+0x2ad/0x3dc
> > >>  [   33.890954]  [<ffffffff817a290d>] kernel_physical_mapping_init+0xec/0x211
> > >>  [   33.898613]  [<ffffffff8179df8d>] init_memory_mapping+0x17d/0x2f0
> > >>  [   33.905496]  [<ffffffff81104f11>] ? __raw_callee_save___pv_queued_spin_unloc
> > >>  [   33.914516]  [<ffffffff813643f7>] ? acpi_os_signal_semaphore+0x2e/0x32
> > >>  [   33.921889]  [<ffffffff810ba7b8>] arch_add_memory+0x48/0xf0
> > >>  [   33.928186]  [<ffffffff8179eb80>] add_memory_resource+0x80/0x110
> > >>  [   33.934967]  [<ffffffff8179ec8d>] add_memory+0x7d/0xc0
> > >>  [   33.940787]  [<ffffffff81399538>] acpi_memory_device_add+0x14f/0x237
> >
> > We shouldn't be adding memory based on the ACPI tables.
> >
> > David
>
> What do you suggest to solve this issue: simply revert the patch, or
> apply a workaround?

Please do not blindly revert anything.

This looks strange. Does this machine support memory hotplug? In any case,
bare-metal memory hotplug should not be used in dom0. I think you should
investigate why the kernel attempts to hotplug memory at this stage.
I suppose that it should not. Once we know why, we can work out how
to fix it.

Daniel
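
As a starting point for that investigation, here is a minimal sketch (not part of the original message) that scans a saved console log for call-trace frames on the ACPI memory hotplug path. It assumes the log has been saved unwrapped, one frame per line, unlike the rewrapped quote above. The names `trace_symbols`, `hotplug_frames` and `HOTPLUG_SYMBOLS` are hypothetical helpers introduced for illustration; the symbol names themselves are taken from the trace quoted in this thread.

```python
import re

# Matches kernel call-trace frames such as
#   [   33.921889]  [<ffffffff810ba7b8>] arch_add_memory+0x48/0xf0
# An optional "?" before the symbol marks a frame the unwinder is unsure about.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Hypothetical watch list: symbols copied from the trace in this thread.
HOTPLUG_SYMBOLS = {"acpi_memory_device_add", "add_memory", "arch_add_memory"}


def trace_symbols(log_text):
    """Return (symbol, reliable) pairs for every frame found in log_text."""
    return [(m.group(2), m.group(1) is None) for m in FRAME_RE.finditer(log_text)]


def hotplug_frames(log_text):
    """Return the reliable frames that belong to the memory hotplug path."""
    return [
        sym
        for sym, reliable in trace_symbols(log_text)
        if reliable and sym in HOTPLUG_SYMBOLS
    ]


if __name__ == "__main__":
    sample = (
        "[   33.865158]  [<ffffffff810beef5>] ? update_page_count+0x45/0x60\n"
        "[   33.921889]  [<ffffffff810ba7b8>] arch_add_memory+0x48/0xf0\n"
        "[   33.934967]  [<ffffffff8179ec8d>] add_memory+0x7d/0xc0\n"
    )
    print(hotplug_frames(sample))
```

If the returned list is non-empty for a dom0 boot log, the trace was reached through memory hotplug, which is the behaviour questioned above. The script only classifies frames; it does not explain why the hotplug was triggered.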
