
[Xen-devel] d0v0 Unhandled general protection fault with 4.9.x on brand new hardware



Hi everyone,

I'm hitting an issue that prevents me from running Xen on new hardware: an HPE DL360 Gen10 with dual Xeon Silver 4108 CPUs and 256GB of ECC DDR4 RDIMM. It happens with both CentOS 6 and CentOS 7.

When I try to boot with the Xen kernel 4.9.58-29.el6, Dom0 crashes with the following error (captured from the serial console):

(XEN) Brought up 32 CPUs
(XEN) ACPI sleep modes: S3
(XEN) VPMU: disabled
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) Dom0 has maximum 1240 PIRQs
(XEN) NX (Execute Disable) protection active
(XEN) *** LOADING DOMAIN 0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x26a0000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000003fdc000000->0000003fe0000000 (1012299 pages to be allocated)
(XEN)  Init. ramdisk: 000000403b04b000->000000403fdff200
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff826a0000
(XEN)  Init. ramdisk: 0000000000000000->0000000000000000
(XEN)  Phys-Mach map: 0000008000000000->0000008000800000
(XEN)  Start info:    ffffffff826a0000->ffffffff826a04b4
(XEN)  Page tables:   ffffffff826a1000->ffffffff826b8000
(XEN)  Boot stack:    ffffffff826b8000->ffffffff826b9000
(XEN)  TOTAL:         ffffffff80000000->ffffffff82800000
(XEN)  ENTRY ADDRESS: ffffffff821a9180
(XEN) Dom0 has maximum 32 VCPUs
(XEN) Scrubbing Free RAM on 2 nodes using 16 CPUs
(XEN) .................................................................................................................................done.
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 300kB init memory.
mapping kernel into physical memory
about to get started...
(XEN) d0v0 Unhandled general protection fault fault/trap [#13, ec=0000]
(XEN) domain_crash_sync called from entry.S: fault at ffff82d08022f983 create_bounce_frame+0x12b/0x13a
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.6.6-3.el6  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff8103cf38>]
(XEN) RFLAGS: 0000000000000246   EM: 1   CONTEXT: pv guest (d0v0)
(XEN) rax: 00000000000002ff   rbx: ffffffff8217a1a0   rcx: 0000000000000000
(XEN) rdx: 0000000000000000   rsi: 00000000000002ff   rdi: 0000000000042660
(XEN) rbp: ffffffff82003dc8   rsp: ffffffff82003d80   r8:  ffffffff82003e0c
(XEN) r9:  ffffffff82003e08   r10: 00000000ffffffff   r11: 00000000ffffffff
(XEN) r12: ffffffff82003e04   r13: ffffffff82003e00   r14: ffffffff82003dfc
(XEN) r15: ffffffff82003df8   cr0: 0000000080050033   cr4: 00000000003526e0
(XEN) cr3: 0000003fde007000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff82003d80:
(XEN)    0000000000000000 00000000ffffffff 0000000000000000 ffffffff8103cf38
(XEN)    000000010000e030 0000000000010046 ffffffff82003dc8 000000000000e02b
(XEN)    ffffffff8103cf28 ffffffff82003e48 ffffffff821bbd3e ffffffff82199890
(XEN)    ffffffff8219a090 ffffffff82199890 ffffffff8219a090 ffffffff82003e38
(XEN)    0000000000100800 00000a8800000000 000002ff00000240 ffffffff8217a1a0
(XEN)    ffffffff8217a1a0 ffffffff82673000 ffffffff82003f20 0000000000000000
(XEN)    0000000000000000 ffffffff82003e78 ffffffff821bb684 0000000001000000
(XEN)    0000037f82673000 ffffffff82003f20 0000000001000000 ffffffff82003e88
(XEN)    ffffffff821bc371 ffffffff82003e98 ffffffff821bc3a7 ffffffff82003ef8
(XEN)    ffffffff821b73e7 ffffffff00000010 ffffffff82003f08 ffffffff82003ec8
(XEN)    ffffffff82003e88 ffffffff8114b100 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 ffffffff82003f28
(XEN)    ffffffff821aa0c6 0000000000000000 0000000000000000 b013b3f5b0133a3e
(XEN)    ffffffff821a97ac ffffffff82003f38 ffffffff821a9386 ffffffff82003ff8
(XEN)    ffffffff821b0dc6 0300000100000032 0000000000000005 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 ffd83a831fc9cbf5
(XEN)    0005065400100800 0000000000000001 0000000000000000 0000000000000000
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
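
For anyone reading the register dump above, the cr4 value can be decoded into its feature flags. This is only a minimal Python sketch, assuming the CR4 bit layout documented in the Intel SDM; the helper name decode_cr4 is made up for illustration and is not part of any Xen tool.

```python
# CR4 bit positions as documented in the Intel SDM.
# Bits missing from this table are silently ignored.
CR4_BITS = {
    0: "VME", 1: "PVI", 2: "TSD", 3: "DE", 4: "PSE", 5: "PAE",
    6: "MCE", 7: "PGE", 8: "PCE", 9: "OSFXSR", 10: "OSXMMEXCPT",
    11: "UMIP", 12: "LA57", 13: "VMXE", 14: "SMXE", 16: "FSGSBASE",
    17: "PCIDE", 18: "OSXSAVE", 20: "SMEP", 21: "SMAP", 22: "PKE",
}


def decode_cr4(value):
    """Return the names of the CR4 bits set in value, lowest bit first."""
    return [name for bit, name in sorted(CR4_BITS.items()) if (value >> bit) & 1]


# cr4 as printed in the crash dump
print(decode_cr4(0x3526E0))
```

For the value in the dump this lists PAE, MCE, PGE, OSFXSR, OSXMMEXCPT, VMXE, FSGSBASE, OSXSAVE, SMEP and SMAP. It says nothing by itself about which instruction raised the fault.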

So far I have tried the following, to no avail:
- installing older Xen kernel versions (4.9.39-29.el6 and 4.9.25-27.el6)
- installing an older Xen version (4.6.3-15.el6) instead of the current one (4.6.6-3.el6)
- disabling Hyper-Threading
- disabling NUMA
- trying several dom0_mem=,max: settings (from 512M up to 20G)
- changing several BIOS options, including power profiles

I read that someone else with a similar problem solved it by setting some parameters on the kernel command line, but I don't know which ones.

Can you help me?

Thanks in advance!


-- Francesco

 

