
Re: [REGRESSION] Linux 6.15.1 xen/dom0 domain_crash_sync called from entry.S



On 6/10/25 12:22 AM, Jürgen Groß wrote:
> On 10.06.25 00:43, Chuck Zmudzinski wrote:
>> Hi,
>> 
>> I am seeing the following regression between Linux 6.14.8 and 6.15.1.
>> 
>> Kernel version 6.14.8 boots fine but version 6.15.1 crashes and
>> reboots on Xen. I don't know if 6.14.9 or 6.14.10 is affected, or
>> if 6.15 or the 6.15 release candidates are affected because I did
>> not test them.
>> 
>> Also, Linux 6.15.1 boots fine on bare metal without Xen.
>> 
>> Hardware: Intel i5-14500 (Raptor Lake) CPU, ASRock B760M PG motherboard,
>> and 32 GB RAM.
>> 
>> Xen version: 4.19.2 (mockbuild@xxxxxxxxxxxx) (gcc (GCC) 13.3.1 20240611 (Red Hat 13.3.1-2)) debug=n Sun Apr 13 15:24:29 PDT 2025
>> 
>> Xen Command line: placeholder dom0_mem=2G,max:2G conring_size=32k com1=9600,8n1,0x40c0,16,1:0.0 console=com1
>> 
>> Linux version 6.15.1-1.el9.elrepo.x86_64 (mockbuild@5b7a5dab3b71429898b4f8474fab8fa0) (gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5), GNU ld version 2.35.2-63.el9) #1 SMP PREEMPT_DYNAMIC Wed Jun  4 16:42:58 EDT 2025
>> 
>> Linux Kernel Command line: placeholder root=/dev/mapper/systems-rootalma ro crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M resume=UUID=2ddc2e3b-8f7b-498b-a4e8-bb4d33a1e5a7 console=hvc0
>> 
>> The Linux 6.15.1 dom0 kernel crashes under Xen and the machine reboots.
>> Here are the last messages on the serial console (from both dom0 and
>> Xen) before the crash:
>> 
>> [    0.301573] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl
>> 
>> [    0.301577] Register File Data Sampling: Vulnerable: No microcode
>> 
>> [    0.301581] ITS: Mitigation: Aligned branch/return thunks
>> 
>> [    0.301594] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
>> 
>> [    0.301598] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
>> 
>> [    0.301602] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
>> 
>> [    0.301605] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
>> 
>> [    0.301609] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
>> 
>> (XEN) Pagetable walk from ffffc9003ffffff8:
>> (XEN)  L4[0x192] = 0000000855bee067 0000000000060e56
>> (XEN)  L3[0x000] = 0000000855bed067 0000000000060e55
>> (XEN)  L2[0x1ff] = 0000000855bf0067 0000000000060e58
>> (XEN)  L1[0x1ff] = 8010000855bf2025 0000000000060e5a
>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d04036e5b0 x86_64/entry.S#domain_crash_page_fault_6x8+0/0x4
>> (XEN) Domain 0 (vcpu#0) crashed on cpu#11:
>> (XEN) ----[ Xen-4.19.2  x86_64  debug=n  Not tainted ]----
>> (XEN) CPU:    11
>> (XEN) RIP:    e033:[<ffffffff810014fe>]
>> (XEN) RFLAGS: 0000000000010206   EM: 1   CONTEXT: pv guest (d0v0)
>> (XEN) rax: ffffffff81fb12d0   rbx: 000000000000029a   rcx: 000000000000000c
>> (XEN) rdx: 000000000000029a   rsi: ffffffff81000b99   rdi: ffffc900400000f0
>> (XEN) rbp: 000000000000014d   rsp: ffffc90040000000   r8:  0000000000000f9c
>> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
>> (XEN) r12: 000000000000000c   r13: ffffffff82771530   r14: ffffffff827724cc
>> (XEN) r15: ffffc900400000f0   cr0: 0000000080050033   cr4: 0000000000b526e0
>> (XEN) cr3: 000000086ae24000   cr2: ffffc9003ffffff8
>> (XEN) fsb: 0000000000000000   gsb: ffff88819ac55000   gss: 0000000000000000
>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
>> (XEN) Guest stack trace from rsp=ffffc90040000000:
>> (XEN)   Stack empty.
>> (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
>> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
>> 
>> I searched mailing lists but could not find a report similar to what I am
>> seeing here.
>> 
>> I don't know what to try except to git bisect, but I have not done that yet.
> 
> This is a known issue.
> 
> A patch series to fix that has been posted:
> 
> https://lore.kernel.org/lkml/20250603111446.2609381-1-rppt@xxxxxxxxxx/
> 
> 
> Juergen

Yes, that patch set (the original 5 patches) fixes this issue. I
tested it on top of 6.15.2, which was released yesterday.

I also tried the sixth patch suggested in that thread, but it
caused a kernel panic in the Xen PV dom0.

Thanks,

Chuck Zmudzinski
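
For readers decoding the quoted Xen output, here is a minimal sketch that
splits the faulting address (cr2) into page-table indices and decodes the
low flag bits of the L1 entry. It assumes standard x86-64 4-level paging
with 4 KiB pages; the names split_va and pte_flags are hypothetical helpers
and are not part of Xen or Linux. It only re-derives values already printed
in the log and does not establish the root cause.

```python
# Sketch only: assumes x86-64 4-level paging with 4 KiB pages.
# split_va and pte_flags are illustrative helper names, not Xen/Linux APIs.

def split_va(va):
    """Return (L4, L3, L2, L1, page offset) for a virtual address."""
    return (
        (va >> 39) & 0x1FF,  # L4 index, bits 47..39
        (va >> 30) & 0x1FF,  # L3 index, bits 38..30
        (va >> 21) & 0x1FF,  # L2 index, bits 29..21
        (va >> 12) & 0x1FF,  # L1 index, bits 20..12
        va & 0xFFF,          # byte offset within the 4 KiB page
    )

def pte_flags(pte):
    """Decode a few architectural flag bits of a page-table entry."""
    return {
        "present": bool(pte & (1 << 0)),
        "writable": bool(pte & (1 << 1)),
        "user": bool(pte & (1 << 2)),
        "accessed": bool(pte & (1 << 5)),
        "no_execute": bool(pte & (1 << 63)),
    }

# Values copied from the quoted crash log.
CR2 = 0xFFFFC9003FFFFFF8
RSP = 0xFFFFC90040000000
L1_ENTRY = 0x8010000855BF2025

if __name__ == "__main__":
    print([hex(i) for i in split_va(CR2)])
    print(pte_flags(L1_ENTRY))
    print("cr2 is rsp - 8:", CR2 == RSP - 8)
```

The indices match the L4[0x192], L3[0x000], L2[0x1ff], L1[0x1ff] lines in
the walk. The L1 entry decodes as present but not writable, and cr2 sits 8
bytes below the reported rsp, which would be consistent with a write just
below the stack pointer hitting a read-only mapping.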



 

