
Re: [REGRESSION] Linux 6.15.1 xen/dom0 domain_crash_sync called from entry.S



On 6/11/25 5:34 PM, Chuck Zmudzinski wrote:
> On 6/10/25 12:22 AM, Jürgen Groß wrote:
>> On 10.06.25 00:43, Chuck Zmudzinski wrote:
>>> Hi,
>>> 
>>> I am seeing the following regression between Linux 6.14.8 and 6.15.1.
>>> 
>>> Kernel version 6.14.8 boots fine but version 6.15.1 crashes and
>>> reboots on Xen. I don't know if 6.14.9 or 6.14.10 is affected, or
>>> if 6.15 or the 6.15 release candidates are affected because I did
>>> not test them.
>>> 
>>> Also, Linux 6.15.1 boots fine on bare metal without Xen.
>>> 
>>> Hardware: Intel i5-14500 Raptor Lake CPU, and ASRock B760M PG motherboard 
>>> and 32 GB RAM.
>>> 
>>> Xen version: 4.19.2 (mockbuild@xxxxxxxxxxxx) (gcc (GCC) 13.3.1 20240611 
>>> (Red Hat 13.3.1-2)) debug=n Sun Apr 13 15:24:29 PDT 2025
>>> 
>>> Xen Command line: placeholder dom0_mem=2G,max:2G conring_size=32k 
>>> com1=9600,8n1,0x40c0,16,1:0.0 console=com1
>>> 
>>> Linux version 6.15.1-1.el9.elrepo.x86_64 
>>> (mockbuild@5b7a5dab3b71429898b4f8474fab8fa0) (gcc (GCC) 11.5.0 20240719 
>>> (Red Hat 11.5.0-5), GNU ld version 2.35.2-63.el9) #1 SMP PREEMPT_DYNAMIC 
>>> Wed Jun  4 16:42:58 EDT 2025
>>> 
>>> Linux Kernel Command line: placeholder root=/dev/mapper/systems-rootalma ro 
>>> crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M 
>>> resume=UUID=2ddc2e3b-8f7b-498b-a4e8-bb4d33a1e5a7 console=hvc0
>>> 
>>> The Linux 6.15.1 dom0 kernel causes Xen to crash and reboot. Here are
>>> the last messages on the serial console (includes messages from both
>>> dom0 and Xen) before the crash:
>>> 
>>> [    0.301573] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl
>>> [    0.301577] Register File Data Sampling: Vulnerable: No microcode
>>> [    0.301581] ITS: Mitigation: Aligned branch/return thunks
>>> [    0.301594] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
>>> [    0.301598] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
>>> [    0.301602] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
>>> [    0.301605] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
>>> [    0.301609] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
>>> 
>>> (XEN) Pagetable walk from ffffc9003ffffff8:
>>> (XEN)  L4[0x192] = 0000000855bee067 0000000000060e56
>>> (XEN)  L3[0x000] = 0000000855bed067 0000000000060e55
>>> (XEN)  L2[0x1ff] = 0000000855bf0067 0000000000060e58
>>> (XEN)  L1[0x1ff] = 8010000855bf2025 0000000000060e5a
>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d04036e5b0 x86_64/entry.S#domain_crash_page_fault_6x8+0/0x4
>>> (XEN) Domain 0 (vcpu#0) crashed on cpu#11:
>>> (XEN) ----[ Xen-4.19.2  x86_64  debug=n  Not tainted ]----
>>> (XEN) CPU:    11
>>> (XEN) RIP:    e033:[<ffffffff810014fe>]
>>> (XEN) RFLAGS: 0000000000010206   EM: 1   CONTEXT: pv guest (d0v0)
>>> (XEN) rax: ffffffff81fb12d0   rbx: 000000000000029a   rcx: 000000000000000c
>>> (XEN) rdx: 000000000000029a   rsi: ffffffff81000b99   rdi: ffffc900400000f0
>>> (XEN) rbp: 000000000000014d   rsp: ffffc90040000000   r8:  0000000000000f9c
>>> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
>>> (XEN) r12: 000000000000000c   r13: ffffffff82771530   r14: ffffffff827724cc
>>> (XEN) r15: ffffc900400000f0   cr0: 0000000080050033   cr4: 0000000000b526e0
>>> (XEN) cr3: 000000086ae24000   cr2: ffffc9003ffffff8
>>> (XEN) fsb: 0000000000000000   gsb: ffff88819ac55000   gss: 0000000000000000
>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
>>> (XEN) Guest stack trace from rsp=ffffc90040000000:
>>> (XEN)   Stack empty.
>>> (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
>>> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
>>> 
>>> I searched mailing lists but could not find a report similar to what I am
>>> seeing here.
>>> 
>>> I don't know what to try except to git bisect, but I have not done that yet.
>> 
>> This is a known issue.
>> 
>> A patch series to fix that has been posted:
>> 
>> https://lore.kernel.org/lkml/20250603111446.2609381-1-rppt@xxxxxxxxxx/
>> 
>> 
>> Juergen
> 
> Yes, that patch set (the original 5 patches) fixes this issue (I
> tested it on top of 6.15.2 released yesterday).
> 
> There is a suggested sixth patch in the thread, and I tried that
> also but it caused a kernel panic in Xen PV dom0.
> 
> Thanks,
> 
> Chuck Zmudzinski

The fix for this issue landed in Linux 6.15.4. Thanks!

Chuck Zmudzinski
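
For anyone reading the Xen dump quoted above, here is a minimal Python sketch of how the faulting address in cr2 maps onto the L4..L1 indices Xen prints in its pagetable walk, and what the low flag bits of the L1 entry say. It assumes standard x86-64 4-level paging with 4 KiB pages. The helper names (pt_indices, pte_flags) are made up for illustration and are not Xen or Linux functions. Only the architecturally defined flag bits are decoded; software-available bits in the entry are left alone.

```python
# Illustrative only: helper names are hypothetical, not Xen or Linux APIs.
# Assumes x86-64 4-level paging with 4 KiB pages.

def pt_indices(va: int) -> dict:
    """Split a virtual address into four 9-bit table indices and a page offset."""
    return {
        "L4": (va >> 39) & 0x1FF,
        "L3": (va >> 30) & 0x1FF,
        "L2": (va >> 21) & 0x1FF,
        "L1": (va >> 12) & 0x1FF,
        "offset": va & 0xFFF,
    }


def pte_flags(pte: int) -> dict:
    """Decode the architecturally defined flag bits of a page table entry."""
    return {
        "present": bool(pte & (1 << 0)),
        "writable": bool(pte & (1 << 1)),
        "user": bool(pte & (1 << 2)),
        "accessed": bool(pte & (1 << 5)),
        "dirty": bool(pte & (1 << 6)),
        "no_execute": bool(pte & (1 << 63)),
    }


# Values copied from the dump in the original report.
CR2 = 0xFFFFC9003FFFFFF8       # faulting address
RSP = 0xFFFFC90040000000       # guest stack pointer at the time of the crash
L1_ENTRY = 0x8010000855BF2025  # first value on the L1[0x1ff] line

if __name__ == "__main__":
    print({name: hex(value) for name, value in pt_indices(CR2).items()})
    print(pte_flags(L1_ENTRY))
    print("rsp - cr2 =", RSP - CR2)
```

The indices come out as 0x192, 0x0, 0x1ff and 0x1ff, matching the walk Xen printed, and the L1 entry decodes as present but not writable. cr2 also sits 8 bytes below rsp. Whether that read-only mapping is what triggered the fault is my reading of the dump, not something stated in the thread.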




 

