Re: Yet another S3 issue in Xen 4.14
On Fri, Oct 02, 2020 at 09:19:55PM +0200, Marek Marczykowski-Górecki wrote:
> Disabling efi_get_time() or setting CR4 earlier solves _this_ issue, but
> applied on top of stable-4.14 still doesn't work. Looks like there is
> yet another S3 breakage in between. I'm bisecting it further...

This time I got to this commit:

commit 8e2aa76dc1670e82eaa15683353853bc66bf54fc (refs/bisect/bad)
Author: Dario Faggioli <dfaggioli@xxxxxxxx>
Date:   Thu May 28 23:29:44 2020 +0200

    xen: credit2: limit the max number of CPUs in a runqueue

The failure after S3 resume is slightly different and not really
deterministic: sometimes it hangs immediately, sometimes the system is
interactive for a few seconds and then hangs, and sometimes it crashes
(looks like a panic).

I've tried to switch to credit1, but this seems to be broken in yet
another way, much earlier (at commits where S3 works with credit2, it
crashes on S3 resume with credit1).

(few hours later)

I managed to set up a kdump kernel and got a copy of the vmcore after the
crash. Then I extracted the crash message using strings:

(XEN) Entering ACPI S3 state.
(XEN) [VT-D]Passed iommu=no-igfx option. Disabling IGD VT-d engine.
(XEN) mce_intel.c:773: MCA Capability: firstbank 0, extended MCE MSR 0, BCAST, CMCI
(XEN) CPU0 CMCI LVT vector (0xf1) already installed
(XEN) Finishing wakeup from ACPI S3 state.
(XEN) Enabling non-boot CPUs ...
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) [VT-D]intremap.c:564: MSI index (65535) has an empty entry
(XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2273
(XEN) ----[ Xen-4.14-unstable  x86_64  debug=y   Not tainted ]----
(XEN) CPU:    8
(XEN) RIP:    e008:[<ffff82d040242725>] credit2.c#csched2_unit_wake+0x14f/0x151
(XEN) RFLAGS: 0000000000010087   CONTEXT: hypervisor (d0v0)
(XEN) rax: ffff830250b609e0   rbx: ffff830250b18f10   rcx: 0000003210631000
(XEN) rdx: ffff830250b604a0   rsi: 0000000000000008   rdi: ffff830250b60846
(XEN) rbp: ffff830250ba7d98   rsp: ffff830250ba7d78   r8:  deadbeefdeadf00d
(XEN) r9:  deadbeefdeadf00d   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: ffff830250b0e040   r13: ffff82d04044abc0   r14: 0000000000000008
(XEN) r15: 2f3d053d56f91b80   cr0: 0000000080050033   cr4: 0000000000362660
(XEN) cr3: 0000000210270000   cr2: 0000000000000000
(XEN) fsb: 000077a6b25a2b80   gsb: ffff8881b5400000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen code around <ffff82d040242725> (credit2.c#csched2_unit_wake+0x14f/0x151):
(XEN)  df e8 dc bd ff ff eb ad <0f> 0b 55 48 89 e5 41 57 41 56 41 55 41 54 53 48
(XEN) Xen stack trace from rsp=ffff830250ba7d78:
(XEN)    ffff830250b10000 ffff830250b18f10 ffff830250b18f10 ffff830250b60840
(XEN)    ffff830250ba7de8 ffff82d04024b8eb 0000000000000202 ffff830250b60840
(XEN)    ffff830250b66018 0000000000000001 0000000000000000 0000000000000000
(XEN)    ffff830250b66018 ffff830250b10000 ffff830250ba7e58 ffff82d040207c3f
(XEN)    ffff82d0403673d4 ffff82d0403673c8 ffff82d0403673d4 ffff82d0403673c8
(XEN)    ffff82d0403673d4 ffff82d0403673c8 ffff82d0403673d4 ffff830250ba7ef8
(XEN)    0000000000000180 ffff830250b45000 deadbeefdeadf00d 0000000000000003
(XEN)    ffff830250ba7ee8 ffff82d0402e7759 0000000000000001 0000000000000005
(XEN)    0000000000000000 deadbeefdeadf00d deadbeefdeadf00d ffff82d0403673c8
(XEN)    ffff82d0403673d4 ffff82d0403673c8 ffff82d0403673d4 ffff82d0403673c8
(XEN)    ffff82d0403673d4 ffff830250b45000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 00007cfdaf4580e7 ffff82d040367432
(XEN)    ffff888192157320 0000000000000000 ffffffff810eb370 0000000000000000
(XEN)    ffff8881b0a626c0 0000000000000005 0000000000000246 0000000000000001
(XEN)    ffffea0006087608 ffffea0006087608 0000000000000018 ffffffff8100230a
(XEN)    0000000000000000 0000000000000005 0000000000000001 0000010000000000
(XEN)    ffffffff8100230a 000000000000e033 0000000000000246 ffffc90002653cf0
(XEN)    000000000000e02b d2c2c2c2c2c2c2c2 c2c2c2c2c2c2c282 c2c2c2c2c2c2c2c2
(XEN)    c2e2c2c2c2c2c2c2 0000e01000000008 ffff830250b45000 0000003210631000
(XEN)    0000000000362660 0000000000000000 8000000250bc3002 0000060100000000
(XEN) Xen call trace:
(XEN)    [<ffff82d040242725>] R credit2.c#csched2_unit_wake+0x14f/0x151
(XEN)    [<ffff82d04024b8eb>] F vcpu_wake+0x105/0x52c
(XEN)    [<ffff82d040207c3f>] F do_vcpu_op+0x1b0/0x631
(XEN)    [<ffff82d0402e7759>] F pv_hypercall+0x28f/0x57d
(XEN)    [<ffff82d040367432>] F lstar_enter+0x112/0x120
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 8:
(XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2273
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN) Executing kexec image on cpu8
(XEN) Shot down all CPUs

Looks pretty similar to the other thread "Xen crash after S3 suspend -
Xen 4.13" - adding Jürgen.
Since I've seen this one on Xen 4.13 before, I think the commit I've
found just makes it much more likely to happen.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Attachment:
signature.asc
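
The crash message in this mail was recovered from the vmcore with strings. For anyone wanting to script that step, here is a minimal Python sketch of the same idea: scan a binary dump for runs of printable ASCII and keep only the ones carrying the "(XEN)" console prefix. The function name xen_console_lines and the minimum run length of 4 are assumptions made for illustration; this is not the exact command used above, and it will also pick up stale console fragments that happen to be in memory.

```python
import mmap
import re
import sys


def xen_console_lines(data, min_len=4):
    """Return printable-ASCII runs from a binary blob that contain "(XEN)".

    Mimics what strings(1) does (runs of at least min_len printable bytes),
    then keeps each run from its first "(XEN)" marker onward.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    lines = []
    for match in pattern.finditer(data):
        text = match.group().decode("ascii")
        idx = text.find("(XEN)")
        if idx >= 0:
            lines.append(text[idx:])
    return lines


if __name__ == "__main__" and len(sys.argv) > 1:
    # Map the dump read-only instead of loading it all into memory;
    # the path is whatever vmcore copy you saved.
    with open(sys.argv[1], "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mem:
            for line in xen_console_lines(mem):
                print(line)
```

Run against a saved dump as a script with the vmcore path as the only argument. The output is unordered with respect to time beyond the order in which the bytes appear in the file, so the tail of the console ring still has to be picked out by eye.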