Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
On Tue, Sep 06, 2022 at 01:46:55PM +0200, Juergen Gross wrote:
> On 22.08.22 17:34, Juergen Gross wrote:
> > On 21.08.22 18:14, Marek Marczykowski-Górecki wrote:
> > > On Sat, Oct 09, 2021 at 06:28:17PM +0200, Marek Marczykowski-Górecki wrote:
> > > > On Sun, Jan 31, 2021 at 03:15:30AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > On Tue, Sep 29, 2020 at 05:27:48PM +0200, Jürgen Groß wrote:
> > > > > > On 29.09.20 17:16, Marek Marczykowski-Górecki wrote:
> > > > > > > On Tue, Sep 29, 2020 at 05:07:11PM +0200, Jürgen Groß wrote:
> > > > > > > > On 29.09.20 16:27, Marek Marczykowski-Górecki wrote:
> > > > > > > > > On Mon, Mar 23, 2020 at 01:09:49AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > On Thu, Mar 19, 2020 at 01:28:10AM +0100, Dario Faggioli wrote:
> > > > > > > > > > > [Adding Juergen]
> > > > > > > > > > >
> > > > > > > > > > > On Wed, 2020-03-18 at 23:10 +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > On Wed, Mar 18, 2020 at 02:50:52PM +0000, Andrew Cooper wrote:
> > > > > > > > > > > > > On 18/03/2020 14:16, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > In my test setup (inside KVM with nested virt enabled), I rather
> > > > > > > > > > > > > > frequently get a Xen crash on resume from S3. Full message below.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This is Xen 4.13.0, with some patches, including "sched: fix
> > > > > > > > > > > > > > resuming from S3 with smt=0".
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Contrary to the previous issue, this one does not happen always -
> > > > > > > > > > > > > > I would say in about 40% of cases on this setup, but very rarely
> > > > > > > > > > > > > > on a physical setup.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This is _without_ core scheduling enabled, and also with smt=off.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Do you think it would be any different on xen-unstable? I can
> > > > > > > > > > > > > > try, but it isn't trivial in this setup, so I'd ask first.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Well, Juergen has fixed quite a few issues.
> > > > > > > > > > >
> > > > > > > > > > > Most of them were triggering with core-scheduling enabled, and I don't
> > > > > > > > > > > recall any of them which looked similar or related to this.
> > > > > > > > > > >
> > > > > > > > > > > Still, it's possible that the same issue causes different symptoms, and
> > > > > > > > > > > hence that maybe one of the patches would fix this too.
> > > > > > > > > >
> > > > > > > > > > I've tested on master (d094e95fb7c), and reproduced exactly the same
> > > > > > > > > > crash (pasted below for completeness). But there is more: additionally,
> > > > > > > > > > in most (all?) cases after resume I've got a soft lockup in Linux dom0
> > > > > > > > > > in smp_call_function_single() - see below.
> > > > > > > > > > It didn't happen before, and the only change was Xen 4.13 -> master.
> > > > > > > > > >
> > > > > > > > > > Xen crash:
> > > > > > > > > >
> > > > > > > > > > (XEN) Assertion 'c2rqd(sched_unit_master(unit)) == svc->rqd' failed at credit2.c:2133
> > > > > > > > >
> > > > > > > > > Juergen, any idea about this one? This is also happening on the current
> > > > > > > > > stable-4.14 (28855ebcdbfa).
> > > > > > > >
> > > > > > > > Oh, sorry I didn't come back to this issue.
> > > > > > > >
> > > > > > > > I suspect this is related to stop_machine_run() being called during
> > > > > > > > suspend(), as I'm seeing very sporadic issues when offlining and then
> > > > > > > > onlining cpus with core scheduling being active (it seems as if the
> > > > > > > > dom0 vcpu doing the cpu online activity sometimes is using an old
> > > > > > > > vcpu state).
> > > > > > >
> > > > > > > Note this is a default Xen 4.14 start, so core scheduling is _not_ active:
> > > > > >
> > > > > > The similarity in the two failure cases is that multiple cpus are
> > > > > > affected by the operations during stop_machine_run().
> > > > > > >
> > > > > > > (XEN) Brought up 2 CPUs
> > > > > > > (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
> > > > > > > (XEN) Adding cpu 0 to runqueue 0
> > > > > > > (XEN) First cpu on runqueue, activating
> > > > > > > (XEN) Adding cpu 1 to runqueue 1
> > > > > > > (XEN) First cpu on runqueue, activating
> > > > > > > >
> > > > > > > > I wasn't able to catch the real problem despite having tried lots
> > > > > > > > of approaches using debug patches.
> > > > > > > >
> > > > > > > > Recently I suspected the whole problem could be somehow related to
> > > > > > > > RCU handling, as stop_machine_run() is relying on tasklets which are
> > > > > > > > executing in idle context, and RCU handling is done in idle context,
> > > > > > > > too. So there might be some kind of use-after-free scenario in case
> > > > > > > > some memory is freed via RCU despite it still being used by a tasklet.
> > > > > > >
> > > > > > > That sounds plausible, even though I don't really know this area of Xen.
> > > > > > > >
> > > > > > > > I "just" need to find some time to verify this suspicion. Any help doing
> > > > > > > > this would be appreciated. :-)
> > > > > > >
> > > > > > > I do have a setup where I can easily-ish reproduce the issue. If there
> > > > > > > is some debug patch you'd like me to try, I can do that.
> > > > > >
> > > > > > Thanks. I might come back to that offer, as you are seeing a crash which
> > > > > > will be much easier to analyze. Catching my error case is much harder, as
> > > > > > it surfaces some time after the real problem in a non-destructive way
> > > > > > (usually I'm seeing a failure to load a library in a program which had
> > > > > > just done its job via exactly that library - it suddenly claims not to be
> > > > > > loadable).
> > > > >
> > > > > Hi,
> > > > >
> > > > > I'm resurrecting this thread as it was recently mentioned elsewhere. I
> > > > > can still reproduce the issue on the recent staging branch (9dc687f155).
> > > > >
> > > > > It fails after the first resume (not always, but frequently enough to
> > > > > debug it). At least one guest needs to be running - with just (PV) dom0
> > > > > the crash doesn't happen (at least for the ~8 times in a row I tried).
> > > > > If the first resume works, the second will (almost?) always fail, but
> > > > > with different symptoms - dom0 kernel lockups (of at least some of its
> > > > > vcpus). I haven't debugged this one at all yet.
> > > > >
> > > > > Any help will be appreciated; I can apply some debug patches, change
> > > > > configuration, etc.
> > > >
> > > > This still happens on 4.14.3. Maybe it is related to freeing percpu
> > > > areas, as it caused other issues with suspend too? Just a thought...
> > >
> > > I have reproduced this on current staging(*). And I can reproduce it
> > > reliably. And also, I got (I believe) a closely related crash with the
> > > credit1 scheduler.
> > >
> > > (*) It isn't plain staging, it's one with my xhci console patches on
> > > top, including an attempt to make it survive S3. I believe the only
> > > relevant part there is sticking set_timer() into the console resume path
> > > (or just having a timer with a rather short delay registered). The actual
> > > tree is at https://github.com/marmarek/xen/tree/master-xue2-debug,
> > > including quite a lot of debug prints and debug hacks.
> > >
> > > Specific crash with credit2:
> > >
> > > (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:928
> > > (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
> > > (XEN) CPU:    0
> > > (XEN) RIP:    e008:[<ffff82d0402434bf>] credit.c#csched_tick+0x2d4/0x494
> > > (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
> > > (XEN) rax: ffff82d0405c4298   rbx: 0000000000000002   rcx: 0000000000000002
> > > (XEN) rdx: ffff8302517f64d0   rsi: ffff8302515c0fc0   rdi: 0000000000000002
> > > (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000004
> > > (XEN) r9:  ffff8302517ac820   r10: ffff830251745068   r11: 00000088cb734887
> > > (XEN) r12: ffff83025174de50   r13: ffff8302515c0fa0   r14: ffff83025174df40
> > > (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
> > > (XEN) cr3: 00000001bacbd000   cr2: 000077e5ec02a318
> > > (XEN) fsb: 000077e5fe533700   gsb: ffff888255700000   gss: 0000000000000000
> > > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > > (XEN) Xen code around <ffff82d0402434bf> (credit.c#csched_tick+0x2d4/0x494):
> > > (XEN)  01 00 00 e9 2a 01 00 00 <0f> 0b 0f 0b 0f 0b 48 8b 41 20 0f b7 00 89 45 cc
> > > (XEN) Xen stack trace from rsp=ffff830256227de0:
> > > (XEN)    ffff830256227fff 0000000000000000 0000000256227e10 ffff82d04035be90
> > > (XEN)    ffff830256227ef8 ffff830251745000 ffff82d0405c3280 ffff82d0402431eb
> > > (XEN)    0000000000000002 00000088c9ba9534 0000000000000000 ffff830256227e60
> > > (XEN)    ffff82d04022ee53 ffff82d0405c3280 ffff8302963e1320 ffff8302515c0fc0
> > > (XEN)    ffff830256227ea0 ffff82d04022f73f ffff830256227e80 ffff82d0405c9f00
> > > (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
> > > (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
> > > (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
> > > (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f03c6 ffff88810c005c00
> > > (XEN)    0000000000000031 0000000000000100 00000000fffffe00 0000000000000031
> > > (XEN)    0000000000000031 ffffffff82d45d28 0000000000000e2e 0000000000000000
> > > (XEN)    0000000000000032 00000000ffffef31 0000000000000000 ffff88812244a700
> > > (XEN)    0000000000000005 ffff88812244a780 000000fa00000000 ffffffff818db55f
> > > (XEN)    000000000000e033 0000000000000246 ffffc900409b7c50 000000000000e02b
> > > (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
> > > (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
> > > (XEN) Xen call trace:
> > > (XEN)    [<ffff82d0402434bf>] R credit.c#csched_tick+0x2d4/0x494
> > > (XEN)    [<ffff82d04022ee53>] F timer.c#execute_timer+0x45/0x5c
> > > (XEN)    [<ffff82d04022f73f>] F timer.c#timer_softirq_action+0x71/0x278
> > > (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
> > > (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
> > > (XEN)    [<ffff82d0402f03c6>] F x86_64/entry.S#process_softirqs+0x6/0x20
> > > (XEN)
> > > (XEN)
> > > (XEN) ****************************************
> > > (XEN) Panic on CPU 0:
> > > (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:928
> > > (XEN) ****************************************
> > >
> > > Specific crash with credit1:
> > >
> > > (XEN) ASSERT FAIL! cpu 4, sched_unit_master 0, currunit id 4
> > > (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:936
> > > (XEN) ----[ Xen-4.17-unstable  x86_64  debug=y  Tainted:   C    ]----
> > > (XEN) CPU:    0
> > > (XEN) RIP:    e008:[<ffff82d04024355c>] credit.c#csched_tick+0x2f7/0x4b7
> > > (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v4)
> > > (XEN) rax: ffff8302517f64d0   rbx: 0000000000000004   rcx: 0000000000000000
> > > (XEN) rdx: ffff830256227fff   rsi: 000000000000000a   rdi: ffff82d0404786b8
> > > (XEN) rbp: ffff830256227e38   rsp: ffff830256227de0   r8:  0000000000000087
> > > (XEN) r9:  0000000000000001   r10: 0000000000000001   r11: 0000000000000004
> > > (XEN) r12: ffff83025174de50   r13: ffff830251778100   r14: ffff83025174df40
> > > (XEN) r15: ffff8302515c0cc0   cr0: 0000000080050033   cr4: 0000000000372660
> > > (XEN) cr3: 000000024ac10000   cr2: 0000000000000000
> > > (XEN) fsb: 0000000000000000   gsb: ffff888255700000   gss: 0000000000000000
> > > (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > > (XEN) Xen code around <ffff82d04024355c> (credit.c#csched_tick+0x2f7/0x4b7):
> > > (XEN)  58 70 0f 84 bf fe ff ff <0f> 0b 0f 0b 0f 0b 48 8b 40 20 0f b7 00 89 45 cc
> > > (XEN) Xen stack trace from rsp=ffff830256227de0:
> > > (XEN)    000000000046692a ffff8302515c0bc0 000000044025ae27 002191c0517f64e8
> > > (XEN)    ffff82d0405c33e0 ffff8302517c8590 ffff82d0405c3280 ffff82d040243265
> > > (XEN)    0000000000000004 00000026d27a80a7 0000000000000000 ffff830256227e60
> > > (XEN)    ffff82d04022ee94 ffff82d0405c3280 ffff8302517e7df0 ffff830251778120
> > > (XEN)    ffff830256227ea0 ffff82d04022f77a ffff830256227e80 ffff82d0405c9f00
> > > (XEN)    ffffffffffffffff ffff82d0405c9f00 ffff830256227fff 0000000000000000
> > > (XEN)    ffff830256227ed8 ffff82d04022d26c ffff830251745000 0000000000000000
> > > (XEN)    0000000000000000 ffff830256227fff 0000000000000000 ffff830256227ee8
> > > (XEN)    ffff82d04022d2ff 00007cfda9dd80e7 ffff82d0402f0496 0000000000000001
> > > (XEN)    ffffc90040127ddc ffffc90040127dd0 ffffc90040127dd4 ffffc90040127dd8
> > > (XEN)    00000000756e6547 0000000000000000 0000000000000000 ffffc90040127ddc
> > > (XEN)    ffffc90040127dd0 000000000000000d 000000006c65746e 0000000049656e69
> > > (XEN)    ffffc90040127dd4 ffffc90040127dd8 000000fa00000000 ffffffff8101c5a4
> > > (XEN)    000000000000e033 0000000000000293 ffffc90040127db8 000000000000e02b
> > > (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > (XEN)    0000e01000000000 ffff830251745000 0000000000000000 0000000000372660
> > > (XEN)    0000000000000000 800000025620b002 000e030300000001 0000000000000000
> > > (XEN) Xen call trace:
> > > (XEN)    [<ffff82d04024355c>] R credit.c#csched_tick+0x2f7/0x4b7
> > > (XEN)    [<ffff82d04022ee94>] F timer.c#execute_timer+0x45/0x5c
> > > (XEN)    [<ffff82d04022f77a>] F timer.c#timer_softirq_action+0x71/0x278
> > > (XEN)    [<ffff82d04022d26c>] F softirq.c#__do_softirq+0x94/0xbe
> > > (XEN)    [<ffff82d04022d2ff>] F do_softirq+0x13/0x15
> > > (XEN)    [<ffff82d0402f0496>] F x86_64/entry.S#process_softirqs+0x6/0x20
> > > (XEN)
> > > (XEN) ****************************************
> > > (XEN) Panic on CPU 0:
> > > (XEN) Assertion 'sched_unit_master(currunit) == cpu' failed at common/sched/credit.c:936
> > > (XEN) ****************************************
> > >
> > > I'm not sure about the nature of the credit2 crash, but for credit1 it
> > > looks much simpler to understand. Credit1 registers a timer for each pcpu
> > > (in init_pdata()). Then, the timer function
> > > (csched_tick()->csched_unit_acct()) asserts that it was really called on
> > > the correct CPU. The issue is that the suspend path migrates all timers
> > > to CPU0 (timer.c:cpu_callback->migrate_timers_from_cpu()), and the
> > > restore path doesn't migrate them back. All this while not unregistering
> > > pCPUs from the scheduler (sched_deinit_pdata() / init_pdata() is not
> > > called - confirmed via debug prints).
> > >
> > > I'm not exactly sure if that's the same issue that applies to credit2,
> > > but I think it's likely. A naive fix could be unregistering all pCPUs
> > > from the scheduler (and registering them back on resume), but this may
> > > mess up CPU pools, and probably a few other things. Any better ideas? I
> > > can test various patches, provide more debug info, etc. - it should be
> > > rather straightforward given I have a reliable reproducer now.
> >
> > Unregistering the cpus from the scheduler would break core scheduling.
> >
> > I'll have a look if the timers can just be deactivated in order not to
> > have them migrated to cpu0.
>
> Could you test the attached patch, please?

I did a test with only dom0 running, and it works now. It isn't a
comprehensive test, but just dom0 was enough to crash it before, and it
stays working now. Thanks!

Tested-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>

> >
> Juergen

> From a2c224bee53e2216d41b34518bb2d11b64d63b6a Mon Sep 17 00:00:00 2001
> From: Juergen Gross <jgross@xxxxxxxx>
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: George Dunlap <george.dunlap@xxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Julien Grall <julien@xxxxxxx>
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: Wei Liu <wl@xxxxxxx>
> Date: Tue, 6 Sep 2022 13:39:48 +0200
> Subject: [PATCH] xen/timer: don't migrate timers away from cpus during suspend
>
> During a suspend/resume cycle timers on all cpus but cpu 0 will be
> migrated to cpu 0, as the other cpus are taken down.
>
> This is problematic in case such a timer is related to a specific vcpu,
> as the vcpus are not migrated to another cpu during suspend (migrating
> them would break cpupools and core scheduling).
>
> In order to avoid the problems just try to keep the timers on their
> cpus. Only migrate them away in case resume failed. Doing so isn't
> problematic, as any vcpu on a cpu not coming back to life would be
> migrated away, too.
>
> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
> ---
>  xen/common/timer.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/xen/common/timer.c b/xen/common/timer.c
> index 9b5016d5ed..6b5473e5f1 100644
> --- a/xen/common/timer.c
> +++ b/xen/common/timer.c
> @@ -637,6 +637,13 @@ static void free_percpu_timers(unsigned int cpu)
>          ASSERT(ts->heap == dummy_heap);
>  }
>  
> +static void deinit_timers(unsigned int cpu)
> +{
> +    migrate_timers_from_cpu(cpu);
> +    if ( !park_offline_cpus )
> +        free_percpu_timers(cpu);
> +}
> +
>  static int cf_check cpu_callback(
>      struct notifier_block *nfb, unsigned long action, void *hcpu)
>  {
> @@ -655,13 +662,14 @@ static int cf_check cpu_callback(
>          }
>          break;
>  
> -    case CPU_UP_CANCELED:
>      case CPU_DEAD:
> -    case CPU_RESUME_FAILED:
> -        migrate_timers_from_cpu(cpu);
> +        if ( system_state != SYS_STATE_suspend )
> +            deinit_timers(cpu);
> +        break;
>  
> -        if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
> -            free_percpu_timers(cpu);
> +    case CPU_UP_CANCELED:
> +    case CPU_RESUME_FAILED:
> +        deinit_timers(cpu);
>          break;
>  
>      case CPU_REMOVE:
> -- 
> 2.35.3
>

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
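
For illustration, the failure mode analysed above can be modelled outside of Xen in a few lines of C. The sketch below is purely illustrative - none of these names or data structures come from the Xen source, they are invented for the example - but it mirrors the described pattern: each pCPU owns a periodic tick timer whose handler asserts that it runs on the CPU it was registered for (as csched_tick()/csched_unit_acct() effectively do), suspend moves every timer onto CPU 0, nothing moves them back, and the next tick processed on CPU 0 therefore fires a timer belonging to another CPU and trips the assertion.

/*
 * Illustrative model only; NOT Xen code. All names below are invented.
 */
#include <assert.h>
#include <stdio.h>

#define NR_CPUS 4

struct tick_timer {
    unsigned int owner_cpu;           /* CPU this tick was registered for */
};

static struct tick_timer ticks[NR_CPUS];

/* Per-CPU "timer queues": queue[cpu] holds the timers currently armed there. */
static struct tick_timer *queue[NR_CPUS][NR_CPUS];
static unsigned int queue_len[NR_CPUS];

static void csched_tick_model(unsigned int running_cpu, struct tick_timer *t)
{
    /* Mirrors: ASSERT(sched_unit_master(currunit) == cpu) */
    assert(t->owner_cpu == running_cpu);
    printf("tick for cpu %u ran on cpu %u\n", t->owner_cpu, running_cpu);
}

static void arm_tick(unsigned int cpu)
{
    ticks[cpu].owner_cpu = cpu;
    queue[cpu][queue_len[cpu]++] = &ticks[cpu];
}

/* What the suspend path effectively does: move everything armed on 'cpu'
 * over to CPU 0 (cf. migrate_timers_from_cpu()). */
static void migrate_timers_to_cpu0(unsigned int cpu)
{
    for (unsigned int i = 0; i < queue_len[cpu]; i++)
        queue[0][queue_len[0]++] = queue[cpu][i];
    queue_len[cpu] = 0;
}

static void run_pending(unsigned int cpu)
{
    for (unsigned int i = 0; i < queue_len[cpu]; i++)
        csched_tick_model(cpu, queue[cpu][i]);
}

int main(void)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        arm_tick(cpu);

    /* Normal operation: every CPU runs its own tick, the assertion holds. */
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        run_pending(cpu);

    /* S3 suspend: non-boot CPUs go down, their timers end up on CPU 0... */
    for (unsigned int cpu = 1; cpu < NR_CPUS; cpu++)
        migrate_timers_to_cpu0(cpu);

    /* ...and resume does not move them back, so CPU 0 now fires a tick
     * registered for another CPU -> assertion failure, as in the crash. */
    run_pending(0);

    return 0;
}

Compiled and run, the final run_pending(0) aborts on the assert, which corresponds to the 'sched_unit_master(currunit) == cpu' assertion in the reports above.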
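Read in isolation, the rule the attached patch introduces is: only migrate timers off a CPU when that CPU is genuinely not coming back (hot-unplug, a cancelled bring-up, or a failed resume); a CPU_DEAD notification seen while the system is suspending leaves the timers in place, because the CPU, and any vCPU-bound timer on it, will be back after resume. A condensed, hypothetical restatement of that decision logic follows (simplified from the diff above; the enum and helper are made up for the illustration and this is not a drop-in replacement for the real notifier in xen/common/timer.c):

/* Simplified model of the patch's intent; NOT the actual Xen notifier. */

enum cpu_event { CPU_DEAD_EVT, CPU_UP_CANCELED_EVT, CPU_RESUME_FAILED_EVT };

static int in_suspend; /* stands in for system_state == SYS_STATE_suspend */

static void deinit_timers_model(unsigned int cpu)
{
    /* In Xen: migrate_timers_from_cpu(cpu) and, unless the CPU stays
     * parked, freeing of its per-CPU timer state. */
    (void)cpu;
}

void timer_cpu_callback_model(enum cpu_event ev, unsigned int cpu)
{
    switch ( ev )
    {
    case CPU_DEAD_EVT:
        /* During suspend the CPU will come back, so keep its timers where
         * they are instead of piling them onto CPU 0. */
        if ( !in_suspend )
            deinit_timers_model(cpu);
        break;

    case CPU_UP_CANCELED_EVT:
    case CPU_RESUME_FAILED_EVT:
        /* The CPU is not coming back; its timers must be moved away. */
        deinit_timers_model(cpu);
        break;
    }
}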