Re: [Xen-devel] [BUG] XEN crash and double fault when doing cpu online/offline
On 08.01.20 09:32, Tao Xu wrote:
> On 1/8/20 3:50 PM, Jürgen Groß wrote:
>> On 08.01.20 06:50, Tao Xu wrote:
>>> Hi,
>>>
>>> I use xen-hptool cpu-offline/cpu-online to take the CPUs in a socket
>>> offline and back online, using the following script:
>>>
>>> for((j=48;j<=95;j++));
>>> do
>>>     xen-hptool cpu-offline $j
>>> done
>>>
>>> for((j=48;j<=95;j++));
>>> do
>>>     xen-hptool cpu-online $j
>>> done
>>>
>>> Xen crashes when the CPUs are brought back online. I am using upstream
>>> Xen (0dd92688) and have tried for many days; it still crashes. But if I
>>> only do CPU online/offline for CPUs 48~59, Xen does not crash. The bug
>>> can be reproduced when we do CPU online/offline for most of the CPUs in
>>> a socket. Interestingly, when we use the following script:
>>>
>>> for((j=48;j<=95;j++));
>>> do
>>>     xen-hptool cpu-offline $j
>>>     xen-hptool cpu-online $j
>>> done
>>>
>>> Xen does not crash either. Is there a bug in sched_credit2?
>>>
>>> The crash messages are as follows:
>>>
>>> (XEN) Adding cpu 77 to runqueue 1
>>> (XEN) Adding cpu 78 to runqueue 1
>>> (XEN) Adding cpu 79 to runqueue 1
>>> (XEN) Adding cpu 80 to runqueue 1
>>> (X(ENXE) N) *** DOUBLE FAULT ***
>>> (XEN) Assertion 'debug->cpu == smp_processor_id()' failed at spinlock.c:88
>>> (XEN) ----[ Xen-4.14-unstable x86_64 debug=y Not tainted ]----
>>> (XEN) Debugging connection not set up.
>>> (XEN) CPU: 48
>>> (XEN) ----[ Xen-4.14-unstable x86_64 debug=y Not tainted ]----
>>> (XEN) CPU: 0
>>> (XEN) RIP: e008:[<ffff82d080240bfc>] _spin_unlock+0x40/0x42
>>
>> So the original problem causes a double fault, but spinlock debugging
>> causes a subsequent panic.
>>
>> Can you please retry the tests with the attached patch? It should result
>> in diagnostic data related to the real problem.
>>
>> Juergen
>
> Hi Juergen,
>
> After applying your patch, the spin_lock assertion still triggers, and the
> address ffff82d0bffce880 is not in xen-syms.

Yes, I had a bug in my modified ASSERT(), but this time the data is better.

> (XEN) Adding cpu 78 to runqueue 1
> (XEN) *** DOUBLE FAULT ***
> (XEN) ----[ Xen-4.14-unstable x86_64 debug=y Not tainted ]----
> (XEN) CPU: 49
> (XEN) RIP: e008:[<ffff82d0bffce880>] ffff82d0bffce880

This seems to be a crash in the stub page of cpu 48.
I don't think this is related to the scheduler, but to stub page handling.
Can you please try the attached patch?

Juergen

Attachment: 0001-xen-x86-clear-per-cpu-stub-page-information-in-cpu_s.patch

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel