Re: Problems with APIC on versions 4.9 and later (4.8 works)
On 25.01.2021 20:37, Claudemir Todo Bom wrote:
> I've managed to get the debug messages on the screen using
> vga=text-80x50,keep and disabling all messages from the kernel. Two
> images are attached with the output running the debug patch.

And the 1st of them (161303) was taken at the time of the hang of
the kernel (or entire system), not any earlier? I ask because one
part of the reason for the patch was to understand whether the
rendezvousing itself would fail in some way (like one of the CPUs
not calling in). Were new log messages (from the debugging patch)
still issued at this point, showing Xen itself was still alive?

The 2nd of the pictures (162313) at least clarifies that indeed the
commit in question had a functional effect on this system, because
of

(XEN) TSC warp detected, disabling TSC_RELIABLE

I still can't figure though why the change in rendezvous handling
(from "std" to "tsc") would have broken your system.

> About the version I've used to test: since the 4.14 shows that other
> bug with the detection of cpu features I mentioned on the other
> subthread, I chose to work on 4.11 that doesn't shows that behaviour.
>
> Calling with clocksource on the xen command line changed nothing.

Oh, right, because the specific feature that causes the change of
rendezvous functions for you also is a prereq for that mode of
operation.

> I don't know if this part of code is intended to execute a lot of
> times, but when starting with dom0_max_vcpus=1, the system boots up
> and keeps showing the messages.

When there's just one CPU, there's no CPU to rendezvous with. Iirc
you did say that you observe the hang even with as little as 2 CPUs?

The problem the above quoted message is supposed to address is
normally coming into play only on multi-socket systems. Yet from
your initial report I deduce this is a single socket system.

So in the end I suppose there are two problems - one is the hang,
and the other is that your system gets diagnosed as having an
unreliable TSC (at least I didn't think Xeon E5 v2 should have a
problem there).

I will want to extend the debugging patch, but I'd like to have
clarification on some of the points above first.

Jan