Re: [Xen-devel] [PATCH v3] x86/nmi: start NMI watchdog on CPU0 after SMP bootstrap
>>> On 19.02.18 at 16:20, <igor.druzhinin@xxxxxxxxxx> wrote:
> On 19/02/18 15:18, Jan Beulich wrote:
>>>>> On 19.02.18 at 15:23, <igor.druzhinin@xxxxxxxxxx> wrote:
>>> We're noticing a reproducible system boot hang on certain
>>> post-Skylake platforms where the BIOS is configured in
>>> legacy boot mode with x2APIC disabled. The system stalls
>>> immediately after writing the first SMP initialization
>>> sequence into APIC ICR.
>>>
>>> The cause of the problem is watchdog NMI handler execution -
>>> somewhere near the end of NMI handling (after it's already
>>> rescheduled the next NMI) it tries to access IO port 0x61
>>> to get the actual NMI reason on CPU0. Unfortunately, this
>>> port is emulated by BIOS using SMIs and this emulation for
>>> some reason takes more time than we expect during INIT-SIPI-SIPI
>>> sequence. As the result, the system is constantly moving between
>>> NMI and SMI handler and not making any progress.
>>>
>>> To avoid this, initialize the watchdog after SMP bootstrap on
>>> CPU0 and, additionally, protect the NMI handler by moving
>>> IO port access before NMI re-scheduling. The latter should help
>>> in case of post boot CPU onlining. Although we're running
>>> watchdog at much lower frequency it's nevertheless possible
>>> we may trigger the issue anyway.
>>
>> I'm afraid I can't connect "the latter" to anything earlier in the
>> description.
>
> It's the previous sentence - there are 2 things that we do here - the
> latter is "protect the NMI handler by moving IO port access before NMI
> re-scheduling"

Oh, I thought you meant to refer to the lower frequency. How about
"The latter should also help ..."?

Jan
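
The effect of "moving IO port access before NMI re-scheduling" can be illustrated with a toy timing model. This is only a sketch under simplifying assumptions, not Xen code: the function name, its parameters and the time units are all hypothetical, the port 0x61 read is assumed to cost a fixed SMI latency, and an NMI that becomes due while the handler is still running is assumed to be delivered as soon as the handler returns.

```python
def progress_per_cycle(period, smi_latency, read_port_first):
    """Time the interrupted code gets to run between two watchdog NMIs.

    period          -- delay from re-arming the watchdog to the next NMI
    smi_latency     -- time spent in the SMI that emulates the port 0x61 read
    read_port_first -- True models the reordered handler (read, then re-arm)

    All names and units are hypothetical; the handler is modelled as doing
    nothing except the port read and the re-arm.
    """
    if read_port_first:
        # The slow port read finishes before the watchdog is re-armed,
        # so the full period starts counting only after the SMI is over.
        rearm_time = smi_latency
    else:
        # The watchdog is re-armed first; the period keeps counting
        # while the SMI emulation of the port read is still running.
        rearm_time = 0

    handler_exit = smi_latency          # handler returns once the read is done
    next_nmi = rearm_time + period      # earliest moment the next NMI is due

    # If the next NMI is already due at handler exit, no progress is made.
    return max(0, next_nmi - handler_exit)
```

In this model, a port read that takes longer than the watchdog period leaves no time at all for the interrupted code when the re-arm comes first, which matches the "constantly moving between NMI and SMI handler" description. With the read done first, the interrupted code always gets one full period regardless of the SMI latency.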