[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3] x86/nmi: start NMI watchdog on CPU0 after SMP bootstrap
On Mon, 19 Feb 2018 14:48:37 +0000 Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote: >On 19/02/18 14:23, Igor Druzhinin wrote: >> We're noticing a reproducible system boot hang on certain >> post-Skylake platforms where the BIOS is configured in > >These are Skylake, not post-Skylake. Well, strictly speaking, any platforms which made obsolete REF_TOGGLE bit in NMI_SC (I/O port 61h) and provide sw emulation for it via the SMI I/O trap instead. >> legacy boot mode with x2APIC disabled. The system stalls >> immediately after writing the first SMP initialization >> sequence into APIC ICR. >> >> The cause of the problem is watchdog NMI handler execution - >> somewhere near the end of NMI handling (after it's already >> rescheduled the next NMI) it tries to access IO port 0x61 >> to get the actual NMI reason on CPU0. Unfortunately, this >> port is emulated by BIOS using SMIs and this emulation for >> some reason takes more time than we expect during INIT-SIPI-SIPI >> sequence. As the result, the system is constantly moving between >> NMI and SMI handler and not making any progress. >> >> To avoid this, initialize the watchdog after SMP bootstrap on >> CPU0 and, additionally, protect the NMI handler by moving >> IO port access before NMI re-scheduling. The latter should help >> in case of post boot CPU onlining. Although we're running >> watchdog at much lower frequency it's neveretheless possible >> we may trigger the issue anyway. >> >> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> >> --- >> v3: corrected comments and coommit meesage. >> --- >> xen/arch/x86/apic.c | 2 +- >> xen/arch/x86/smpboot.c | 3 +++ >> xen/arch/x86/traps.c | 12 ++++++++++-- >> 3 files changed, 14 insertions(+), 3 deletions(-) >> >> diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c >> index 5039173..ffa5a69 100644 >> --- a/xen/arch/x86/apic.c >> +++ b/xen/arch/x86/apic.c >> @@ -684,7 +684,7 @@ void setup_local_APIC(void) >> printk("Leaving ESR disabled.\n"); >> } >> >> - if (nmi_watchdog == NMI_LOCAL_APIC) >> + if (nmi_watchdog == NMI_LOCAL_APIC && smp_processor_id()) >> setup_apic_nmi_watchdog(); >> apic_pm_activate(); >> } >> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c >> index 2ebef03..1844116 100644 >> --- a/xen/arch/x86/smpboot.c >> +++ b/xen/arch/x86/smpboot.c >> @@ -1248,7 +1248,10 @@ int __cpu_up(unsigned int cpu) >> void __init smp_cpus_done(void) >> { >> if ( nmi_watchdog == NMI_LOCAL_APIC ) >> + { >> + setup_apic_nmi_watchdog(); >> check_nmi_watchdog(); >> + } >> >> setup_ioapic_dest(); >> >> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c >> index 2e022b0..e6c7487 100644 >> --- a/xen/arch/x86/traps.c >> +++ b/xen/arch/x86/traps.c >> @@ -1706,7 +1706,7 @@ static nmi_callback_t *nmi_callback = >> dummy_nmi_callback; void do_nmi(const struct cpu_user_regs *regs) >> { >> unsigned int cpu = smp_processor_id(); >> - unsigned char reason; >> + unsigned char reason = 0; >> bool handle_unknown = false; >> >> ++nmi_count(cpu); >> @@ -1714,6 +1714,15 @@ void do_nmi(const struct cpu_user_regs *regs) >> if ( nmi_callback(regs, cpu) ) >> return; >> >> + /* >> + * There is a chance that this IO port access will produce SMI >> which, >> + * in turn, may take enough time for the next NMI tick to >> happen. >> + * To avoid having nested NMIs as the result let's do it before >> + * watchdog re-scheduling. > >This isn't strictly accurate. How about: > >/* Reads of 0x61 may trap to SMM, and on production SKX servers, have >been observed to take up to 200ms to complete. By reading this port >before we re-arm the NMI watchdog, we reduce the chance of having an >NMI watchdog expire while in the SMI handler. */ > >In particular, if we are servicing a non-watchdog NMI, the watchdog >will still be counting down while the SMI executes. > >~Andrew > >> + */ >> + if ( cpu == 0 ) >> + reason = inb(0x61); >> + >> if ( (nmi_watchdog == NMI_NONE) || >> (!nmi_watchdog_tick(regs) && watchdog_force) ) >> handle_unknown = true; >> @@ -1721,7 +1730,6 @@ void do_nmi(const struct cpu_user_regs *regs) >> /* Only the BSP gets external NMIs from the system. */ >> if ( cpu == 0 ) >> { >> - reason = inb(0x61); >> if ( reason & 0x80 ) >> pci_serr_error(regs); >> if ( reason & 0x40 ) > > >_______________________________________________ >Xen-devel mailing list >Xen-devel@xxxxxxxxxxxxxxxxxxxx >https://lists.xenproject.org/mailman/listinfo/xen-devel _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |