[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3] x86/nmi: start NMI watchdog on CPU0 after SMP bootstrap
On 19/02/18 14:23, Igor Druzhinin wrote: > We're noticing a reproducible system boot hang on certain > post-Skylake platforms where the BIOS is configured in These are Skylake, not post-Skylake. > legacy boot mode with x2APIC disabled. The system stalls > immediately after writing the first SMP initialization > sequence into APIC ICR. > > The cause of the problem is watchdog NMI handler execution - > somewhere near the end of NMI handling (after it's already > rescheduled the next NMI) it tries to access IO port 0x61 > to get the actual NMI reason on CPU0. Unfortunately, this > port is emulated by BIOS using SMIs and this emulation for > some reason takes more time than we expect during INIT-SIPI-SIPI > sequence. As the result, the system is constantly moving between > NMI and SMI handler and not making any progress. > > To avoid this, initialize the watchdog after SMP bootstrap on > CPU0 and, additionally, protect the NMI handler by moving > IO port access before NMI re-scheduling. The latter should help > in case of post boot CPU onlining. Although we're running > watchdog at much lower frequency it's neveretheless possible > we may trigger the issue anyway. > > Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> > --- > v3: corrected comments and coommit meesage. > --- > xen/arch/x86/apic.c | 2 +- > xen/arch/x86/smpboot.c | 3 +++ > xen/arch/x86/traps.c | 12 ++++++++++-- > 3 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c > index 5039173..ffa5a69 100644 > --- a/xen/arch/x86/apic.c > +++ b/xen/arch/x86/apic.c > @@ -684,7 +684,7 @@ void setup_local_APIC(void) > printk("Leaving ESR disabled.\n"); > } > > - if (nmi_watchdog == NMI_LOCAL_APIC) > + if (nmi_watchdog == NMI_LOCAL_APIC && smp_processor_id()) > setup_apic_nmi_watchdog(); > apic_pm_activate(); > } > diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c > index 2ebef03..1844116 100644 > --- a/xen/arch/x86/smpboot.c > +++ b/xen/arch/x86/smpboot.c > @@ -1248,7 +1248,10 @@ int __cpu_up(unsigned int cpu) > void __init smp_cpus_done(void) > { > if ( nmi_watchdog == NMI_LOCAL_APIC ) > + { > + setup_apic_nmi_watchdog(); > check_nmi_watchdog(); > + } > > setup_ioapic_dest(); > > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c > index 2e022b0..e6c7487 100644 > --- a/xen/arch/x86/traps.c > +++ b/xen/arch/x86/traps.c > @@ -1706,7 +1706,7 @@ static nmi_callback_t *nmi_callback = > dummy_nmi_callback; > void do_nmi(const struct cpu_user_regs *regs) > { > unsigned int cpu = smp_processor_id(); > - unsigned char reason; > + unsigned char reason = 0; > bool handle_unknown = false; > > ++nmi_count(cpu); > @@ -1714,6 +1714,15 @@ void do_nmi(const struct cpu_user_regs *regs) > if ( nmi_callback(regs, cpu) ) > return; > > + /* > + * There is a chance that this IO port access will produce SMI which, > + * in turn, may take enough time for the next NMI tick to happen. > + * To avoid having nested NMIs as the result let's do it before > + * watchdog re-scheduling. This isn't strictly accurate. How about: /* Reads of 0x61 may trap to SMM, and on production SKX servers, have been observed to take up to 200ms to complete. By reading this port before we re-arm the NMI watchdog, we reduce the chance of having an NMI watchdog expire while in the SMI handler. */ In particular, if we are servicing a non-watchdog NMI, the watchdog will still be counting down while the SMI executes. ~Andrew > + */ > + if ( cpu == 0 ) > + reason = inb(0x61); > + > if ( (nmi_watchdog == NMI_NONE) || > (!nmi_watchdog_tick(regs) && watchdog_force) ) > handle_unknown = true; > @@ -1721,7 +1730,6 @@ void do_nmi(const struct cpu_user_regs *regs) > /* Only the BSP gets external NMIs from the system. */ > if ( cpu == 0 ) > { > - reason = inb(0x61); > if ( reason & 0x80 ) > pci_serr_error(regs); > if ( reason & 0x40 ) _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |