
[Xen-changelog] [xen stable-4.9] x86/nmi: start NMI watchdog on CPU0 after SMP bootstrap



commit 56d4eb8ed87322e874a7d5d2c8d81f9dcc3aadf9
Author:     Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
AuthorDate: Tue Mar 6 16:04:13 2018 +0100
Commit:     Jan Beulich <jbeulich@xxxxxxxx>
CommitDate: Tue Mar 6 16:04:13 2018 +0100

    x86/nmi: start NMI watchdog on CPU0 after SMP bootstrap
    
    We're noticing a reproducible system boot hang on certain
    Skylake platforms where the BIOS is configured in legacy
    boot mode with x2APIC disabled. The system stalls immediately
    after writing the first SMP initialization sequence into APIC ICR.
    
    The cause of the problem is watchdog NMI handler execution -
    somewhere near the end of NMI handling (after it's already
    rescheduled the next NMI) it tries to access IO port 0x61
    to get the actual NMI reason on CPU0. Unfortunately, this
    port is emulated by the BIOS using SMIs, and this emulation for
    some reason takes longer than we expect during the INIT-SIPI-SIPI
    sequence. As a result, the system keeps bouncing between the NMI
    and SMI handlers without making any progress.
    
    To avoid this, initialize the watchdog after SMP bootstrap on
    CPU0 and, additionally, protect the NMI handler by moving
    IO port access before NMI re-scheduling. The latter should also
    help in case of post-boot CPU onlining. Although we're running the
    watchdog at a much lower frequency at this point, it's nevertheless
    possible that we trigger the issue.
    
    Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
    Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
    master commit: a44f1697968e04fcc6145e3bd51c748b57047240
    master date: 2018-02-20 10:16:56 +0100
---
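Note (illustration only, not part of the patch): the ordering argument in
the commit message can be sketched with a toy timing model. Everything
below is hypothetical: struct nmi_model, enum handler_order and
progress_ticks() are not Xen symbols, time is in abstract ticks, the
handler's own execution time is ignored, and the model assumes the
watchdog keeps counting while the CPU is in SMM.

```c
#include <assert.h>

/* Hypothetical timing model; none of these names exist in Xen. */
struct nmi_model {
    unsigned int period;      /* ticks from watchdog re-arm to the next NMI */
    unsigned int smi_latency; /* ticks spent in SMM when port 0x61 is read */
};

enum handler_order {
    REARM_THEN_READ, /* old order: re-arm the watchdog, then read port 0x61 */
    READ_THEN_REARM  /* new order: read port 0x61, then re-arm the watchdog */
};

/*
 * Ticks left for non-NMI code between the handler returning and the
 * next watchdog NMI. Assumption: the watchdog keeps counting during
 * the SMI, so in the old order the SMI eats into the period.
 */
unsigned int progress_ticks(const struct nmi_model *m, enum handler_order order)
{
    assert(m->period > 0);

    if (order == READ_THEN_REARM)
        return m->period; /* the slow read completes before counting starts */

    /* Old order: the period started before the slow read. */
    if (m->smi_latency >= m->period)
        return 0; /* next NMI is already pending on return: no progress */
    return m->period - m->smi_latency;
}
```

Under these assumptions, a model with period 10 and smi_latency 25 leaves
0 ticks of progress in the old order and 10 ticks in the new one, which
matches the hang and the fix described above.
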
 xen/arch/x86/apic.c    |  2 +-
 xen/arch/x86/smpboot.c |  3 +++
 xen/arch/x86/traps.c   | 13 +++++++++++--
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index 0291af1e89..6cad061781 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -687,7 +687,7 @@ void setup_local_APIC(void)
         printk("Leaving ESR disabled.\n");
     }
 
-    if (nmi_watchdog == NMI_LOCAL_APIC)
+    if (nmi_watchdog == NMI_LOCAL_APIC && smp_processor_id())
         setup_apic_nmi_watchdog();
     apic_pm_activate();
 }
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index f65df4e391..570b78f0a0 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -1241,7 +1241,10 @@ int __cpu_up(unsigned int cpu)
 void __init smp_cpus_done(void)
 {
     if ( nmi_watchdog == NMI_LOCAL_APIC )
+    {
+        setup_apic_nmi_watchdog();
         check_nmi_watchdog();
+    }
 
     setup_ioapic_dest();
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index d53d8416f5..c57d367614 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -3676,7 +3676,7 @@ static nmi_callback_t *nmi_callback = dummy_nmi_callback;
 void do_nmi(const struct cpu_user_regs *regs)
 {
     unsigned int cpu = smp_processor_id();
-    unsigned char reason;
+    unsigned char reason = 0;
     bool_t handle_unknown = 0;
 
     ++nmi_count(cpu);
@@ -3684,6 +3684,16 @@ void do_nmi(const struct cpu_user_regs *regs)
     if ( nmi_callback(regs, cpu) )
         return;
 
+    /*
+     * Accessing port 0x61 may trap to SMM which has been actually
+     * observed on some production SKX servers. This SMI sometimes
+     * takes enough time for the next NMI tick to happen. By reading
+     * this port before we re-arm the NMI watchdog, we reduce the chance
+     * of having an NMI watchdog expire while in the SMI handler.
+     */
+    if ( cpu == 0 )
+        reason = inb(0x61);
+
     if ( (nmi_watchdog == NMI_NONE) ||
          (!nmi_watchdog_tick(regs) && watchdog_force) )
         handle_unknown = 1;
@@ -3691,7 +3701,6 @@ void do_nmi(const struct cpu_user_regs *regs)
     /* Only the BSP gets external NMIs from the system. */
     if ( cpu == 0 )
     {
-        reason = inb(0x61);
         if ( reason & 0x80 )
             pci_serr_error(regs);
         if ( reason & 0x40 )
--
generated by git-patchbot for /home/xen/git/xen.git#stable-4.9

_______________________________________________
Xen-changelog mailing list
Xen-changelog@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/xen-changelog

 

