[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-changelog] [xen master] cpupools: avoid crashing if shutting down with free CPUs



commit 7ff6f1fc39ebf2fa939d78a6ca3f883adff04bc9
Author:     Dario Faggioli <dario.faggioli@xxxxxxxxxx>
AuthorDate: Wed May 13 15:08:30 2015 +0200
Commit:     Jan Beulich <jbeulich@xxxxxxxx>
CommitDate: Wed May 13 15:08:30 2015 +0200

    cpupools: avoid crashing if shutting down with free CPUs
    
    in fact, before this change, shutting down or suspending the
    system with some CPUs not assigned to any cpupool, would
    crash as follows:
    
      (XEN) Xen call trace:
      (XEN)    [<ffff82d080101757>] disable_nonboot_cpus+0xb5/0x138
      (XEN)    [<ffff82d0801a8824>] enter_state_helper+0xbd/0x369
      (XEN)    [<ffff82d08010614a>] continue_hypercall_tasklet_handler+0x4a/0xb1
      (XEN)    [<ffff82d0801320bd>] do_tasklet_work+0x78/0xab
      (XEN)    [<ffff82d0801323f3>] do_tasklet+0x5e/0x8a
      (XEN)    [<ffff82d080163cb6>] idle_loop+0x56/0x6b
      (XEN)
      (XEN)
      (XEN) ****************************************
      (XEN) Panic on CPU 0:
      (XEN) Xen BUG at cpu.c:191
      (XEN) ****************************************
    
    This is because, for free CPUs, -EBUSY were being returned
    when trying to tear them down, making cpu_down() unhappy.
    
    It is certainly unpractical to forbid shutting down or
    suspenging if there are unassigned CPUs, so this change
    fixes the above by just avoiding returning -EBUSY for those
    CPUs. If shutting off, that does not matter much anyway. If
    suspending, we make sure that the CPUs remain unassigned
    when resuming.
    
    While there, take the chance to:
     - fix the doc comment of cpupool_cpu_remove() (it was
       wrong);
     - improve comments in general around and in cpupool_cpu_remove()
       and cpupool_cpu_add();
     - add a couple of ASSERT()-s for checking consistency.
    
    Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
    Reviewed-by: Juergen Gross <jgross@xxxxxxxx>
    Tested-by: Juergen Gross <jgross@xxxxxxxx>
---
 xen/common/cpupool.c |   82 ++++++++++++++++++++++++++++++++++---------------
 1 files changed, 57 insertions(+), 25 deletions(-)

diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index cabaccd..563864d 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -453,14 +453,16 @@ void cpupool_rm_domain(struct domain *d)
 }
 
 /*
- * called to add a new cpu to pool admin
- * we add a hotplugged cpu to the cpupool0 to be able to add it to dom0,
- * unless we are resuming from S3, in which case we put the cpu back
- * in the cpupool it was in prior to suspend.
+ * Called to add a cpu to a pool. CPUs being hot-plugged are added to pool0,
+ * as they must have been in there when unplugged.
+ *
+ * If, on the other hand, we are adding CPUs because we are resuming (e.g.,
+ * after ACPI S3) we put the cpu back in the pool where it was in prior when
+ * we suspended.
  */
 static int cpupool_cpu_add(unsigned int cpu)
 {
-    int ret = -EINVAL;
+    int ret = 0;
 
     spin_lock(&cpupool_lock);
     cpumask_clear_cpu(cpu, &cpupool_locked_cpus);
@@ -478,12 +480,28 @@ static int cpupool_cpu_add(unsigned int cpu)
                 if ( ret )
                     goto out;
                 cpumask_clear_cpu(cpu, (*c)->cpu_suspended);
+                break;
             }
         }
-    }
 
-    if ( cpumask_test_cpu(cpu, &cpupool_free_cpus) )
+        /*
+         * Either cpu has been found as suspended in a pool, and added back
+         * there, or it stayed free (if it did not belong to any pool when
+         * suspending), and we don't want to do anything.
+         */
+        ASSERT(cpumask_test_cpu(cpu, &cpupool_free_cpus) ||
+               cpumask_test_cpu(cpu, (*c)->cpu_valid));
+    }
+    else
+    {
+        /*
+         * If we are not resuming, we are hot-plugging cpu, and in which case
+         * we add it to pool0, as it certainly was there when hot-unplagged
+         * (or unplugging would have failed) and that is the default behavior
+         * anyway.
+         */
         ret = cpupool_assign_cpu_locked(cpupool0, cpu);
+    }
  out:
     spin_unlock(&cpupool_lock);
 
@@ -491,29 +509,52 @@ static int cpupool_cpu_add(unsigned int cpu)
 }
 
 /*
- * called to remove a cpu from pool admin
- * the cpu to be removed is locked to avoid removing it from dom0
- * returns failure if not in pool0
+ * Called to remove a CPU from a pool. The CPU is locked, to forbid removing
+ * it from pool0. In fact, if we want to hot-unplug a CPU, it must belong to
+ * pool0, or we fail.
+ *
+ * However, if we are suspending (e.g., to ACPI S3), we mark the CPU in such
+ * a way that it can be put back in its pool when resuming.
  */
 static int cpupool_cpu_remove(unsigned int cpu)
 {
     int ret = -EBUSY;
-    struct cpupool **c;
 
     spin_lock(&cpupool_lock);
-    if ( cpumask_test_cpu(cpu, cpupool0->cpu_valid) )
-        ret = 0;
-    else
+    if ( system_state == SYS_STATE_suspend )
     {
+        struct cpupool **c;
+
         for_each_cpupool(c)
         {
-            if ( cpumask_test_cpu(cpu, (*c)->cpu_suspended ) )
+            if ( cpumask_test_cpu(cpu, (*c)->cpu_valid ) )
             {
-                ret = 0;
+                cpumask_set_cpu(cpu, (*c)->cpu_suspended);
                 break;
             }
         }
+
+        /*
+         * Either we found cpu in a pool, or it must be free (if it has been
+         * hot-unplagged, then we must have found it in pool0). It is, of
+         * course, fine to suspend or shutdown with CPUs not assigned to a
+         * pool, and (in case of suspend) they will stay free when resuming.
+         */
+        ASSERT(cpumask_test_cpu(cpu, &cpupool_free_cpus) ||
+               cpumask_test_cpu(cpu, (*c)->cpu_suspended));
+        ASSERT(cpumask_test_cpu(cpu, &cpu_online_map) ||
+               cpumask_test_cpu(cpu, cpupool0->cpu_suspended));
+        ret = 0;
+    }
+    else if ( cpumask_test_cpu(cpu, cpupool0->cpu_valid) )
+    {
+        /*
+         * If we are not suspending, we are hot-unplugging cpu, and that is
+         * allowed only for CPUs in pool0.
+         */
+        ret = 0;
     }
+
     if ( !ret )
         cpumask_set_cpu(cpu, &cpupool_locked_cpus);
     spin_unlock(&cpupool_lock);
@@ -709,15 +750,6 @@ static int cpu_callback(
     unsigned int cpu = (unsigned long)hcpu;
     int rc = 0;
 
-    if ( system_state == SYS_STATE_suspend )
-    {
-        struct cpupool **c;
-
-        for_each_cpupool(c)
-            if ( cpumask_test_cpu(cpu, (*c)->cpu_valid ) )
-                cpumask_set_cpu(cpu, (*c)->cpu_suspended);
-    }
-
     switch ( action )
     {
     case CPU_DOWN_FAILED:
--
generated by git-patchbot for /home/xen/git/xen.git#master

_______________________________________________
Xen-changelog mailing list
Xen-changelog@xxxxxxxxxxxxx
http://lists.xensource.com/xen-changelog


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.