[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] [PATCH v2] xen: cpupools: avoid crashing if shutting down with free CPUs
in fact, before this change, shutting down or suspending the system with some CPUs not assigned to any cpupool, would crash as follows: (XEN) Xen call trace: (XEN) [<ffff82d080101757>] disable_nonboot_cpus+0xb5/0x138 (XEN) [<ffff82d0801a8824>] enter_state_helper+0xbd/0x369 (XEN) [<ffff82d08010614a>] continue_hypercall_tasklet_handler+0x4a/0xb1 (XEN) [<ffff82d0801320bd>] do_tasklet_work+0x78/0xab (XEN) [<ffff82d0801323f3>] do_tasklet+0x5e/0x8a (XEN) [<ffff82d080163cb6>] idle_loop+0x56/0x6b (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) Xen BUG at cpu.c:191 (XEN) **************************************** This is because, for free CPUs, -EBUSY were being returned when trying to tear them down, making cpu_down() unhappy. It is certainly unpractical to forbid shutting down or suspenging if there are unassigned CPUs, so this change fixes the above by just avoiding returning -EBUSY for those CPUs. If shutting off, that does not matter much anyway. If suspending, we make sure that the CPUs remain unassigned when resuming. While there, take the chance to: - fix the doc comment of cpupool_cpu_remove() (it was wrong); - improve comments in general around and in cpupool_cpu_remove() and cpupool_cpu_add(); - add a couple of ASSERT()-s for checking consistency. Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx> Cc: Juergen Gross <jgross@xxxxxxxx> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx> Cc: Jan Beulich <jbeulich@xxxxxxxx> Cc: Tim Deegan <tim@xxxxxxx> Cc: Keir Fraser <keir@xxxxxxx> Reviewed-by: Juergen Gross <jgross@xxxxxxxx> Tested-by: Juergen Gross <jgross@xxxxxxxx> --- Changes from v1: * ret initialized to 0 in cpupool_cpu_add(), to fix a bug in the suspend/resume case, identified by Juergen. --- xen/common/cpupool.c | 82 +++++++++++++++++++++++++++++++++++--------------- 1 file changed, 57 insertions(+), 25 deletions(-) diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c index cabaccd..563864d 100644 --- a/xen/common/cpupool.c +++ b/xen/common/cpupool.c @@ -453,14 +453,16 @@ void cpupool_rm_domain(struct domain *d) } /* - * called to add a new cpu to pool admin - * we add a hotplugged cpu to the cpupool0 to be able to add it to dom0, - * unless we are resuming from S3, in which case we put the cpu back - * in the cpupool it was in prior to suspend. + * Called to add a cpu to a pool. CPUs being hot-plugged are added to pool0, + * as they must have been in there when unplugged. + * + * If, on the other hand, we are adding CPUs because we are resuming (e.g., + * after ACPI S3) we put the cpu back in the pool where it was in prior when + * we suspended. */ static int cpupool_cpu_add(unsigned int cpu) { - int ret = -EINVAL; + int ret = 0; spin_lock(&cpupool_lock); cpumask_clear_cpu(cpu, &cpupool_locked_cpus); @@ -478,12 +480,28 @@ static int cpupool_cpu_add(unsigned int cpu) if ( ret ) goto out; cpumask_clear_cpu(cpu, (*c)->cpu_suspended); + break; } } - } - if ( cpumask_test_cpu(cpu, &cpupool_free_cpus) ) + /* + * Either cpu has been found as suspended in a pool, and added back + * there, or it stayed free (if it did not belong to any pool when + * suspending), and we don't want to do anything. + */ + ASSERT(cpumask_test_cpu(cpu, &cpupool_free_cpus) || + cpumask_test_cpu(cpu, (*c)->cpu_valid)); + } + else + { + /* + * If we are not resuming, we are hot-plugging cpu, and in which case + * we add it to pool0, as it certainly was there when hot-unplagged + * (or unplugging would have failed) and that is the default behavior + * anyway. + */ ret = cpupool_assign_cpu_locked(cpupool0, cpu); + } out: spin_unlock(&cpupool_lock); @@ -491,29 +509,52 @@ static int cpupool_cpu_add(unsigned int cpu) } /* - * called to remove a cpu from pool admin - * the cpu to be removed is locked to avoid removing it from dom0 - * returns failure if not in pool0 + * Called to remove a CPU from a pool. The CPU is locked, to forbid removing + * it from pool0. In fact, if we want to hot-unplug a CPU, it must belong to + * pool0, or we fail. + * + * However, if we are suspending (e.g., to ACPI S3), we mark the CPU in such + * a way that it can be put back in its pool when resuming. */ static int cpupool_cpu_remove(unsigned int cpu) { int ret = -EBUSY; - struct cpupool **c; spin_lock(&cpupool_lock); - if ( cpumask_test_cpu(cpu, cpupool0->cpu_valid) ) - ret = 0; - else + if ( system_state == SYS_STATE_suspend ) { + struct cpupool **c; + for_each_cpupool(c) { - if ( cpumask_test_cpu(cpu, (*c)->cpu_suspended ) ) + if ( cpumask_test_cpu(cpu, (*c)->cpu_valid ) ) { - ret = 0; + cpumask_set_cpu(cpu, (*c)->cpu_suspended); break; } } + + /* + * Either we found cpu in a pool, or it must be free (if it has been + * hot-unplagged, then we must have found it in pool0). It is, of + * course, fine to suspend or shutdown with CPUs not assigned to a + * pool, and (in case of suspend) they will stay free when resuming. + */ + ASSERT(cpumask_test_cpu(cpu, &cpupool_free_cpus) || + cpumask_test_cpu(cpu, (*c)->cpu_suspended)); + ASSERT(cpumask_test_cpu(cpu, &cpu_online_map) || + cpumask_test_cpu(cpu, cpupool0->cpu_suspended)); + ret = 0; + } + else if ( cpumask_test_cpu(cpu, cpupool0->cpu_valid) ) + { + /* + * If we are not suspending, we are hot-unplugging cpu, and that is + * allowed only for CPUs in pool0. + */ + ret = 0; } + if ( !ret ) cpumask_set_cpu(cpu, &cpupool_locked_cpus); spin_unlock(&cpupool_lock); @@ -709,15 +750,6 @@ static int cpu_callback( unsigned int cpu = (unsigned long)hcpu; int rc = 0; - if ( system_state == SYS_STATE_suspend ) - { - struct cpupool **c; - - for_each_cpupool(c) - if ( cpumask_test_cpu(cpu, (*c)->cpu_valid ) ) - cpumask_set_cpu(cpu, (*c)->cpu_suspended); - } - switch ( action ) { case CPU_DOWN_FAILED: _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |