Xen project Mailing List

Re: [Xen-devel] [PATCH] smpboot: Add smpboot state variables instead of reusing CPU hotplug states

To: Daniel Wagner <daniel.wagner@xxxxxxxxxxxx>

From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>

Date: Fri, 6 Nov 2015 23:17:23 -0800

Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx

Delivery-date: Sat, 07 Nov 2015 07:17:59 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Thu, Oct 15, 2015 at 01:32:44PM +0200, Daniel Wagner wrote: > The cpu hotplug state machine in smpboot.c is reusing the states from > cpu.h. That is confusing when it comes to the CPU_DEAD_FROZEN usage. > Paul explained to me that he was in need of an additional state > for destinguishing between a CPU error states. For this he just > picked CPU_DEAD_FROZEN. > > 8038dad7e888581266c76df15d70ca457a3c5910 smpboot: Add common code for > notification from dying CPU > 2a442c9c6453d3d043dfd89f2e03a1deff8a6f06 x86: Use common > outgoing-CPU-notification code > > Instead of reusing the states, let's add new definition inside > the smpboot.c file with explenation what those states > mean. Thanks Paul for providing them. > > Signed-off-by: Daniel Wagner <daniel.wagner@xxxxxxxxxxxx> Apologies for the delay, I didn't realize that you were waiting on me. Reviewed-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> > Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx> > Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx > Cc: linux-kernel@xxxxxxxxxxxxxxx > --- > arch/x86/xen/smp.c | 4 +-- > include/linux/cpu.h | 3 +- > kernel/smpboot.c | 82 > ++++++++++++++++++++++++++++++++++++++++------------- > 3 files changed, 67 insertions(+), 22 deletions(-) > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > index 3f4ebf0..804bf5c 100644 > --- a/arch/x86/xen/smp.c > +++ b/arch/x86/xen/smp.c > @@ -495,7 +495,7 @@ static int xen_cpu_up(unsigned int cpu, struct > task_struct *idle) > rc = HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL); > BUG_ON(rc); > > - while (cpu_report_state(cpu) != CPU_ONLINE) > + while (!cpu_check_online(cpu)) > HYPERVISOR_sched_op(SCHEDOP_yield, NULL); > > return 0; > @@ -767,7 +767,7 @@ static int xen_hvm_cpu_up(unsigned int cpu, struct > task_struct *tidle) > * This can happen if CPU was offlined earlier and > * offlining timed out in common_cpu_die(). > */ > - if (cpu_report_state(cpu) == CPU_DEAD_FROZEN) { > + if (cpu_check_timeout(cpu)) { > xen_smp_intr_free(cpu); > xen_uninit_lock_cpu(cpu); > } > diff --git a/include/linux/cpu.h b/include/linux/cpu.h > index 23c30bd..f78ab46 100644 > --- a/include/linux/cpu.h > +++ b/include/linux/cpu.h > @@ -284,7 +284,8 @@ void arch_cpu_idle_dead(void); > > DECLARE_PER_CPU(bool, cpu_dead_idle); > > -int cpu_report_state(int cpu); > +int cpu_check_online(int cpu); > +int cpu_check_timeout(int cpu); > int cpu_check_up_prepare(int cpu); > void cpu_set_state_online(int cpu); > #ifdef CONFIG_HOTPLUG_CPU > diff --git a/kernel/smpboot.c b/kernel/smpboot.c > index a818cbc..75e5724 100644 > --- a/kernel/smpboot.c > +++ b/kernel/smpboot.c > @@ -371,19 +371,63 @@ int smpboot_update_cpumask_percpu_thread(struct > smp_hotplug_thread *plug_thread, > } > EXPORT_SYMBOL_GPL(smpboot_update_cpumask_percpu_thread); > > +/* The CPU is offline, and its last offline operation was > + * successful and proceeded normally. (Or, alternatively, the > + * CPU never has come online, as this is the initial state.) > + */ > +#define CPUHP_POST_DEAD 0x01 > + > +/* The CPU is in the process of coming online. > + * Simple architectures can skip this state, and just invoke > + * cpu_set_state_online() unconditionally instead. > + */ > +#define CPUHP_UP_PREPARE 0x02 > + > +/* The CPU is now online. Simple architectures can skip this > + * state, and just invoke cpu_wait_death() and cpu_report_death() > + * unconditionally instead. > + */ > +#define CPUHP_ONLINE 0x03 > + > +/* The CPU has gone offline, so that it may now be safely > + * powered off (or whatever the architecture needs to do to it). > + */ > +#define CPUHP_DEAD 0x04 > + > +/* The CPU did not go offline in a timely fashion, if at all, > + * so it might need special processing at the next online (for > + * example, simply refusing to bring it online). > + */ > +#define CPUHP_BROKEN 0x05 > + > +/* The CPU eventually did go offline, but not in a timely > + * fashion. If some sort of reset operation is required before it > + * can be brought online, that reset operation needs to be carried > + * out at online time. (Or, again, the architecture might simply > + * refuse to bring it online.) > + */ > +#define CPUHP_TIMEOUT 0x06 > + > static DEFINE_PER_CPU(atomic_t, cpu_hotplug_state) = > ATOMIC_INIT(CPU_POST_DEAD); > > /* > * Called to poll specified CPU's state, for example, when waiting for > * a CPU to come online. > */ > -int cpu_report_state(int cpu) > +int cpu_check_online(int cpu) > +{ > + return atomic_read(&per_cpu(cpu_hotplug_state, cpu)) == > + CPUHP_ONLINE; > +} > + > +int cpu_check_timeout(int cpu) > { > - return atomic_read(&per_cpu(cpu_hotplug_state, cpu)); > + return atomic_read(&per_cpu(cpu_hotplug_state, cpu)) == > + CPUHP_TIMEOUT; > } > > /* > - * If CPU has died properly, set its state to CPU_UP_PREPARE and > + * If CPU has died properly, set its state to CPUHP_UP_PREPARE and > * return success. Otherwise, return -EBUSY if the CPU died after > * cpu_wait_death() timed out. And yet otherwise again, return -EAGAIN > * if cpu_wait_death() timed out and the CPU still hasn't gotten around > @@ -397,19 +441,19 @@ int cpu_report_state(int cpu) > int cpu_check_up_prepare(int cpu) > { > if (!IS_ENABLED(CONFIG_HOTPLUG_CPU)) { > - atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPU_UP_PREPARE); > + atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPUHP_UP_PREPARE); > return 0; > } > > switch (atomic_read(&per_cpu(cpu_hotplug_state, cpu))) { > > - case CPU_POST_DEAD: > + case CPUHP_POST_DEAD: > > /* The CPU died properly, so just start it up again. */ > - atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPU_UP_PREPARE); > + atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPUHP_UP_PREPARE); > return 0; > > - case CPU_DEAD_FROZEN: > + case CPUHP_TIMEOUT: > > /* > * Timeout during CPU death, so let caller know. > @@ -424,7 +468,7 @@ int cpu_check_up_prepare(int cpu) > */ > return -EBUSY; > > - case CPU_BROKEN: > + case CPUHP_BROKEN: > > /* > * The most likely reason we got here is that there was > @@ -452,7 +496,7 @@ int cpu_check_up_prepare(int cpu) > */ > void cpu_set_state_online(int cpu) > { > - (void)atomic_xchg(&per_cpu(cpu_hotplug_state, cpu), CPU_ONLINE); > + (void)atomic_xchg(&per_cpu(cpu_hotplug_state, cpu), CPUHP_ONLINE); > } > > #ifdef CONFIG_HOTPLUG_CPU > @@ -470,12 +514,12 @@ bool cpu_wait_death(unsigned int cpu, int seconds) > might_sleep(); > > /* The outgoing CPU will normally get done quite quickly. */ > - if (atomic_read(&per_cpu(cpu_hotplug_state, cpu)) == CPU_DEAD) > + if (atomic_read(&per_cpu(cpu_hotplug_state, cpu)) == CPUHP_DEAD) > goto update_state; > udelay(5); > > /* But if the outgoing CPU dawdles, wait increasingly long times. */ > - while (atomic_read(&per_cpu(cpu_hotplug_state, cpu)) != CPU_DEAD) { > + while (atomic_read(&per_cpu(cpu_hotplug_state, cpu)) != CPUHP_DEAD) { > schedule_timeout_uninterruptible(sleep_jf); > jf_left -= sleep_jf; > if (jf_left <= 0) > @@ -484,14 +528,14 @@ bool cpu_wait_death(unsigned int cpu, int seconds) > } > update_state: > oldstate = atomic_read(&per_cpu(cpu_hotplug_state, cpu)); > - if (oldstate == CPU_DEAD) { > + if (oldstate == CPUHP_DEAD) { > /* Outgoing CPU died normally, update state. */ > smp_mb(); /* atomic_read() before update. */ > - atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPU_POST_DEAD); > + atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPUHP_POST_DEAD); > } else { > /* Outgoing CPU still hasn't died, set state accordingly. */ > if (atomic_cmpxchg(&per_cpu(cpu_hotplug_state, cpu), > - oldstate, CPU_BROKEN) != oldstate) > + oldstate, CPUHP_BROKEN) != oldstate) > goto update_state; > ret = false; > } > @@ -502,7 +546,7 @@ update_state: > * Called by the outgoing CPU to report its successful death. Return > * false if this report follows the surviving CPU's timing out. > * > - * A separate "CPU_DEAD_FROZEN" is used when the surviving CPU > + * A separate "CPUHP_TIMEOUT" is used when the surviving CPU > * timed out. This approach allows architectures to omit calls to > * cpu_check_up_prepare() and cpu_set_state_online() without defeating > * the next cpu_wait_death()'s polling loop. > @@ -515,13 +559,13 @@ bool cpu_report_death(void) > > do { > oldstate = atomic_read(&per_cpu(cpu_hotplug_state, cpu)); > - if (oldstate != CPU_BROKEN) > - newstate = CPU_DEAD; > + if (oldstate != CPUHP_BROKEN) > + newstate = CPUHP_DEAD; > else > - newstate = CPU_DEAD_FROZEN; > + newstate = CPUHP_TIMEOUT; > } while (atomic_cmpxchg(&per_cpu(cpu_hotplug_state, cpu), > oldstate, newstate) != oldstate); > - return newstate == CPU_DEAD; > + return newstate == CPUHP_DEAD; > } > > #endif /* #ifdef CONFIG_HOTPLUG_CPU */ > -- > 2.4.3 > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.