|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v1 2/3] xen/smp: Rewrite on_selected_cpus() to be lockless
On 01.04.2026 18:35, Ross Lagerwall wrote:
> --- a/xen/common/smp.c
> +++ b/xen/common/smp.c
> @@ -24,13 +24,15 @@
> /*
> * Structure and data for smp_call_function()/on_selected_cpus().
> */
> -static DEFINE_SPINLOCK(call_lock);
> -static struct call_data_struct {
> +struct call_data_struct {
> void (*func) (void *info);
> void *info;
> int wait;
> - cpumask_t selected;
> -} call_data;
> + cpumask_t selected __cacheline_aligned;
The adding of the alignment attribute isn't strictly required here, is it?
Can it, with its own justification, be split out?
However, wrt what I said in the first reply, putting a full cpumask_t in
per-CPU data is even a little worse than having one as a global. Imo this
also wants to become cpumask_var_t. Then the attribute wouldn't be quite
applicable anymore anyway.
> @@ -50,55 +52,84 @@ void on_selected_cpus(
> void *info,
> int wait)
> {
> + struct call_data_struct *data;
> + unsigned int cpu = smp_processor_id();
> +
> ASSERT(local_irq_is_enabled());
> ASSERT(cpumask_subset(selected, &cpu_online_map));
>
> - spin_lock(&call_lock);
> + if ( cpumask_empty(selected) )
> + return;
> +
> + data = &this_cpu(call_data);
>
> - cpumask_copy(&call_data.selected, selected);
> + if ( !data->wait )
> + {
> + /* Wait for any previous async call to complete */
> + while ( !cpumask_empty(&data->selected) )
> + cpu_relax();
> +
> + cpumask_clear_cpu(cpu, &tasks);
In the description you say "Track which CPUs are currently running
on_selected_cpus()", yet as per this the bit can remain set after the
function was left. Which isn't just an issue of describing things
correctly; there's also a performance concern: The IPI handler(s) will
need to carry out more work than necessary. That's not a lot of work (only
the subsequent cpumask_test_cpu() there), but also not nothing.
I also think another barrier is needed above here: We may only clear the
bit in tasks when the empty ->selected is globally visible. More generally,
since the bit in tasks is what everything derives from, barriers are
apparently needed around all of its updating / accessing. Which also meant
that ...
> + }
>
> - if ( cpumask_empty(&call_data.selected) )
> - goto out;
> + data->func = func;
> + data->info = info;
> + data->wait = wait;
>
> - call_data.func = func;
> - call_data.info = info;
> - call_data.wait = wait;
> + smp_wmb();
... besides (as already indicated) this barrier needing to move ...
> - smp_send_call_function_mask(&call_data.selected);
> + cpumask_copy(&data->selected, selected);
... here, I think that another one is going to be needed ...
> - while ( !cpumask_empty(&call_data.selected) )
> - cpu_relax();
> + cpumask_set_cpu(cpu, &tasks);
... here, such that ...
> -out:
> - spin_unlock(&call_lock);
> + smp_send_call_function_mask(&data->selected);
... upon receipt of the IPI the target sees the up-to-date value.
> + if ( wait )
> + {
> + while ( !cpumask_empty(&data->selected) )
> + cpu_relax();
> +
> + cpumask_clear_cpu(cpu, &tasks);
> + }
> }
>
> void smp_call_function_interrupt(void)
> {
> - void (*func)(void *info) = call_data.func;
> - void *info = call_data.info;
> unsigned int cpu = smp_processor_id();
> -
> - if ( !cpumask_test_cpu(cpu, &call_data.selected) )
> - return;
> + unsigned int i;
> + struct call_data_struct *data;
> + void (*func)(void *info);
> + void *info;
>
> irq_enter();
>
> - if ( unlikely(!func) )
> - {
> - cpumask_clear_cpu(cpu, &call_data.selected);
> - }
> - else if ( call_data.wait )
> - {
> - (*func)(info);
> - smp_mb();
> - cpumask_clear_cpu(cpu, &call_data.selected);
> - }
> - else
> + for_each_cpu ( i, &tasks )
> {
> - smp_mb();
> - cpumask_clear_cpu(cpu, &call_data.selected);
> - (*func)(info);
> + data = &per_cpu(call_data, i);
> +
> + if ( !cpumask_test_cpu(cpu, &data->selected) )
> + continue;
> +
> + smp_rmb();
> + func = data->func;
> + info = data->info;
> +
> + if ( unlikely(!func) )
> + {
> + cpumask_clear_cpu(cpu, &data->selected);
> + }
> + else if ( data->wait )
> + {
> + (*func)(info);
> + smp_mb();
> + cpumask_clear_cpu(cpu, &data->selected);
> + }
> + else
> + {
> + smp_mb();
> + cpumask_clear_cpu(cpu, &data->selected);
> + (*func)(info);
I understand you only re-indent this code, but I'm struggling with the
purpose of the barrier here. With the smp_rmb() above there are no reads
to isolate (data->func and data->info can't change while the bit in
data->selected is still set). And there are no earlier writes at all,
unless anything done outside of the interrupt handler would matter.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |