Re: [Xen-devel] [PATCH v4 2/6] xen/rcu: don't use stop_machine_run() for rcu_barrier()
On 10.03.20 17:37, Jan Beulich wrote:
> On 10.03.2020 17:34, Jürgen Groß wrote:
>> On 10.03.20 17:29, Jan Beulich wrote:
>>> On 10.03.2020 08:28, Juergen Gross wrote:
>>>> +void rcu_barrier(void)
>>>>  {
>>>> -    atomic_t cpu_count = ATOMIC_INIT(0);
>>>> -    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
>>>> +    unsigned int n_cpus;
>>>> +
>>>> +    while ( !get_cpu_maps() )
>>>> +    {
>>>> +        process_pending_softirqs();
>>>> +        if ( !atomic_read(&cpu_count) )
>>>> +            return;
>>>> +
>>>> +        cpu_relax();
>>>> +    }
>>>> +
>>>> +    n_cpus = num_online_cpus();
>>>> +
>>>> +    if ( atomic_cmpxchg(&cpu_count, 0, n_cpus) == 0 )
>>>> +    {
>>>> +        atomic_add(n_cpus, &done_count);
>>>> +        cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
>>>> +    }
>>>> +
>>>> +    while ( atomic_read(&done_count) )
>>>
>>> Don't you leave a window for races here, in that done_count
>>> gets set to non-zero only after setting cpu_count? A CPU
>>> losing the cmpxchg attempt above may observe done_count
>>> still being zero, and hence exit without waiting for the
>>> count to actually _drop_ to zero.
>>
>> This can only be a cpu not having joined the barrier handling, so it
>> will do that later.
>
> I'm afraid I don't understand - if two CPUs independently call
> rcu_barrier(), neither should fall through here without waiting
> at all, I would think?

Oh, good catch!

I have thought more about this problem and I think using counters only
for doing rendezvous accounting is rather risky. I'll have a try using
a cpumask instead.


Juergen
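
For illustration only, the window discussed above can be replayed step by step outside of Xen. The following is a minimal sketch in plain C11 atomics, not hypervisor code: cpu_count and done_count mirror the counters in the quoted hunk, while try_claim(), publish_done_count(), must_wait() and loser_skips_wait() are hypothetical helper names introduced here. It walks through one interleaving by hand, assuming CPU B runs its cmpxchg and its first loop check between CPU A's cmpxchg and CPU A's atomic_add().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for the two counters used in the quoted hunk. */
static atomic_int cpu_count;
static atomic_int done_count;

/* Models "atomic_cmpxchg(&cpu_count, 0, n_cpus) == 0". */
static bool try_claim(int n_cpus)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&cpu_count, &expected, n_cpus);
}

/* Models "atomic_add(n_cpus, &done_count)", done only by the cmpxchg winner. */
static void publish_done_count(int n_cpus)
{
    atomic_fetch_add(&done_count, n_cpus);
}

/* Models the condition of the final "while ( atomic_read(&done_count) )". */
static bool must_wait(void)
{
    return atomic_load(&done_count) != 0;
}

/*
 * Replays one specific interleaving of two callers (hypothetical CPUs A
 * and B) and returns true if B would leave the wait loop immediately.
 */
bool loser_skips_wait(int n_cpus)
{
    bool a_won, b_won, b_waits;

    assert(n_cpus > 0);
    atomic_store(&cpu_count, 0);
    atomic_store(&done_count, 0);

    a_won = try_claim(n_cpus);       /* CPU A: cmpxchg succeeds */
    b_won = try_claim(n_cpus);       /* CPU B: cmpxchg fails, cpu_count != 0 */
    b_waits = must_wait();           /* CPU B: done_count is still 0 */
    publish_done_count(n_cpus);      /* CPU A: sets done_count, too late for B */

    return a_won && !b_won && !b_waits;
}
```

Under the stated interleaving, loser_skips_wait() reports that the second caller does not wait at all, which is the fall-through being questioned in the thread. The sketch says nothing about how likely the interleaving is on real hardware, nor about what the cpumask-based rework will look like.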