Re: [Xen-devel] [PATCH v7 2/5] xen/rcu: don't use stop_machine_run() for rcu_barrier()
On 26.03.2020 08:24, Jürgen Groß wrote:
> On 26.03.20 07:58, Jan Beulich wrote:
>> On 25.03.2020 17:13, Julien Grall wrote:
>>> On 25/03/2020 10:55, Juergen Gross wrote:
>>>> @@ -143,51 +143,90 @@ static int qhimark = 10000;
>>>>  static int qlowmark = 100;
>>>>  static int rsinterval = 1000;
>>>> -struct rcu_barrier_data {
>>>> -    struct rcu_head head;
>>>> -    atomic_t *cpu_count;
>>>> -};
>>>> +/*
>>>> + * rcu_barrier() handling:
>>>> + * Two counters are used to synchronize rcu_barrier() work:
>>>> + * - cpu_count holds the number of cpus required to finish barrier handling.
>>>> + *   It is decremented by each cpu when it has performed all pending rcu calls.
>>>> + * - pending_count shows whether any rcu_barrier() activity is running and
>>>> + *   it is used to synchronize leaving rcu_barrier() only after all cpus
>>>> + *   have finished their processing. pending_count is initialized to nr_cpus + 1
>>>> + *   and it is decremented by each cpu when it has seen that cpu_count has
>>>> + *   reached 0. The cpu where rcu_barrier() has been called will wait until
>>>> + *   pending_count has been decremented to 1 (so all cpus have seen cpu_count
>>>> + *   reaching 0) and will then set pending_count to 0 indicating there is no
>>>> + *   rcu_barrier() running.
>>>> + * Cpus are synchronized via softirq mechanism. rcu_barrier() is regarded to
>>>> + * be active if pending_count is not zero. In case rcu_barrier() is called on
>>>> + * multiple cpus it is enough to check for pending_count being not zero on entry
>>>> + * and to call process_pending_softirqs() in a loop until pending_count drops to
>>>> + * zero, before starting the new rcu_barrier() processing.
>>>> + */
>>>> +static atomic_t cpu_count = ATOMIC_INIT(0);
>>>> +static atomic_t pending_count = ATOMIC_INIT(0);
>>>>
>>>>  static void rcu_barrier_callback(struct rcu_head *head)
>>>>  {
>>>> -    struct rcu_barrier_data *data = container_of(
>>>> -        head, struct rcu_barrier_data, head);
>>>> -    atomic_inc(data->cpu_count);
>>>> +    smp_mb__before_atomic(); /* Make all writes visible to other cpus. */
>>>
>>> smp_mb__before_atomic() will order both read and write. However, the
>>> comment suggest only the write are required to be ordered.
>>>
>>> So either the barrier is too strong or the comment is incorrect. Can
>>> you clarify it?
>>
>> Neither is the case, I guess: There simply is no smp_wmb__before_atomic()
>> in Linux, and if we want to follow their model we shouldn't have one
>> either. I'd rather take the comment to indicate that if one appeared, it
>> could be used here.
>
> Right. Currently we have the choice of either using
> smp_mb__before_atomic() which is too strong for Arm, or smp_wmb() which
> is too strong for x86.

For x86 smp_wmb() is actually only very slightly too strong - it expands
to just barrier(), after all. So overall perhaps that's the better choice
here (with a suitable comment)?

Jan