Re: [Xen-devel] [PATCH v7 2/5] xen/rcu: don't use stop_machine_run() for rcu_barrier()
On 26.03.2020 08:24, Jürgen Groß wrote:
> On 26.03.20 07:58, Jan Beulich wrote:
>> On 25.03.2020 17:13, Julien Grall wrote:
>>> On 25/03/2020 10:55, Juergen Gross wrote:
>>>> @@ -143,51 +143,90 @@ static int qhimark = 10000;
>>>> static int qlowmark = 100;
>>>> static int rsinterval = 1000;
>>>> -struct rcu_barrier_data {
>>>> - struct rcu_head head;
>>>> - atomic_t *cpu_count;
>>>> -};
>>>> +/*
>>>> + * rcu_barrier() handling:
>>>> + * Two counters are used to synchronize rcu_barrier() work:
>>>> + * - cpu_count holds the number of cpus required to finish barrier handling.
>>>> + *   It is decremented by each cpu when it has performed all pending rcu calls.
>>>> + * - pending_count shows whether any rcu_barrier() activity is running and
>>>> + *   it is used to synchronize leaving rcu_barrier() only after all cpus
>>>> + *   have finished their processing. pending_count is initialized to nr_cpus + 1
>>>> + *   and it is decremented by each cpu when it has seen that cpu_count has
>>>> + *   reached 0. The cpu where rcu_barrier() has been called will wait until
>>>> + *   pending_count has been decremented to 1 (so all cpus have seen cpu_count
>>>> + *   reaching 0) and will then set pending_count to 0 indicating there is no
>>>> + *   rcu_barrier() running.
>>>> + * Cpus are synchronized via softirq mechanism. rcu_barrier() is regarded to
>>>> + * be active if pending_count is not zero. In case rcu_barrier() is called on
>>>> + * multiple cpus it is enough to check for pending_count being not zero on entry
>>>> + * and to call process_pending_softirqs() in a loop until pending_count drops to
>>>> + * zero, before starting the new rcu_barrier() processing.
>>>> + */
>>>> +static atomic_t cpu_count = ATOMIC_INIT(0);
>>>> +static atomic_t pending_count = ATOMIC_INIT(0);
>>>> static void rcu_barrier_callback(struct rcu_head *head)
>>>> {
>>>> - struct rcu_barrier_data *data = container_of(
>>>> - head, struct rcu_barrier_data, head);
>>>> - atomic_inc(data->cpu_count);
>>>> + smp_mb__before_atomic(); /* Make all writes visible to other cpus. */
>>>
>>> smp_mb__before_atomic() will order both reads and writes. However, the
>>> comment suggests only the writes are required to be ordered.
>>>
>>> So either the barrier is too strong or the comment is incorrect. Can
>>> you clarify?
>>
>> Neither is the case, I guess: There simply is no smp_wmb__before_atomic()
>> in Linux, and if we want to follow their model we shouldn't have one
>> either. I'd rather take the comment to indicate that if one appeared, it
>> could be used here.
>
> Right. Currently we have the choice of either using
> smp_mb__before_atomic(), which is too strong for Arm, or smp_wmb(),
> which is too strong for x86.
For x86 smp_wmb() is actually only very slightly too strong - it expands
to just barrier(), after all. So overall perhaps that's the better
choice here (with a suitable comment)?
Jan