
Re: [Xen-devel] [PATCH v4 2/6] xen/rcu: don't use stop_machine_run() for rcu_barrier()



On 10.03.2020 17:34, Jürgen Groß wrote:
> On 10.03.20 17:29, Jan Beulich wrote:
>> On 10.03.2020 08:28, Juergen Gross wrote:
>>> +void rcu_barrier(void)
>>>   {
>>> -    atomic_t cpu_count = ATOMIC_INIT(0);
>>> -    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
>>> +    unsigned int n_cpus;
>>> +
>>> +    while ( !get_cpu_maps() )
>>> +    {
>>> +        process_pending_softirqs();
>>> +        if ( !atomic_read(&cpu_count) )
>>> +            return;
>>> +
>>> +        cpu_relax();
>>> +    }
>>> +
>>> +    n_cpus = num_online_cpus();
>>> +
>>> +    if ( atomic_cmpxchg(&cpu_count, 0, n_cpus) == 0 )
>>> +    {
>>> +        atomic_add(n_cpus, &done_count);
>>> +        cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
>>> +    }
>>> +
>>> +    while ( atomic_read(&done_count) )
>>
>> Don't you leave a window for races here, in that done_count
>> gets set to non-zero only after setting cpu_count? A CPU
>> losing the cmpxchg attempt above may observe done_count
>> still being zero, and hence exit without waiting for the
>> count to actually _drop_ to zero.
> 
> This can only happen for a CPU that has not yet joined the barrier
> handling, so it will do so later.

I'm afraid I don't understand - if two CPUs independently call
rcu_barrier(), neither should fall through here without waiting
at all, I would think?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

