[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()



On 17.02.20 14:47, Roger Pau Monné wrote:
On Mon, Feb 17, 2020 at 02:17:23PM +0100, Jürgen Groß wrote:
On 17.02.20 13:49, Roger Pau Monné wrote:
On Mon, Feb 17, 2020 at 01:32:59PM +0100, Jürgen Groß wrote:
On 17.02.20 13:17, Roger Pau Monné wrote:
On Mon, Feb 17, 2020 at 01:11:59PM +0100, Jürgen Groß wrote:
On 17.02.20 12:49, Julien Grall wrote:
Hi Juergen,

On 17/02/2020 07:20, Juergen Gross wrote:
+void rcu_barrier(void)
     {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
+    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )

What does prevent the cpu_online_map to change under your feet?
Shouldn't you grab the lock via get_cpu_maps()?

Oh, indeed.

This in turn will require a modification of the logic to detect parallel
calls on multiple cpus.

If you pick my patch to turn that into a rw lock you shouldn't worry
about parallel calls I think, but the lock acquisition can still fail
if there's a CPU plug/unplug going on:

https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg00940.html

Thanks, but letting rcu_barrier() fail is a no go, so I still need to
handle that case (I mean the case failing to get the lock). And handling
of parallel calls is not needed to be functional correct, but to avoid
not necessary cpu synchronization (each parallel call detected can just
wait until the master has finished and then return).

BTW - The recursive spinlock today would allow for e.g. rcu_barrier() to
be called inside a CPU plug/unplug section. Your rwlock is removing that
possibility. Any chance that could be handled?

While this might be interesting for the rcu stuff, it certainly isn't
for other pieces also relying on the cpu maps lock.

Ie: get_cpu_maps must fail when called by send_IPI_mask if there's a
CPU plug/unplug operation going on, even if it's on the same pCPU
that's holding the lock in write mode.

Sure? How is cpu_down() working then?

send_IPI_mask failing to acquire the cpu maps lock prevents it from
using the APIC shorthand, which is what we want in that case.

It is calling stop_machine_run()
which is using send_IPI_mask()...

Xen should avoid using the APIC shorthand in that case, which I don't
think it's happening now, as the lock is recursive.

In fact the code area where this is true is much smaller than that
protected by the lock.

Basically only __cpu_disable() and __cpu_up() (on x86) are critical in
this regard.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.