On 17.02.20 15:23, Igor Druzhinin wrote:
On 17/02/2020 12:30, Igor Druzhinin wrote:
On 17/02/2020 12:28, Jürgen Groß wrote:
On 17.02.20 13:26, Igor Druzhinin wrote:
On 17/02/2020 07:20, Juergen Gross wrote:
Today rcu_barrier() is calling stop_machine_run() to synchronize all
physical cpus in order to ensure all pending rcu calls have finished
when returning.

As stop_machine_run() is using tasklets this requires scheduling of
idle vcpus on all cpus imposing the need to call rcu_barrier() on idle
cpus only in case of core scheduling being active, as otherwise a
scheduling deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as
rcu activity is started in __do_softirq() called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirq for synchronization of the cpus no longer requiring any
scheduling activity.

As there already is an rcu softirq, reuse that for the synchronization.

Finally switch rcu_barrier() to return void as it now can never fail.
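
For illustration only, the counting idea described above can be modelled in a
single-threaded simulation. This is a sketch, not the patch itself: NR_CPUS,
struct sim_cpu and every sim_* name are hypothetical, and the "other cpus" are
simply stepped in a loop instead of running concurrently.

```c
#include <stdbool.h>

#define NR_CPUS 4   /* hypothetical cpu count for the simulation */

/* Hypothetical per-cpu state: queued callbacks and a raised-softirq flag. */
struct sim_cpu {
    int pending_callbacks;   /* callbacks queued before the barrier */
    bool softirq_raised;     /* set when the barrier asks this cpu to sync */
};

static struct sim_cpu cpus[NR_CPUS];
static int cpus_outstanding;    /* cpus that have not yet checked in */

/* Softirq handler model: drain callbacks, then check in with the barrier. */
static void sim_softirq(int cpu)
{
    if (!cpus[cpu].softirq_raised)
        return;
    cpus[cpu].softirq_raised = false;
    cpus[cpu].pending_callbacks = 0;   /* "run" every queued callback */
    cpus_outstanding--;
}

/* Barrier model: raise the softirq everywhere, then wait for the count. */
static void sim_barrier(void)
{
    cpus_outstanding = NR_CPUS;
    for (int i = 0; i < NR_CPUS; i++)
        cpus[i].softirq_raised = true;

    /* Real cpus would run concurrently; the simulation steps them in turn. */
    while (cpus_outstanding > 0)
        for (int i = 0; i < NR_CPUS; i++)
            sim_softirq(i);
}

/* Helper for checking the result: total callbacks still queued. */
static int sim_total_pending(void)
{
    int total = 0;
    for (int i = 0; i < NR_CPUS; i++)
        total += cpus[i].pending_callbacks;
    return total;
}
```

Calling sim_barrier() after queueing callbacks on some cpus leaves
sim_total_pending() at zero. No scheduling step is involved, which is the
point of moving the synchronization out of tasklets.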

Would this implementation guarantee progress as the previous implementation did?


Thanks, I'll put it to test today to see if it solves our use case.

Just manually tried it - gives infinite (up to stack size) trace like:

(XEN) [    1.496520]    [<ffff82d08022e435>] F softirq.c#__do_softirq+0x85/0x90
(XEN) [    1.496561]    [<ffff82d08022e475>] F 
(XEN) [    1.496600]    [<ffff82d080221101>] F 
(XEN) [    1.496643]    [<ffff82d08022e435>] F softirq.c#__do_softirq+0x85/0x90
(XEN) [    1.496685]    [<ffff82d08022e475>] F 
(XEN) [    1.496726]    [<ffff82d080221101>] F 
(XEN) [    1.496766]    [<ffff82d08022e435>] F softirq.c#__do_softirq+0x85/0x90
(XEN) [    1.496806]    [<ffff82d08022e475>] F 
(XEN) [    1.496847]    [<ffff82d080221101>] F 
(XEN) [    1.496887]    [<ffff82d08022e435>] F softirq.c#__do_softirq+0x85/0x90
(XEN) [    1.496927]    [<ffff82d08022e475>] F 

Interesting, I didn't run into this problem. Obviously I managed to
forget handling the case of recursion.
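
The thread does not say how the recursion arises or how it will be handled, so
the following is only a generic sketch of the pattern the trace suggests:
softirq processing that enters a wait, which in turn processes softirqs again.
All names (sim_do_softirq, sim_barrier_wait, in_barrier, MAX_DEPTH) are
hypothetical, and MAX_DEPTH merely stands in for the stack limit.

```c
#include <stdbool.h>

#define MAX_DEPTH 8   /* hypothetical cap standing in for the stack limit */

static int depth;            /* current nesting of sim_do_softirq() */
static int max_depth_seen;   /* deepest nesting observed */
static bool in_barrier;      /* guard: a barrier is already waiting */
static bool work_pending;    /* pending work that triggers the barrier */

static void sim_barrier_wait(bool guarded);

/* Model of softirq processing: pending work triggers a barrier wait. */
static void sim_do_softirq(bool guarded)
{
    depth++;
    if (depth > max_depth_seen)
        max_depth_seen = depth;
    if (work_pending && depth < MAX_DEPTH)
        sim_barrier_wait(guarded);
    depth--;
}

/* Model of a barrier that waits by processing softirqs itself. */
static void sim_barrier_wait(bool guarded)
{
    if (guarded && in_barrier)
        return;               /* already waiting further up the stack */
    in_barrier = true;
    sim_do_softirq(guarded);  /* waiting re-enters softirq processing */
    in_barrier = false;
}

/* Run one scenario and report how deep the nesting went. */
static int sim_run(bool guarded)
{
    depth = 0;
    max_depth_seen = 0;
    in_barrier = false;
    work_pending = true;
    sim_do_softirq(guarded);
    return max_depth_seen;
}
```

Without the guard, sim_run(false) nests until it hits MAX_DEPTH. With the
guard, sim_run(true) stops after one nested call because the inner wait
returns immediately.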


