
Re: [Xen-devel] [PATCH v2 0/6] x86: fixes/improvements for scratch cpumask



On 18.02.20 11:26, Andrew Cooper wrote:
On 18/02/2020 07:40, Jürgen Groß wrote:
On 17.02.20 19:43, Roger Pau Monne wrote:
Hello,

Commit:

5500d265a2a8fa63d60c08beb549de8ec82ff7a5
x86/smp: use APIC ALLBUT destination shorthand when possible

Introduced a bogus usage of the scratch cpumask: it was used in a
function that can be called from interrupt context, and hence using
the scratch cpumask there is not safe. Patch #5 fixes that usage, and
also prevents the use of any per-CPU variables when send_IPI_mask is
called from #MC or #NMI context. The previous patches are preparatory
changes.

Patch #6 adds some debug infrastructure to make sure the scratch cpumask
is used in the right context, and hence should prevent further misuses.
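
(A purely hypothetical sketch of what such a check could look like; the
names and the exact context checks below are made up for illustration
and are not the actual patch #6:)

static DEFINE_PER_CPU(cpumask_t, scratch_cpumask);
static DEFINE_PER_CPU(void *, scratch_cpumask_use);

cpumask_t *get_scratch_cpumask(void)
{
    /*
     * The scratch mask is only safe outside of IRQ/#MC/#NMI context and
     * must not be handed out while already in use.  in_mce_handler() and
     * in_nmi_handler() stand in for whatever context checks are actually
     * available.
     */
    ASSERT(!in_irq() && !in_mce_handler() && !in_nmi_handler());
    ASSERT(!this_cpu(scratch_cpumask_use));
    this_cpu(scratch_cpumask_use) = __builtin_return_address(0);

    return &this_cpu(scratch_cpumask);
}

void put_scratch_cpumask(void)
{
    this_cpu(scratch_cpumask_use) = NULL;
}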

I wonder whether it wouldn't be better to have a common percpu scratch
cpumask infrastructure instead of introducing local ones all over the
hypervisor.

So basically: an array of percpu cpumasks allocated when bringing up a
cpu (this saves memory, as the masks wouldn't need to cover NR_CPUS
cpus), a percpu counter of the next free index, and get_ and put_
functions acting in a LIFO manner.

This would help remove all the remaining on-stack cpumasks, and any
illegal nesting would be avoided. The only remaining question would be
the size of the array.
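
Something along these lines, as a rough sketch only (all names, the pool
depth and the allocation hook below are illustrative assumptions, not an
existing interface):

#define SCRATCH_CPUMASK_DEPTH 4   /* pool size: the open question above */

struct scratch_cpumasks {
    cpumask_var_t masks[SCRATCH_CPUMASK_DEPTH];
    unsigned int used;
};
static DEFINE_PER_CPU(struct scratch_cpumasks, scratch_cpumasks);

/* CPU bringup: allocate masks sized for the actual number of CPUs. */
static int alloc_scratch_cpumasks(unsigned int cpu)
{
    struct scratch_cpumasks *s = &per_cpu(scratch_cpumasks, cpu);
    unsigned int i;

    for ( i = 0; i < SCRATCH_CPUMASK_DEPTH; i++ )
        if ( !zalloc_cpumask_var(&s->masks[i]) )
            return -ENOMEM;

    return 0;
}

cpumask_t *get_scratch_cpumask(void)
{
    struct scratch_cpumasks *s = &this_cpu(scratch_cpumasks);

    BUG_ON(s->used >= SCRATCH_CPUMASK_DEPTH);

    return s->masks[s->used++];
}

void put_scratch_cpumask(const cpumask_t *mask)
{
    struct scratch_cpumasks *s = &this_cpu(scratch_cpumasks);

    /* LIFO discipline: the mask returned must be the last one taken. */
    ASSERT(s->used && mask == s->masks[s->used - 1]);
    s->used--;
}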

Thoughts?

I like the approach, but there is a major caveat.

It is certainly problematic that we have both cpumask_scratch and
scratch_cpumask, which have different rules for how to use them safely,
and now we're gaining custom logic to fix up the recursive safety of one
of them.

That said, I'm pretty sure it will be x86-specific, because the safety
of this is tied to using per-pcpu stacks rather than per-vcpu stacks.
The only reason the scheduler cpumasks are safe for use on ARM is that
the scheduler code which uses them doesn't call schedule() in the
middle of use.

No, the reason the scheduler cpumasks are safe is that using one of
those requires taking the scheduler lock of the CPU that has the mask in
its percpu data.

That restriction could probably be dropped if the scheduler were using
the common infrastructure.
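
For illustration, the locking pattern looks roughly like this
(cpumask_scratch_cpu() and the pcpu_schedule_lock helpers are the
scheduler-side names as far as I remember; hard_affinity and online_cpus
are just placeholder masks):

unsigned int cpu = smp_processor_id();
spinlock_t *lock = pcpu_schedule_lock_irq(cpu);

/*
 * Nothing else can touch this CPU's scratch mask while its scheduler
 * lock is held, and the code using it doesn't call schedule() in the
 * middle, so no nesting can occur.
 */
cpumask_and(cpumask_scratch_cpu(cpu), hard_affinity, online_cpus);

pcpu_schedule_unlock_irq(lock, cpu);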


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

