So, I've had a look (but only a quick one).

If we want to do something specific within the domain destruction path,
we can add an rcu_barrier() there (I mean in domain_destroy()).
However, that does not feel right either. Also, how can we be sure that
the CPU never going through idle (as far as Xen knows, at least) isn't
going to be a problem for other RCU calls as well?

Another thing that we can do is to act on the parameters that control
the threshold which decides when a quiescent state is forced. This was
basically what Julien was suggesting, but I would still avoid doing
that unconditionally.

So, basically, in this hackish patch attached, I added a new boot
command line argument, called rcu_force_quiesc. If set to true,
thresholds are set so that quiescence is always forced at each
invocation of call_rcu(). And even if the new param is not explicitly
specified, I do tweak the threshold when "wfi=native" is specified.
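
To make the idea concrete, here is a tiny standalone model of the
threshold check. It is only a sketch, not the attached patch: the
struct, the function names, the qhimark field and the default of 10000
are all assumptions made for illustration, and treating "wfi=native"
exactly like rcu_force_quiesc is a simplification (the patch only
"tweaks" the threshold in that case).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: names follow the discussion, not the real Xen code. */
struct rcu_model {
    long qlen;            /* callbacks queued and not yet invoked */
    long qhimark;         /* force quiescence when qlen exceeds this */
    unsigned long forced; /* number of times quiescence was forced */
};

/* Pick the threshold from the (hypothetical) boot-time settings. */
static void rcu_model_init(struct rcu_model *m, bool rcu_force_quiesc,
                           bool wfi_native)
{
    assert(m != NULL);
    m->qlen = 0;
    m->forced = 0;
    /*
     * Assumed values: 10000 as the regular threshold, 0 so that every
     * enqueue crosses the threshold when forcing is requested.
     */
    m->qhimark = (rcu_force_quiesc || wfi_native) ? 0 : 10000;
}

/*
 * Model of queueing one callback, as call_rcu() would do. Returns true
 * if this invocation would force a quiescent state.
 */
static bool rcu_model_call(struct rcu_model *m)
{
    assert(m != NULL);
    if (++m->qlen > m->qhimark) {
        m->forced++;
        return true;
    }
    return false;
}
```

With the threshold at 0, every call to rcu_model_call() reports that
quiescence would be forced, while with the assumed default the first
calls do not.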

Milan, can you apply this patch, add "wfi=native" again, and re-test?
If it works, we'll decide what to do next.

E.g., we can expose the RCU thresholds via an appropriate set of boot
time parameters (as Linux, where this code comes from, did/does) and
document how they should be set if one wants to use "wfi=native".

