
Re: [Xen-devel] null scheduler bug

On Tue, 2018-09-25 at 19:49 +0200, Dario Faggioli wrote:
> [Adding a few people to the Cc-list. See below...]
> On Tue, 2018-09-25 at 12:15 +0100, Julien Grall wrote:
> > On 09/25/2018 10:02 AM, Dario Faggioli wrote:
> > > On Mon, 2018-09-24 at 22:46 +0100, Julien Grall wrote:
> > > > 
> My knowledge of RCU itself would need refreshing, though. I managed
> to become reasonably familiar with how the implementation we
> imported works back then, when working on the said issue, but I guess
> I'd better go check the code again.
> I'm Cc-ing the people who reviewed the patches and helped with
> the idle timer problem, in case anyone has bright ideas off the top
> of their head.
> Perhaps we should "just" get away from using RCU for domain
> destruction (but I'm just tossing the idea around, without much
> consideration about whether it's the right solution, or about how
> hard/feasible it really is).
> Or maybe we can still use the timer, in some special way, if we have
> wfi=native (or equivalent)...
So, I've had a look (but only a quick one).

If we want to do something specific within the domain destruction path,
we can add an rcu_barrier() there (I mean in domain_destroy()).
However, that does not feel right either. Also, how can we be sure that
a CPU never going through idle (as far as Xen knows, at least) isn't
going to be a problem for other RCU calls as well?

Another thing that we can do is to act on the parameters that control
the threshold which decides when a quiescent state is forced. This was
basically what Julien was suggesting, but I would still avoid doing
that unconditionally.

So, basically, in the attached hackish patch, I added a new boot
command line argument, called rcu_force_quiesc. If set to true,
thresholds are set so that quiescence is always forced at each
invocation of call_rcu(). And even if the new param is not explicitly
specified, I do tweak the threshold when "wfi=native" is specified.
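
To make the idea concrete, here is a small toy model of the threshold
logic. It is only a sketch, not the attached patch and not Xen code:
every name in it (struct rcu_model, rcu_model_init(), rcu_model_call())
is made up for illustration, and it assumes that "forcing at each
invocation" amounts to using a threshold of 0, and that wfi=native
selects the same threshold. The model also never drains the queue, so
it only shows when forcing would kick in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the Xen RCU implementation. */
struct rcu_model {
    long qlen;     /* callbacks queued so far (never drained in this model) */
    long qhimark;  /* queue length above which quiescence is forced */
    long forced;   /* number of times a quiescent state was forced */
};

/*
 * Choose the threshold. Assumption: a threshold of 0 means "force on
 * every call", and wfi=native picks the same value as the explicit
 * force_quiesc switch.
 */
void rcu_model_init(struct rcu_model *m, bool force_quiesc,
                    bool wfi_native, long default_qhimark)
{
    assert(m != NULL);
    m->qlen = 0;
    m->forced = 0;
    m->qhimark = (force_quiesc || wfi_native) ? 0 : default_qhimark;
}

/*
 * Queue one callback. Returns true if this call pushed the queue
 * length above the threshold, i.e. a quiescent state would be forced.
 */
bool rcu_model_call(struct rcu_model *m)
{
    assert(m != NULL);
    if (++m->qlen > m->qhimark) {
        m->forced++;
        return true;
    }
    return false;
}
```

With the threshold at 0, the very first rcu_model_call() already
reports forcing; with a non-zero default, forcing only starts once
more callbacks than the threshold have piled up.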

Milan, can you apply this patch, add "wfi=native" again, and re-test?
If it works, we'll decide what to do next.

E.g., we can expose the RCU thresholds via an appropriate set of boot
time parameters (like Linux, where this code comes from, did/does)
and document how they should be set if one wants to use "wfi=native".

Thanks and Regards,
<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

Attachment: rcu-quiesc-patch.patch
Description: Text Data

Attachment: signature.asc
Description: This is a digitally signed message part
