[Xen-devel] [PATCH 0/5] xen: RCU: x86/ARM: Add support of rcu_idle_{enter, exit}
Hello everyone,

It took some time, and, honestly, it was a bit of a nightmare, but I think I did manage to find a way to get our RCU implementation into a state where it does not only work "by chance" (as it does right now! :-O).

A bit of background in this post:
https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg105388.html

About the actual patch series: apart from the first two preparatory patches (patch 1 was part of another series, but it's more convenient for me to send it in here), this is all about the following:

*) making sure that CPUs that are idle when an RCU grace period is recorded as starting don't make that grace period longer than it needs to be. In fact, if they're idle, they can't hold references, and can safely be ignored. We do this in patch 3, using a cpumask, as Linux was doing on s390 (which was tickless already, back when we imported the RCU code), and in the early days of dynticks support on x86;

*) making sure that CPUs that have RCU related stuff to do don't just go idle without actually doing it. This is rather simple: we just have to actually start using a function that we did import, but that isn't used or called right now: rcu_needs_cpu(). However, given that Xen is tickless (i.e., we stop all the timers when a CPU becomes idle), there's the problem of how to deal with the CPUs that have RCU callbacks queued. In fact, just using rcu_needs_cpu(), as we do in patch 4, means they'll spin until the end of the grace period. Although it's correct, we clearly don't want this, and yet, in patch 4, we let that happen;

*) finding a solution for the problem that using rcu_needs_cpu() (which is _correct_ and _necessary_) poses for CPUs with queued callbacks. This is what patch 5 does. Basically, we introduce some kind of periodic tick. This timer, however, is only started on those CPUs, and only when they enter an idle sleep.
   The idea is to make sure that, if they don't wake up soon enough by themselves (well, not that soon: the period is 10 msec), we poke them and give them a chance to check whether the grace period has actually ended, and to invoke the callbacks.

Without this series applied, this is the situation on x86, looking at when complete_domain_destroy() is invoked via call_rcu, and at when it is actually executed (the printks were added by me, for the sake of this experiment; they're not in the patches):

Idle:
(XEN) [ 238.758551] XXX calling complete_domain_destroy(d5), from CPU 8, at 238758535948 ns
(XEN) [ 239.338958] XXX executing complete_domain_destroy(d5), from CPU 8, at 239338943633 ns

With CPU load:
(XEN) [ 322.740427] XXX calling complete_domain_destroy(d7), from CPU 13, at 322740410755 ns
(XEN) [ 322.742598] XXX executing complete_domain_destroy(d7), from CPU 13, at 322742595289 ns

When the system is idle, the delay between the call and the execution is significant: about 580 msec here, against about 2.2 msec under load (which is counterintuitive, as the system is idle!!). Moreover, there's a big difference in the behavior of the system between the idle and the loaded case.

With this series applied, here's what happens:

Idle:
(XEN) [ 106.341590] XXX calling complete_domain_destroy(d1), from CPU 9, at 106341573553 ns
(XEN) [ 106.344391] XXX executing complete_domain_destroy(d1), from CPU 9, at 106344387574 ns

With CPU load:
(XEN) [ 176.166842] XXX calling complete_domain_destroy(d2), from CPU 13, at 176166826183 ns
(XEN) [ 176.167571] XXX executing complete_domain_destroy(d2), from CPU 13, at 176167568269 ns

That is about 2.8 msec when idle and about 0.7 msec under load. Which I call much better!
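To make the three steps described above (patches 3 to 5) a bit more concrete, here is a toy userspace model in C. It is only a sketch under loud assumptions: it is not the code from the patches, every name in it (struct rcu_model, the model_*() functions, MODEL_NR_CPUS) is made up for illustration, and model_needs_cpu() is a deliberately simplified stand-in for rcu_needs_cpu(). It only tries to show the bookkeeping: idle CPUs are left out when a grace period starts, and a CPU that goes idle with a callback queued asks for the periodic timer instead of spinning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS  4u   /* hypothetical CPU count, kept tiny on purpose */
#define MODEL_ALL_CPUS ((1u << MODEL_NR_CPUS) - 1u)

/* Toy state; all names here are invented for this sketch. */
struct rcu_model {
    uint32_t idle_mask;      /* CPUs currently idle (the cpumask idea of patch 3) */
    uint32_t pending_mask;   /* CPUs the current grace period still waits for */
    uint32_t callback_mask;  /* CPUs with a callback queued */
    bool     gp_in_progress; /* true until every pending CPU has been quiescent */
};

static uint32_t cpu_bit(unsigned int cpu)
{
    assert(cpu < MODEL_NR_CPUS);
    return 1u << cpu;
}

/* Stand-in for call_rcu(): remember that this CPU has a callback queued. */
static void model_call_rcu(struct rcu_model *m, unsigned int cpu)
{
    m->callback_mask |= cpu_bit(cpu);
}

/* Grace period start: CPUs that are idle now can't hold references, skip them. */
static void model_gp_start(struct rcu_model *m)
{
    m->pending_mask = MODEL_ALL_CPUS & ~m->idle_mask;
    m->gp_in_progress = (m->pending_mask != 0);
}

/* A CPU passes through a quiescent state; the last one ends the grace period. */
static void model_quiescent(struct rcu_model *m, unsigned int cpu)
{
    m->pending_mask &= ~cpu_bit(cpu);
    if (m->pending_mask == 0)
        m->gp_in_progress = false;
}

/* Simplified stand-in for rcu_needs_cpu(): is there RCU work left on this CPU? */
static bool model_needs_cpu(const struct rcu_model *m, unsigned int cpu)
{
    return (m->callback_mask & cpu_bit(cpu)) != 0;
}

/* Idle entry: returns true when the periodic (10 msec) timer should be armed. */
static bool model_idle_enter(struct rcu_model *m, unsigned int cpu)
{
    m->idle_mask |= cpu_bit(cpu);
    return model_needs_cpu(m, cpu);
}

static void model_idle_exit(struct rcu_model *m, unsigned int cpu)
{
    m->idle_mask &= ~cpu_bit(cpu);
}

/* Timer wakeup: run the callbacks only if the grace period has ended. */
static bool model_timer_fired(struct rcu_model *m, unsigned int cpu)
{
    model_idle_exit(m, cpu);
    if (m->gp_in_progress || !model_needs_cpu(m, cpu))
        return false;            /* nothing to do yet, go back to sleep */
    m->callback_mask &= ~cpu_bit(cpu);
    return true;                 /* callbacks invoked */
}
```

In this model, with CPUs 2 and 3 idle, a grace period started after model_call_rcu() on CPU 0 only waits for CPUs 0 and 1. CPU 0 can then go idle with the timer armed, and the first model_timer_fired() after CPU 1 reports its quiescent state is the one that runs the callback.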
:-)

This patch series addresses the XEN-27 issue, which I think Julien wants to consider a blocker for 4.10:
https://xenproject.atlassian.net/browse/XEN-27

There is a git branch available, with the series in, here:
git://xenbits.xen.org/people/dariof/xen.git rel/rcu/introduce-idle-enter-exit
http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/rel/rcu/introduce-idle-enter-exit
https://travis-ci.org/fdario/xen/builds/258044027

And, finally, Stefano and Julien, I've compile tested, but have not runtime tested, the patches on ARM.

Thanks and Regards,
Dario
---
Dario Faggioli (5):
  xen: in do_softirq() sample smp_processor_id() once and for all.
  xen: ARM: suspend the tick (if in use) when going idle.
  xen: RCU/x86/ARM: discount CPUs that were idle when grace period started.
  xen: RCU: don't let a CPU with a callback go idle.
  xen: RCU: avoid busy waiting until the end of grace period.

 xen/arch/arm/domain.c         | 33 ++++++++++++-----
 xen/arch/x86/acpi/cpu_idle.c  | 31 +++++++++++-----
 xen/arch/x86/cpu/mwait-idle.c | 15 ++++++--
 xen/arch/x86/domain.c         |  8 ++++
 xen/common/rcupdate.c         | 80 +++++++++++++++++++++++++++++++++++++++--
 xen/common/softirq.c          |  8 +---
 xen/include/xen/rcupdate.h    |  6 +++
 xen/include/xen/sched.h       |  6 ++-
 8 files changed, 153 insertions(+), 34 deletions(-)
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel