[Xen-devel] Power aware credit scheduler
The existing credit scheduler is not power aware. To achieve better power savings with negligible performance impact, the following areas may be tweaked; they are listed here for comments first. The goal is not to save power blindly at the expense of performance, e.g. we don't want to prevent migration when there are free cpus while some runqueues have pending work. But when the free computing power exceeds the existing requirement, a power aware policy can choose the less power-intrusive decision. Even in the latter case, this is controllable through a scheduler parameter such as csched_private.power, exposed to the user.

----
a) when there are more idle cpus than required

a.1) csched_cpu_pick

The existing policy is to pick a cpu with more idle neighbours, to avoid shared resource contention among cores or threads. However, from a power P.O.V., a package C-state saves much more power than a per-core C-state. From this angle, it might be better to keep an idle package continuously idle and instead pick idle cores/threads whose neighbours are already busy, if csched_private.power is set. The performance/watt ratio improves, though absolute performance takes a small hit.

a.2) csched_vcpu_wake

Similarly, instead of blindly kicking all idle cpus in a rush, idle cpus can be woken selectively with the power factor taken into account.

----
b) when a physical cpu resides in an idle C-state

Avoid unnecessary work, to keep C-state residency longer. For example, the accounting process (the tick timer, more specifically) can be stopped before C-state entry and resumed after wake-up. The point is that no accounting is required while the current cpu is idle, and any runqueue change triggered from another cpu sends an IPI to this cpu, which brings it back to the C0 state with accounting resumed. Since the residency period may be longer than the accounting period (30ms), csched_tick should be aware of the resume event so it can adjust elapsed credits.
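To make the resume-time adjustment in b) concrete, here is a minimal sketch of one way it could work. It is not taken from sched_credit.c: the struct, the function names and the 10ms tick period are assumptions for illustration, and only the 30ms accounting period comes from the text above. The idea is to record when the tick timer was stopped, then on wake-up advance the tick counter by the number of ticks missed and report how many accounting periods went by.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants: 3 ticks of 10 ms give the 30 ms accounting period. */
#define TICK_PERIOD_MS  10u
#define TICKS_PER_ACCT  3u

/* Hypothetical per-cpu tick state; not the real per-cpu scheduler struct. */
struct pcpu_tick {
    uint64_t stopped_at_ms;  /* time at which the tick timer was stopped */
    uint32_t tick;           /* tick counter, kept as if never stopped   */
    int      stopped;        /* non-zero while the timer is stopped      */
};

/* Called just before C-state entry: remember when ticking stopped. */
static void tick_suspend(struct pcpu_tick *p, uint64_t now_ms)
{
    p->stopped_at_ms = now_ms;
    p->stopped = 1;
}

/*
 * Called after wake-up (e.g. on the IPI from another cpu). Advances the
 * tick counter by the whole ticks missed while idle and returns how many
 * accounting-period boundaries were crossed during the residency.
 */
static uint32_t tick_resume(struct pcpu_tick *p, uint64_t now_ms)
{
    assert(p->stopped && now_ms >= p->stopped_at_ms);

    uint32_t missed = (uint32_t)((now_ms - p->stopped_at_ms) / TICK_PERIOD_MS);
    uint32_t periods_before = p->tick / TICKS_PER_ACCT;

    p->tick += missed;
    p->stopped = 0;

    return p->tick / TICKS_PER_ACCT - periods_before;
}
```

For instance, with the tick counter at 2, stopping at 1000ms and resuming at 1095ms means nine whole ticks were missed; the counter moves to 11 and three accounting boundaries were crossed. What csched_tick should then do with that count (e.g. how credits are adjusted) is left open here, as it is in the proposal.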
----
c) when a cpu's frequency is scaled dynamically

When cpufreq/Px is enabled, a cpu's frequency is adjusted between operating points by an on-demand governor. So csched_acct may need to take the frequency difference among cpus into consideration, and the total available credits won't be a simple 300 * online cpu_number.

----
Of course there are a bunch of research areas for adding more power factors into scheduler policy. But the above is the fundamental stuff which we believe would help the scheduler understand power requirements without a bad impact on performance/watt in the first place.

Comments are appreciated.

Thanks,
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
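As an illustration of point c), the sketch below shows one possible way to compute the total credits when cpus run at different operating points. It assumes a cpu's share of the 300 credits scales linearly with its current frequency relative to its maximum; that scaling rule, the struct and the function names are hypothetical and not part of the existing scheduler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-cpu credits per accounting period, the "300" from the text. */
#define CREDITS_PER_ACCT 300u

/* Hypothetical per-cpu frequency record; not a Xen structure. */
struct cpu_freq {
    uint32_t cur_khz;  /* current operating point               */
    uint32_t max_khz;  /* highest operating point, must be != 0 */
};

/* Credits one cpu can supply per period, scaled by cur/max (assumed linear). */
static uint32_t cpu_credits(const struct cpu_freq *c)
{
    assert(c->max_khz != 0 && c->cur_khz <= c->max_khz);
    return (uint32_t)(((uint64_t)CREDITS_PER_ACCT * c->cur_khz) / c->max_khz);
}

/* Total credits across all online cpus, replacing 300 * online cpu_number. */
static uint32_t total_credits(const struct cpu_freq *cpus, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += cpu_credits(&cpus[i]);
    return total;
}
```

With three cpus at 2.4GHz, 1.2GHz and 1.6GHz, each with a 2.4GHz maximum, this gives 300 + 150 + 200 = 650 credits rather than 900. When every cpu runs at its maximum, the result falls back to 300 * online cpu_number.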