
Re: [Xen-devel] [PATCH 0/9] Porting the intel_pstate driver to Xen



On 23/04/2015 15:27, Jan Beulich wrote:
> >>> On 24.04.15 at 07:12, <wei.w.wang@xxxxxxxxx> wrote:
> > On 23/04/2015 22:09, Jan Beulich wrote:
> >> >>> On 23.04.15 at 15:31, <wei.w.wang@xxxxxxxxx> wrote:
> >> > The intel_pstate.c file under xen/arch/x86/acpi/cpufreq/ contains
> >> > all the logic for selecting the current P-state. It follows its
> >> > implementation in the kernel. Instead of using the traditional
> >> > cpufreq governors, intel_pstate implements its internal governor in
> >> > the "setpolicy()".
> >>
> >> And this internal governor behaves how? Like ondemand, powersave,
> >> performance, or yet something else? And how would its behavior be
> >> changed?
> >
> > > In the kernel intel_pstate implementation, they have two internal governors:
> > Powersave and Performance.
> > Powersave is similar to the old (cpufreq) ondemand governor. A timer
> > > function is periodically invoked to sample the CPU busy info (e.g.
> > > it will increase when a CPU-intensive workload is running).
> > However, the final calculated target value is clamped into the
> > [min_pct, max_pct] limit interval.
> > > The Performance governor is actually a special case of Powersave, when
> > > min_pct = max_pct = 100%. This is the same as the old performance
> > > governor.
> 
> So a true powersave one would then be accomplished by setting min_pct =
> max_pct = <some value smaller than 100>%. Is there a limit on the valid
> percentages to be specified here?


In the old driver, the powersave governor simply sets the CPU to run at the
lowest possible performance state. That governor does not exist in the
intel_pstate driver.
The intel_pstate driver changes the terminology by using "powersave" to refer
to what was previously called "ondemand". This is admittedly confusing, but we
can think of it this way: the driver has only two modes, the max performance
mode and the ondemand mode. "ondemand" is the one that saves power (and in a
more reasonable way than the previous "powersave", which simply sets the CPU
to run at the lowest performance state). Anyway, we can certainly change the
name if it is misleading.

The valid pct value range is 0 to 100. 
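
To make the limit handling concrete, here is a minimal sketch of how a
computed target could be clamped into the [min_pct, max_pct] interval. The
names pct_limits, clamp_int and limit_target_pct are hypothetical and are not
taken from the patch series; the sketch only assumes that all values are
percentages in the 0 to 100 range described above.

```c
/* Hypothetical illustration only; not code from the patch series. */

/* Limit interval, both values in percent (0..100). */
struct pct_limits {
    int min_pct;
    int max_pct;
};

/* Clamp v into [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/*
 * Clamp a calculated target (in percent) into the configured interval.
 * The limits themselves are first sanitised to 0..100, and max_pct is
 * forced to be at least min_pct (an assumption made for this sketch).
 */
static int limit_target_pct(const struct pct_limits *lim, int target_pct)
{
    int lo = clamp_int(lim->min_pct, 0, 100);
    int hi = clamp_int(lim->max_pct, lo, 100);

    return clamp_int(target_pct, lo, hi);
}
```

With min_pct = max_pct = 100 the function returns 100 for any input, which
is the "Performance as a special case of Powersave" behaviour discussed in
this thread.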

> 
> Also, you calling "powersave" what supposedly is "ondemand"
> makes me nervous about it not immediately raising the CPU freq when load
> increases, yet imo that's a fundamental requirement for server kind loads
> where you don't want to run in "performance" mode. Can you clarify the
> behavior here?

The timer fires every 10ms to update the CPU P-state according to the sampled
workload info.
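
As a rough sketch of what a 10ms sampling period means for reaction time,
the following models one timer tick and the delay until the next one. It is
a simplified illustration, not the driver's actual calculation: the names
cpu_sample_state, sample_tick and ms_until_next_tick are hypothetical, the
policy of requesting a level equal to the measured busy percentage is an
assumption, and ticks are assumed to fall on multiples of the period.

```c
/* Hypothetical illustration only; not code from the patch series. */

#define SAMPLE_PERIOD_MS 10 /* timer period mentioned in this thread */

struct cpu_sample_state {
    int min_pct; /* lower limit, percent */
    int max_pct; /* upper limit, percent */
    int cur_pct; /* level requested at the last tick, percent */
};

/*
 * One timer tick. busy_pct is the utilisation measured over the last
 * period. Assumed policy: request a level equal to busy_pct, then clamp
 * it into [min_pct, max_pct].
 */
static int sample_tick(struct cpu_sample_state *s, int busy_pct)
{
    int target = busy_pct;

    if (target < s->min_pct)
        target = s->min_pct;
    if (target > s->max_pct)
        target = s->max_pct;

    s->cur_pct = target;
    return target;
}

/*
 * Milliseconds from now_ms (>= 0) until the next tick, assuming ticks
 * occur at multiples of SAMPLE_PERIOD_MS.
 */
static int ms_until_next_tick(int now_ms)
{
    return SAMPLE_PERIOD_MS - (now_ms % SAMPLE_PERIOD_MS);
}
```

Under these assumptions, a load increase is first seen at the next tick, so
the delay before the P-state can be reconsidered is bounded by the sampling
period rather than being immediate.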

Best,
Wei

> 
> > Here in the ported version, the limit interval can be set via the newly
> > added interfaces in xenpm. I think we can make use of only the
> > Powersave governor, and the Performance governor can actually be
> > simply achieved by setting min_pct = max_pct = 100%.
> >
> > If all of you agree, I will remove the Performance governor
> > related code in the next version of the patchset.
> 
> Yes, this certainly seems to make sense with the above.
> 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
