[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 6/9] x86/intel_pstate: the main boby of the intel_pstate driver
>>> On 29.05.15 at 10:19, <wei.w.wang@xxxxxxxxx> wrote: > On 26/05/2015 21:58, Jan Beulich wrote >> >>> On 13.05.16 at 09:50, <wei.w.wang@xxxxxxxxx> wrote: >> > +static int intel_pstate_verify_policy(struct cpufreq_policy *policy) >> > +{ >> > + cpufreq_verify_within_limits(policy, policy->cpuinfo.min_freq, >> > + policy->cpuinfo.max_freq); >> > + >> > + if ( policy->policy != CPUFREQ_POLICY_POWERSAVE && >> > + policy->policy != CPUFREQ_POLICY_PERFORMANCE && >> > + policy->policy != CPUFREQ_POLICY_USERSPACE && >> > + policy->policy != CPUFREQ_POLICY_ONDEMAND ) >> >> switch() > > How would we use switch() here? switch ( policy->policy ) { case CPUFREQ_POLICY_POWERSAVE: etc. But I thought that to be obvious, so I'm not sure I understand what you don't understand. >> > +static int intel_pstate_cpu_init(struct cpufreq_policy *policy) { >> > + struct cpudata *cpu; >> > + int rc; >> > + >> > + rc = intel_pstate_init_cpu(policy->cpu); >> >> Having both intel_pstate_cpu_init() and intel_pstate_init_cpu() is at least >> odd, the more that the latter is being called from only here. > > Are you suggesting to change the function name? If so, how about changing > intel_pstate_cpu_init() to intel_pstate_setup()? Either that, or fold the called function into the caller. >> > + if (rc) >> > + return rc; >> > + >> > + cpu = all_cpu_data[policy->cpu]; >> > + if (limits.min_perf_pct == 100 && limits.max_perf_pct == 100) >> > + policy->policy = CPUFREQ_POLICY_PERFORMANCE; >> > + else >> > + policy->policy = CPUFREQ_POLICY_ONDEMAND; >> > + >> > + policy->min = cpu->pstate.min_pstate * cpu->pstate.scaling; >> > + policy->max = cpu->pstate.turbo_pstate * cpu->pstate.scaling; >> > + policy->min_perf_pct = 0; >> > + policy->max_perf_pct = 100; >> > + policy->turbo_pct = get_turbo_pct(); >> > + >> > + /* cpuinfo and default policy values */ >> > + policy->cpuinfo.min_freq = cpu->pstate.min_pstate * cpu- >> >pstate.scaling; >> > + policy->cpuinfo.max_freq = >> > + cpu->pstate.turbo_pstate * cpu->pstate.scaling; >> > + policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL; >> > + cpumask_set_cpu(policy->cpu, policy->cpus); >> > + >> > + /* the first cpu do the init work for limits.min/max_policy_pct */ >> > + if (cpu->cpu == 0) { >> >> cpu->cpu == 0 doesn't mean this is the first CPU to come here. > > How about this one: > > If (limits.min_policy_pct == 0) { > limits.min_policy_pct = .... > limits.xx = .... > } > > limits.min_policy_pct won't be 0 if it is initialized. If that's guaranteed - fine with me. >> > +static int intel_pstate_msrs_not_valid(void) { >> > + /* Check that all the msr's we are using are valid. */ >> > + u64 aperf, mperf, tmp; >> > + >> > + rdmsrl(MSR_IA32_APERF, aperf); >> > + rdmsrl(MSR_IA32_MPERF, mperf); >> > + >> > + if (!pstate_funcs.get_max() || >> > + !pstate_funcs.get_min() || >> > + !pstate_funcs.get_turbo()) >> > + return -ENODEV; >> > + >> > + rdmsrl(MSR_IA32_APERF, tmp); >> > + if (!(tmp - aperf)) >> >> Why not "if(tmp == aperf)"? And - is it guaranteed that APERF (and MPERF >> below) incremented in the meantime? > > Will change it to "if (tmp == aperf)". > According to the SDM, IA32_MPERF MSR (E7H) increments in proportion to a > fixed frequency, and IA32_APERF MSR increments in proportion to actual > performance. So, as long as the two MSRs are valid, their values will be > increased. The "in proportion" is what makes me nervous: What if the proportion is 1 in every 1000 cycles? >> > +int __init intel_pstate_init(void) >> > +{ >> > + int cpu, rc = 0; >> > + const struct x86_cpu_id *id; >> > + struct cpu_defaults *cpu_info; >> > + >> > + if (cpuid_ecx(6) & 0x1) >> > + set_bit(X86_FEATURE_APERFMPERF, >> > + &boot_cpu_data.x86_capability); >> >> This should be consolidated out of the other cpufreq drivers into >> cpu/common.c. > > How about moving it to the bottom of init_intel() in cpu/intel.c ? It's not Intel specific, so it belongs in cpu/common.c. >> > --- a/xen/include/asm-x86/acpi.h >> > +++ b/xen/include/asm-x86/acpi.h >> > @@ -32,6 +32,8 @@ >> > #define COMPILER_DEPENDENT_INT64 long long >> > #define COMPILER_DEPENDENT_UINT64 unsigned long long >> > >> > +extern int intel_pstate_init(void); >> >> Why in acpi.h? > > I was thinking that p-state is part of ACPI. Do you prefer to create a new .h > called cpufreq.h, just like the cpuidle.h there? ACPI is a way to convey information about P- (and C- and T-) states, but the latter are imo not tied to ACPI. In fact your patch series here proves the opposite: You add code dealing with P-states without using information coming from ACPI. I think this should go into what currently is acpi/cpufreq/cpufreq.h, which is expected to be moved out of acpi/ anyway (I seem to even recall an ARM side series aiming at doing that). Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |