[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 6/9] x86/intel_pstate: the main boby of the intel_pstate driver



On 29/05/2015 16:46, Jan Beulich wrote
> >>> On 29.05.15 at 10:19, <wei.w.wang@xxxxxxxxx> wrote:
> > On 26/05/2015 21:58, Jan Beulich wrote
> >> >>> On 13.05.16 at 09:50, <wei.w.wang@xxxxxxxxx> wrote:
> >> > +static int intel_pstate_verify_policy(struct cpufreq_policy
> >> > +*policy) {
> >> > +    cpufreq_verify_within_limits(policy, policy->cpuinfo.min_freq,
> >> > +                                     policy->cpuinfo.max_freq);
> >> > +
> >> > +    if ( policy->policy != CPUFREQ_POLICY_POWERSAVE   &&
> >> > +         policy->policy != CPUFREQ_POLICY_PERFORMANCE &&
> >> > +         policy->policy != CPUFREQ_POLICY_USERSPACE   &&
> >> > +         policy->policy != CPUFREQ_POLICY_ONDEMAND )
> >>
> >> switch()
> >
> > How would we use switch() here?
> 
>     switch ( policy->policy )
>     {
>     case CPUFREQ_POLICY_POWERSAVE:
> 
> etc. But I thought that to be obvious, so I'm not sure I understand what you
> don't understand.

I thought there would be a special usage of switch() here, but no.
So, using switch, we will have 
switch ( policy->policy )
 {
   case CPUFREQ_POLICY_POWERSAVE:
   case CPUFREQ_POLICY_PERFORMANCE:
   case CPUFREQ_POLICY_USERSPACE:
   case CPUFREQ_POLICY_ONDEMAND:      
                                                return 0;
   case default: 
                                                return -EINVAL
}

Is there a particular reason why we need to change to this style? I think using 
if() looks more straightforward, since this is just a condition check.
                                        
> 
> >> > +static int intel_pstate_msrs_not_valid(void) {
> >> > +    /* Check that all the msr's we are using are valid. */
> >> > +    u64 aperf, mperf, tmp;
> >> > +
> >> > +    rdmsrl(MSR_IA32_APERF, aperf);
> >> > +    rdmsrl(MSR_IA32_MPERF, mperf);
> >> > +
> >> > +    if (!pstate_funcs.get_max() ||
> >> > +        !pstate_funcs.get_min() ||
> >> > +        !pstate_funcs.get_turbo())
> >> > +        return -ENODEV;
> >> > +
> >> > +    rdmsrl(MSR_IA32_APERF, tmp);
> >> > +    if (!(tmp - aperf))
> >>
> >> Why not "if(tmp == aperf)"? And - is it guaranteed that APERF (and
> >> MPERF
> >> below) incremented in the meantime?
> >
> > Will change it to "if (tmp == aperf)".
> > According to the SDM, IA32_MPERF MSR (E7H) increments in proportion to
> > a fixed frequency, and IA32_APERF MSR increments in proportion to
> > actual performance. So, as long as the two MSRs are valid, their
> > values will be increased.
> 
> The "in proportion" is what makes me nervous: What if the proportion is 1 in
> every 1000 cycles?

Right, this may cause a problem. I will remove that part. Then, it will be:

static int intel_pstate_msrs_not_valid(void)
{
         if ( !pstate_funcs.get_max() ||
               !pstate_funcs.get_min() ||
               !pstate_funcs.get_turbo())
                  return -ENODEV;
         
          return 0;
}

> 
> >> > +int __init intel_pstate_init(void) {
> >> > +    int cpu, rc = 0;
> >> > +    const struct x86_cpu_id *id;
> >> > +    struct cpu_defaults *cpu_info;
> >> > +
> >> > +    if (cpuid_ecx(6) & 0x1)
> >> > +        set_bit(X86_FEATURE_APERFMPERF,
> >> > + &boot_cpu_data.x86_capability);
> >>
> >> This should be consolidated out of the other cpufreq drivers into
> >> cpu/common.c.
> >
> > How about moving it to the bottom of init_intel() in cpu/intel.c ?
> 
> It's not Intel specific, so it belongs in cpu/common.c.

Ok, I will put it into the cpu_init() function there.

> 
> >> > --- a/xen/include/asm-x86/acpi.h
> >> > +++ b/xen/include/asm-x86/acpi.h
> >> > @@ -32,6 +32,8 @@
> >> >  #define COMPILER_DEPENDENT_INT64   long long
> >> >  #define COMPILER_DEPENDENT_UINT64  unsigned long long
> >> >
> >> > +extern int intel_pstate_init(void);
> >>
> >> Why in acpi.h?
> >
> > I was thinking that p-state is part of ACPI. Do you prefer to create a
> > new .h called cpufreq.h, just like the cpuidle.h there?
> 
> ACPI is a way to convey information about P- (and C- and T-) states, but the
> latter are imo not tied to ACPI. In fact your patch series here proves the
> opposite: You add code dealing with P-states without using information
> coming from ACPI. I think this should go into what currently is
> acpi/cpufreq/cpufreq.h, which is expected to be moved out of acpi/ anyway
> (I seem to even recall an ARM side series aiming at doing that).

Ok, I will move it back to acpi/cpufreq/cpufreq.h. 

Best,
Wei

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.