[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 04/13] cpufreq: Add Hardware P-State (HWP) driver



On Fri, May 28, 2021 at 2:35 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 27.05.2021 20:50, Jason Andryuk wrote:
> > On Wed, May 26, 2021 at 11:00 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >>
> >> On 03.05.2021 21:28, Jason Andryuk wrote:
> >>> +    hwp_verbose("HWP: FAST_IA32_HWP_REQUEST %ssupported\n",
> >>> +                eax & CPUID6_EAX_FAST_HWP_MSR ? "" : "not ");
> >>> +    if ( eax & CPUID6_EAX_FAST_HWP_MSR )
> >>> +    {
> >>> +        if ( rdmsr_safe(MSR_FAST_UNCORE_MSRS_CAPABILITY, val) )
> >>> +            hwp_err("error 
> >>> rdmsr_safe(MSR_FAST_UNCORE_MSRS_CAPABILITY)\n");
> >>> +
> >>> +        hwp_verbose("HWP: MSR_FAST_UNCORE_MSRS_CAPABILITY: %016lx\n", 
> >>> val);
> >>
> >> Missing "else" above here?
> >
> > Are unbalanced braces acceptable or must they be balanced?  Is this 
> > acceptable:
> > if ()
> >   one;
> > else {
> >   two;
> >   three;
> > }
>
> Yes, it is. But I don't see how the question relates to my comment.
> All that needs to go in the else's body is the hwp_verbose().

'val' shouldn't be used to set features when the rdmsr fails, so the
following code needs to be within the else.  Unless you want to rely
on a failed rdmsr returning 0.

> >>> +static void hdc_set_pkg_hdc_ctl(bool val)
> >>> +{
> >>> +    uint64_t msr;
> >>> +
> >>> +    if ( rdmsr_safe(MSR_IA32_PKG_HDC_CTL, msr) )
> >>> +    {
> >>> +        hwp_err("error rdmsr_safe(MSR_IA32_PKG_HDC_CTL)\n");
> >>
> >> I'm not convinced of the need of having such log messages after
> >> failed rdmsr/wrmsr, but I'm definitely against them getting logged
> >> unconditionally. In verbose mode, maybe.
> >
> > We should not fail the rdmsr if our earlier cpuid check passed.  So in
> > that respect these messages can be removed.  The benefit here is that
> > you can see which MSR failed.  If you relied on extable_fixup, you
> > would have to cross reference manually.  How will administors know if
> > hwp isn't working properly there are no messages?
>
> This same question would go for various other MSR accesses which
> might fail, but which aren't accompanied by an explicit log message.
> I wouldn't mind a single summary message reporting if e.g. HWP
> setup failed. More detailed analysis of such would be more of a
> developer's than an administrator's job then anyway.
>
> > For HWP in general, the SDM says to check CPUID for the availability
> > of MSRs.  Therefore rdmsr/wrmsr shouldn't fail.  During development, I
> > saw wrmsr failures when with "Valid Maximum" and other "Valid" bits
> > set.  I think that was because I hadn't set up the Package Request
> > MSR.  That has been fixed by not using those bits.  Anyway,
> > rdmsr/wrmsr shouldn't fail, so how much code should be put into
> > checking for those failures which shouldn't happen?
>
> If there's any risk of accesses causing a fault, using *msr_safe()
> is of course the right choice. Beyond that you will need to consider
> what the consequences are. Sadly this needs doing on every single
> case individually. See "Handling unexpected conditions" section of
> ./CODING_STYLE for guidelines (even if the specific constructs
> aren't in use here).

Using *msr_safe(), I think the worst case is the users can't set HWP
in the way they want.  So power/performance may not be what the users
wanted, but Xen won't crash.  The hardware will throttle itself if
needed for self-protection, so that should be okay as well.

> >>> +static void hdc_set_pm_ctl1(bool val)
> >>> +{
> >>> +    uint64_t msr;
> >>> +
> >>> +    if ( rdmsr_safe(MSR_IA32_PM_CTL1, msr) )
> >>> +    {
> >>> +        hwp_err("error rdmsr_safe(MSR_IA32_PM_CTL1)\n");
> >>> +
> >>> +        return;
> >>> +    }
> >>> +
> >>> +    msr = val ? IA32_PM_CTL1_HDC_Allow_Block : 0;
> >>
> >> Same here then, and ...
> >>
> >>> +static void hwp_fast_uncore_msrs_ctl(bool val)
> >>> +{
> >>> +    uint64_t msr;
> >>> +
> >>> +    if ( rdmsr_safe(MSR_FAST_UNCORE_MSRS_CTL, msr) )
> >>> +        hwp_err("error rdmsr_safe(MSR_FAST_UNCORE_MSRS_CTL)\n");
> >>> +
> >>> +    msr = val;
> >>
> >> ... here (where you imply bit 0 instead of using a proper
> >> constant).
> >>
> >> Also for all three functions I'm not convinced their names are
> >> in good sync with their parameters being boolean.
> >
> > Would you prefer something named closer to the bit names like:
> > hdc_set_pkg_hdc_ctl -> hdc_set_pkg_enable
> > hdc_set_pm_ctl1 -> hdc_set_allow_block
> > hwp_fast_uncore_msrs_ctl -> hwp_fast_request_enable
>
> My primary request is for function name, purpose, and parameter(s)
> to be in line. So e.g. if you left the parameters as boolean, then
> I think your suggested name changes make sense. Alternatively you
> could make the functions e.g. be full register updating ones, with
> suitable parameters, which could be a mask-to-set and mask-to-clear.

I'm going to use the replacement names while keeping the boolean
argument.  Those MSRs only have single bits defined today, so
functions with boolean parameters are simpler.

> >>> +static void hwp_read_capabilities(void *info)
> >>> +{
> >>> +    struct cpufreq_policy *policy = info;
> >>> +    struct hwp_drv_data *data = hwp_drv_data[policy->cpu];
> >>> +
> >>> +    if ( rdmsr_safe(MSR_IA32_HWP_CAPABILITIES, data->hwp_caps) )
> >>> +    {
> >>> +        hwp_err("CPU%u: error rdmsr_safe(MSR_IA32_HWP_CAPABILITIES)\n",
> >>> +                policy->cpu);
> >>> +
> >>> +        return;
> >>> +    }
> >>> +
> >>> +    if ( rdmsr_safe(MSR_IA32_HWP_REQUEST, data->curr_req.raw) )
> >>> +    {
> >>> +        hwp_err("CPU%u: error rdmsr_safe(MSR_IA32_HWP_REQUEST)\n", 
> >>> policy->cpu);
> >>> +
> >>> +        return;
> >>> +    }
> >>> +}
> >>
> >> This function doesn't indicate failure to its caller(s), so am I
> >> to understand that failure to read either ofthe MSRs is actually
> >> benign to the driver?
> >
> > They really shouldn't fail since the CPUID check passed during
> > initialization.  If you can't read/write MSRs, something is broken and
> > the driver cannot function.  The machine would still run, but HWP
> > would be uncontrollable.  How much care should be given to
> > "impossible" situations?
>
> See above. The main point is to avoid crashing. In the specific
> case here I think you could simply drop both the log messages and
> the early return, assuming the caller can deal fine with the zero
> value(s) that rdmsr_safe() will substitute for the MSR value(s).
> Bailing early, otoh, calls for handing back an error indicator
> imo.

I changed it to have failure set curr_req.raw = -1.  Then
cpufreq_driver.init returns -ENODEV in that case.

> >>> +    policy->governor = &hwp_cpufreq_governor;
> >>> +
> >>> +    data = xzalloc(typeof(*data));
> >>
> >> Commonly we specify the type explicitly in such cases, rather than using
> >> typeof(). I will admit though that I'm not entirely certain which one's
> >> better. But consistency across the code base is perhaps preferable for
> >> the time being.
> >
> > I thought typeof(*data) is always preferable since it will always be
> > the matching type.  I can change it, but I feel like it's a step
> > backwards.
>
> There's exactly one similar use in the entire code base. The original
> idea with xmalloc() was that one would explicitly specify the intended
> type, such that without looking elsewhere one can see what exactly is
> to be allocated. One could further rely on the compiler warning if
> there was a disagreement between declaration and assignment.

Oh, okay.  Thanks for the explanation of xmalloc().

> >>> +    if ( feature_hwp_energy_perf )
> >>> +        data->energy_perf = 0x80;
> >>> +    else
> >>> +        data->energy_perf = 7;
> >>
> >> Where's this 7 coming from? (You do mention the 0x80 at least in the
> >> description.)
> >
> > When HWP energy performance preference is unavailable, it falls back
> > to IA32_ENERGY_PERF_BIAS with a 0-15 range.  Per the SDM Vol3 14.3.4,
> > "A value of 7 roughly translates into a hint to balance performance
> > with energy consumption."  I will add a comment.
>
> Actually, as per a comment on a later patch, I'm instead expecting
> you to add a #define, the name of which will serve as comment.

Yes, sounds good.

Regards,
Jason



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.