[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/cpufreq: Reset policy after enabling/disabling turbo status



On 15/11/2021 10:53, Jan Beulich wrote:
> On 12.11.2021 19:51, Andrew Cooper wrote:
>> On 10/11/2021 09:19, Jane Malalane wrote:
>>> Before, user would change turbo status but this had no effect: boolean
>>> was set but policy wasn't reevaluated.  Policy must be reevaluated so
>>> that CPU frequency is chosen according to the turbo status and the
>>> current governor.
>>>
>>> Therefore, add __cpufreq_governor() in cpufreq_update_turbo().
>>>
>>> Reported-by: <edvin.torok@xxxxxxxxxx>
>>> Signed-off-by: <jane.malalane@xxxxxxxxxx>
>>> ---
>>> CC: Jan Beulich <jbeulich@xxxxxxxx>
>>> CC: Ian Jackson <iwj@xxxxxxxxxxxxxx>
>>> ---
>>>
>>> Release rationale:
>>> Not taking this patch means that turbo status is misleading.
>>>
>>> Taking this patch is low-risk as it only uses a function that already
>>> exists and is already used for setting the chosen scaling governor.
>>> Essentially, this change is equivalent to running 'xenpm
>>> en/disable-turbo-mode' and, subsequently, running 'xenpm
>>> set-scaling-governor <current governor>'.
>>> ---
>>>  xen/drivers/cpufreq/utility.c | 6 +++++-
>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
>>> index b93895d4dd..5f200ff3ee 100644
>>> --- a/xen/drivers/cpufreq/utility.c
>>> +++ b/xen/drivers/cpufreq/utility.c
>>> @@ -417,10 +417,14 @@ int cpufreq_update_turbo(int cpuid, int new_state)
>>>      {
>>>          ret = cpufreq_driver.update(cpuid, policy);
>>>          if (ret)
>>> +        {
>>>              policy->turbo = curr_state;
>>> +            return ret;
>>> +        }
>>>      }
>>>  
>>> -    return ret;
>>> +    /* Reevaluate current CPU policy. */
>>> +    return __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
>>>  }
>> So, having looked through the manual, what the cpufreq_driver does for
>> turbo on Intel is bogus according to the SDM.
>>
>> There is a non-architectrual dance with IA32_MISC_ENABLE bit 38 (per
>> package) for firmware to configure turbo, but it manifests as another
>> dynamic CPUID bit (which I think we handle correctly).  It has the same
>> semantics as XD_DISABLE and CPUID_LIMIT so we might want to consider
>> adding it to the set of bits we clear during boot.
> This is quite confusing in the docs - at least one of the tables calls
> the bit "IDA Disable", while other entries at least also refer to the
> effect of disabling IDA. I'm afraid I have trouble connecting turbo
> mode and IDA disabling, unless both are two different names of the
> same thing. Maybe they really are, except that SDM vol 2 uses yet
> another slightly different term for CPUID[6].EAX[1]: "Intel Turbo Boost
> Technology".

SDM Vol3 14.3 uses IDA and turbo interchangeably in some cases.  It
reads as if IDA is the general technology name, and turbo is a sub-mode
for "doing it automatically without software involvement".

On CPUs which support IDA, the CPUID bit is !MISC_ENABLE[38], with
further adds to the confusion of which is which.

>> However, the correct way to turn off turbo is to set
>> IA32_PERF_CTL.TURBO_DISENGAGE bit, which is per logical processor.  As
>> such, it *should* behave far more like the AMD CPB path.
> I'm afraid public documentation has no mention of a bit of this name.
> Considering the above I wonder whether you mean "IDA engage" (bit 32),
> albeit this doesn't seem very likely when you're taking about a
> "disengage" bit.

It is that bit.  Despite the name given in Vol4 Table 2-12, it is "set
to disable".

Vol3 Figure 14-2 explicitly calls it the "IDA/Turbo disengage" bit and
the surrounding text makes it clear that it is disable bit, not an
enable bit.  Also, that's how the Linux intel_pstate driver uses it.

>  If it is, we'd still need to cope with it being
> unavailable: While as per the doc it exists from Merom onwards, i.e.
> just far enough back for all 64-bit capable processors to be covered,
> at least there it is attributed "Mobile only".

Honestly, the number of errors in those tables are alarming.  The MSR is
has been around longer than turbo, and I'm told that the interface has
never changed since its introduction.

Looking across Linux, there's a mess too.

acpi-cpufreq and energy_perf_policy use the MISC_ENABLE bit (which is
package wide)
intel_ips driver uses PERF_CTL (which is logical processor)
intel_pstate uses MISC_ENABLE || max_pstate == turbo_pstate

I'm certain I've seen "one pstate being special" WRT turbo before, but I
can't locate anything about this in the SDM, which possibly means it is
model specific.

~Andrew




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.