[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 06/15] cpufreq: Add Hardware P-State (HWP) driver


  • To: Jason Andryuk <jandryuk@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 11 Jul 2023 16:40:58 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=3eCMYDj8YVuWQ5O9a/ggtOvR1J3EVo9UBfTkFfcnpJw=; b=N9HiKWfl4OcnsAruIm9rzNRAC4DGJzUMZDXJyK2Xm4jzivqtxwR+IvpxxtoViBgQDrpNY1vsHzGgGA6fzfPh4nnB553KueZs+VLJqg9a0bDqRuYxSR5aa5AlZAID+2tM9WKfK0wExJ1s+gcQlnw14kDbnC1+XfmNuO1av+44/GUvnrJeBoPVG+0mFh55ytCJKofsf/hOCN7vnHPo1iBALBwVpDkhiu04x+B6rCkXevGkS/CXl0oEPiTIdPXc8O+zbPf4BC04F8x0yWe1b2RIvQpis23pXUtB1ME/R8Mwbrf5W7W2VLsgc30+pjNnEjeCITsTp40yU1mwZDb9tpw1+Q==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=CdZ8vHYtQVvmz593Ayc/1mRsem48QQmUACU71+fhNX7Tr7gvitIonSnOiXNDg5Es6LD4V10z3dZEVwXjwqfreYW3YIftQCZcT9vvjQ9v/g7jZf3dl5JkJLFVFPQAfgQakUSP8xNRE0kUIBW320bz4omx0NWb0nG453zwFUIx8GwoTD4/gI88bd79KzjVXfBHee9KhQxHqAoBt9T5kFlsB76qGuff3r/iHndsp2uLF5O3DhPwXbNY2K7Bd2dlgwiXeYJnUP0SsP4Sirc8Dp8unQZ3Lq15NS6AdLBgoUEwXpYZm5RIp+JiU9U5/zk9H+Vb2smDwAaBnGTKBE8x5/XTjw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 11 Jul 2023 14:41:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 11.07.2023 16:16, Jason Andryuk wrote:
> On Tue, Jul 11, 2023 at 4:18 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>> On 10.07.2023 17:22, Jason Andryuk wrote:
>>> On Mon, Jul 10, 2023 at 9:13 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>> On 06.07.2023 20:54, Jason Andryuk wrote:
>>>>> @@ -510,6 +510,22 @@ choice of `dom0-kernel` is deprecated and not 
>>>>> supported by all Dom0 kernels.
>>>>>  * `<maxfreq>` and `<minfreq>` are integers which represent max and min 
>>>>> processor frequencies
>>>>>    respectively.
>>>>>  * `verbose` option can be included as a string or also as 
>>>>> `verbose=<integer>`
>>>>> +  for `xen`.  It is a boolean for `hwp`.
>>>>> +* `hwp` selects Hardware-Controlled Performance States (HWP) on 
>>>>> supported Intel
>>>>> +  hardware.  HWP is a Skylake+ feature which provides better CPU power
>>>>> +  management.  The default is disabled.  If `hwp` is selected, but 
>>>>> hardware
>>>>> +  support is not available, Xen will fallback to cpufreq=xen.
>>>>> +* `<hdc>` is a boolean to enable Hardware Duty Cycling (HDC).  HDC 
>>>>> enables the
>>>>> +  processor to autonomously force physical package components into idle 
>>>>> state.
>>>>> +  The default is enabled, but the option only applies when `hwp` is 
>>>>> enabled.
>>>>> +
>>>>> +There is also support for `;`-separated fallback options:
>>>>> +`cpufreq=hwp,verbose;xen`.  This first tries `hwp` and falls back to 
>>>>> `xen`
>>>>> +if unavailable.
>>>>
>>>> In the given example, does "verbose" also apply to the fallback case? If 
>>>> so,
>>>> perhaps better "cpufreq=hwp;xen,verbose", to eliminate that ambiguity?
>>>
>>> Yes, "verbose" is applied to both.  I can make the change.  I
>>> mentioned it in the commit message, but I'll mention it here as well.
>>
>> FTAOD my earlier comment implied that the spelling form you use above
>> should not even be accepted when parsing. I.e. it was not just about
>> the doc aspect.
> 
> Oh.  So what exactly do you want then?
> 
> There is a single cpufreq_verbose variable today that is set by either
> cpufreq=hwp,verbose or cpufreq=xen,verbose.  Is that okay, or should
> the "xen" and "hwp" each get a separate variable?
> 
> Do you only want to allow a single trailing "verbose" to apply to all
> of cpufreq (cpufreq=$foo,verbose)?  Or do you want "verbose" to be
> only valid for "xen"?  Both cpufreq_cmdline_parse() and
> hwp_cmdline_parse() just loop over their options and don't care about
> order, even though the documentation lists verbose last.  Would you
> want "cpufreq=hwp,verbose,hdc" to fail to parse?
> 
> All parsing is done upfront before knowing whether "xen" or "hwp" will
> be used as the cpufreq driver, so there is a trickiness for
> implementing "verbose" only for one option.  Similarly,
> "cpufreq=hwp,invalid;xen" will try "hwp" (but not "xen")  since the
> live variables are updated.  Even without this patch, cpufreq will be
> configured up to an invalid parameter.

Right, and I'd like to see "hwp;xen" to be treated as a "unit", with
",verbose" applying to whichever succeeds initializing. I don't think
there is much point to have separate verbosity variables.

> FYI, cpufreq=xen;hwp will be accepted.  "xen" shouldn't fail, so it
> doesn't make sense to specify that.  But it didn't seem necessary to
> prevent it.

Sure, that's fine.

>>>>> +    if ( data->curr_req.raw == -1 )
>>>>> +    {
>>>>> +        hwp_err(cpu, "Could not initialize HWP properly\n");
>>>>> +        per_cpu(hwp_drv_data, cpu) = NULL;
>>>>> +        xfree(data);
>>>>> +        return -ENODEV;
>>>>> +    }
>>>>> +
>>>>> +    data->minimum = data->curr_req.min_perf;
>>>>> +    data->maximum = data->curr_req.max_perf;
>>>>> +    data->desired = data->curr_req.desired;
>>>>> +    data->energy_perf = data->curr_req.energy_perf;
>>>>> +    data->activity_window = data->curr_req.activity_window;
>>>>> +
>>>>> +    if ( cpu == 0 )
>>>>> +        hwp_verbose("CPU%u: HWP_CAPABILITIES: %016lx\n", cpu, 
>>>>> data->hwp_caps);
>>>>
>>>> While I'm fine with this (perhaps apart from you using "cpu == 0",
>>>> which is an idiom we're trying to get rid of), ...
>>>
>>> Oh, I didn't know that.  What is the preferred way to identify the
>>> BSP?
>>
>> Sometimes we pass a separate boolean to functions, in other cases we
>> check whether a struct cpuinfo_x86 * equals &boot_cpu_info. The
>> latter clearly can't be used here, and the former doesn't look to be
>> a good fit either. However, ...
>>
>>>  This doesn't necessarily run on the BSP, so "cpu"/"policy->cpu"
>>> is all we have to make a determination.
>>
>> ... isn't it, conversely, the case that the function only ever runs
>> on "cpu" when it is the BSP? In which case "cpu == smp_processor_id()"
>> ought to do the trick.
> 
> The calls do not necessarily run from the BSP.  The cpufreq init
> callbacks run later when dom0 uploads the ACPI processor data.

Oh, of course. How did I manage to forget?

>  If you
> don't want "cpu == 0", maybe just print for the first CPU regardless
> of number, and then print differences from that?

Perhaps best then.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.