[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v5 06/15] cpufreq: Add Hardware P-State (HWP) driver
On Tue, Jul 11, 2023 at 4:18 AM Jan Beulich <jbeulich@xxxxxxxx> wrote: > > On 10.07.2023 17:22, Jason Andryuk wrote: > > On Mon, Jul 10, 2023 at 9:13 AM Jan Beulich <jbeulich@xxxxxxxx> wrote: > >> On 06.07.2023 20:54, Jason Andryuk wrote: > >>> @@ -510,6 +510,22 @@ choice of `dom0-kernel` is deprecated and not > >>> supported by all Dom0 kernels. > >>> * `<maxfreq>` and `<minfreq>` are integers which represent max and min > >>> processor frequencies > >>> respectively. > >>> * `verbose` option can be included as a string or also as > >>> `verbose=<integer>` > >>> + for `xen`. It is a boolean for `hwp`. > >>> +* `hwp` selects Hardware-Controlled Performance States (HWP) on > >>> supported Intel > >>> + hardware. HWP is a Skylake+ feature which provides better CPU power > >>> + management. The default is disabled. If `hwp` is selected, but > >>> hardware > >>> + support is not available, Xen will fallback to cpufreq=xen. > >>> +* `<hdc>` is a boolean to enable Hardware Duty Cycling (HDC). HDC > >>> enables the > >>> + processor to autonomously force physical package components into idle > >>> state. > >>> + The default is enabled, but the option only applies when `hwp` is > >>> enabled. > >>> + > >>> +There is also support for `;`-separated fallback options: > >>> +`cpufreq=hwp,verbose;xen`. This first tries `hwp` and falls back to > >>> `xen` > >>> +if unavailable. > >> > >> In the given example, does "verbose" also apply to the fallback case? If > >> so, > >> perhaps better "cpufreq=hwp;xen,verbose", to eliminate that ambiguity? > > > > Yes, "verbose" is applied to both. I can make the change. I > > mentioned it in the commit message, but I'll mention it here as well. > > FTAOD my earlier comment implied that the spelling form you use above > should not even be accepted when parsing. I.e. it was not just about > the doc aspect. Oh. So what exactly do you want then? There is a single cpufreq_verbose variable today that is set by either cpufreq=hwp,verbose or cpufreq=xen,verbose. Is that okay, or should the "xen" and "hwp" each get a separate variable? Do you only want to allow a single trailing "verbose" to apply to all of cpufreq (cpufreq=$foo,verbose)? Or do you want "verbose" to be only valid for "xen"? Both cpufreq_cmdline_parse() and hwp_cmdline_parse() just loop over their options and don't care about order, even though the documentation lists verbose last. Would you want "cpufreq=hwp,verbose,hdc" to fail to parse? All parsing is done upfront before knowing whether "xen" or "hwp" will be used as the cpufreq driver, so there is a trickiness for implementing "verbose" only for one option. Similarly, "cpufreq=hwp,invalid;xen" will try "hwp" (but not "xen") since the live variables are updated. Even without this patch, cpufreq will be configured up to an invalid parameter. FYI, cpufreq=xen;hwp will be accepted. "xen" shouldn't fail, so it doesn't make sense to specify that. But it didn't seem necessary to prevent it. > >>> + { > >>> + switch ( cpufreq_xen_opts[i] ) > >>> + { > >>> + case CPUFREQ_xen: > >>> + ret = cpufreq_register_driver(&acpi_cpufreq_driver); > >>> + break; > >>> + case CPUFREQ_hwp: > >>> + ret = hwp_register_driver(); > >>> + break; > >>> + } > >>> + > >>> + if ( ret == 0 ) > >>> + break; > >>> + } > >>> break; > >> > >> In this model any kind of failure results in the fallback to be tried > >> (and the fallback's error to be returned to the caller rather than > >> the primary one). This may or may not be what we actually want; > >> personally I would have expected > >> > >> if ( ret != -ENODEV ) > >> break; > >> > >> or some such instead. > > > > I guess this comes back to our fruit preferences. :) > > Does it? It's not just a style question here, but one of when / whether > to use the fallback. Indeed. I was trying to allude back to the earlier conversation. Do we try the pears only when there are no apples or also when the apples are "bad"? Only falling back for no apples is fine. > > I can switch it around like that, and make hwp_register_driver() > > return -ENODEV for hwp_available() returning false. > > Thanks. > > >>> +int __init hwp_cmdline_parse(const char *s, const char *e) > >>> +{ > >>> + do > >>> + { > >>> + const char *end = strpbrk(s, ",;"); > >>> + > >>> + if ( s && !hwp_handle_option(s, end) ) > >> > >> This check of s not being NULL comes too late, as strpbrk() would have > >> de-referenced it already. Considering ... > >> > >>> + { > >>> + printk(XENLOG_WARNING "cpufreq/hwp: option '%s' not > >>> recognized\n", > >>> + s); > >>> + > >>> + return -1; I should change this to -EINVAL. > >>> + } > >>> + > >>> + s = end ? ++end : end; > >>> + } while ( s && s < e ); > >> > >> ... this it probably wants to move even ahead of the loop. > > > > I'll switch from do/while to just while and then the NULL check will > > be covered. In practice, this function is never called with s == > > NULL. > > In which case - why not leave things largely as they are, simply dropping > the odd check of s? Sure. > >>> + if ( data->curr_req.raw == -1 ) > >>> + { > >>> + hwp_err(cpu, "Could not initialize HWP properly\n"); > >>> + per_cpu(hwp_drv_data, cpu) = NULL; > >>> + xfree(data); > >>> + return -ENODEV; > >>> + } > >>> + > >>> + data->minimum = data->curr_req.min_perf; > >>> + data->maximum = data->curr_req.max_perf; > >>> + data->desired = data->curr_req.desired; > >>> + data->energy_perf = data->curr_req.energy_perf; > >>> + data->activity_window = data->curr_req.activity_window; > >>> + > >>> + if ( cpu == 0 ) > >>> + hwp_verbose("CPU%u: HWP_CAPABILITIES: %016lx\n", cpu, > >>> data->hwp_caps); > >> > >> While I'm fine with this (perhaps apart from you using "cpu == 0", > >> which is an idiom we're trying to get rid of), ... > > > > Oh, I didn't know that. What is the preferred way to identify the > > BSP? > > Sometimes we pass a separate boolean to functions, in other cases we > check whether a struct cpuinfo_x86 * equals &boot_cpu_info. The > latter clearly can't be used here, and the former doesn't look to be > a good fit either. However, ... > > > This doesn't necessarily run on the BSP, so "cpu"/"policy->cpu" > > is all we have to make a determination. > > ... isn't it, conversely, the case that the function only ever runs > on "cpu" when it is the BSP? In which case "cpu == smp_processor_id()" > ought to do the trick. The calls do not necessarily run from the BSP. The cpufreq init callbacks run later when dom0 uploads the ACPI processor data. If you don't want "cpu == 0", maybe just print for the first CPU regardless of number, and then print differences from that? Regards, Jason
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |