[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH v5 10/18] xen/cpufreq: introduce a new amd cppc driver for cpufreq scaling
[Public] > -----Original Message----- > From: Jan Beulich <jbeulich@xxxxxxxx> > Sent: Wednesday, July 2, 2025 6:48 PM > To: Penny, Zheng <penny.zheng@xxxxxxx> > Cc: Huang, Ray <Ray.Huang@xxxxxxx>; Andrew Cooper > <andrew.cooper3@xxxxxxxxxx>; Roger Pau Monné <roger.pau@xxxxxxxxxx>; xen- > devel@xxxxxxxxxxxxxxxxxxxx > Subject: Re: [PATCH v5 10/18] xen/cpufreq: introduce a new amd cppc driver for > cpufreq scaling > > On 02.07.2025 11:49, Penny, Zheng wrote: > >> -----Original Message----- > >> From: Jan Beulich <jbeulich@xxxxxxxx> > >> Sent: Tuesday, June 17, 2025 12:00 AM > >> To: Penny, Zheng <penny.zheng@xxxxxxx> > >> > >> On 27.05.2025 10:48, Penny Zheng wrote: > >>> --- a/xen/arch/x86/acpi/cpufreq/amd-cppc.c > >>> +++ b/xen/arch/x86/acpi/cpufreq/amd-cppc.c > >>> + /* > >>> + * We don't need to convert to kHz for computing offset and can > >>> + * directly use nominal_mhz and lowest_mhz as the division > >>> + * will remove the frequency unit. > >>> + */ > >>> + offset = data->caps.nominal_perf - > >>> + (mul * cppc_data->cpc.nominal_mhz) / div; > >>> + } > >>> + else > >>> + { > >>> + /* Read Processor Max Speed(MHz) as anchor point */ > >>> + mul = data->caps.highest_perf; > >>> + div = this_cpu(pxfreq_mhz); > >>> + if ( !div ) > >>> + return -EINVAL; > >> > >> What's wrong about the function arguments in this case? (Same > >> question again on further uses of EINVAL below.) > > > > If we could not get processor max frequency, the whole function is > > useless... > > Maybe -EOPNOTSUPP is more suitable than -EINVAL; > > I don't like EOPNOTSUPP very much either for the purpose, but it's surely > better > than EINVAL. > > >>> +static int cf_check amd_cppc_cpufreq_target(struct cpufreq_policy > >>> *policy, > >>> + unsigned int target_freq, > >>> + unsigned int relation) { > >>> + unsigned int cpu = policy->cpu; > >>> + const struct amd_cppc_drv_data *data = per_cpu(amd_cppc_drv_data, > cpu); > >>> + uint8_t des_perf; > >>> + int res; > >>> + > >>> + if ( unlikely(!target_freq) ) > >>> + return 0; > >>> + > >>> + res = amd_cppc_khz_to_perf(data, target_freq, &des_perf); > >>> + if ( res ) > >>> + return res; > >>> + > >>> + /* > >>> + * Setting with "lowest_nonlinear_perf" to ensure governoring > >>> + * performance in P-state range. > >>> + */ > >>> + amd_cppc_write_request(policy->cpu, data->caps.lowest_nonlinear_perf, > >>> + des_perf, data->caps.highest_perf); > >> > >> I fear I don't understand the comment, and hence it remains unclear > >> to me why lowest_nonlinear_perf is being used here. > > > > How about > > ``` > > Choose lowest nonlinear performance as the minimum performance level at > > which > the platform may run. > > Lowest nonlinear performance is the lowest performance level at which > > nonlinear power savings are achieved, Above this threshold, lower > > performance > levels should be generally more energy efficient than higher performance > levels. > > ``` > > I finally had to go to the ACPI spec to understand what this is about. There > looks to > be an implication that lowest <= lowest_nonlinear, and states in that range > would > correspond more to T-states than to P-states. With that I think I agree with > the use Yes, It doesn't have definitive conclusion about relation between lowest and lowest_nonlinear In our internal FW designed spec, it always shows lowest_nonlinear corresponds to P2 > of lowest_nonlinear here. The comment, however, could do with moving farther > away from merely quoting the pretty abstract text in the ACPI spec, as such > quoting doesn't help in clarifying terminology used, when that terminology > also isn't > explained anywhere else in the code base. How about we add detailed explanations about each terminology in the beginning declaration , see: ``` +/* + * Field highest_perf, nominal_perf, lowest_nonlinear_perf, and lowest_perf + * contain the values read from CPPC capability MSR. + * Field highest_perf represents highest performance, which is the absolute + * maximum performance an individual processor may reach, assuming ideal + * conditions + * Field nominal_perf represents maximum sustained performance level of the + * processor, assuming ideal operating conditions. + * Field lowest_nonlinear_perf represents Lowest Nonlinear Performance, which + * is the lowest performance level at which nonlinear power savings are + * achieved. Above this threshold, lower performance levels should be + * generally more energy efficient than higher performance levels. + * Field lowest_perf represents the absolute lowest performance level of the + * platform. + * + * Field max_perf, min_perf, des_perf store the values for CPPC request MSR. + * Field max_perf conveys the maximum performance level at which the platform + * may run. And it may be set to any performance value in the range + * [lowest_perf, highest_perf], inclusive. + * Field min_perf conveys the minimum performance level at which the platform + * may run. And it may be set to any performance value in the range + * [lowest_perf, highest_perf], inclusive but must be less than or equal to + * max_perf. + * Field des_perf conveys performance level Xen is requesting. And it may be + * set to any performance value in the range [min_perf, max_perf], inclusive. + */ +struct amd_cppc_drv_data +{ + const struct xen_processor_cppc *cppc_data; + union { + uint64_t raw; + struct { + unsigned int lowest_perf:8; + unsigned int lowest_nonlinear_perf:8; + unsigned int nominal_perf:8; + unsigned int highest_perf:8; + unsigned int :32; + }; + } caps; + union { + uint64_t raw; + struct { + unsigned int max_perf:8; + unsigned int min_perf:8; + unsigned int des_perf:8; + unsigned int epp:8; + unsigned int :32; + }; + } req; + + int err; +}; `` Then here, we could elaborate the reason why we choose lowest_nonlinear_perf over lowest_perf: ``` + /* + * Having a performance level lower than the lowest nonlinear + * performance level, such as, lowest_perf <= perf <= lowest_nonliner_perf, + * may actually cause an efficiency penalty, So when deciding the min_perf + * value, we prefer lowest nonlinear performance over lowest performance + */ + amd_cppc_write_request(policy->cpu, data->caps.lowest_nonlinear_perf, + des_perf, data->caps.highest_perf); ``` > > Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |