
Re: [PATCH v3 07/14 RESEND] cpufreq: Export HWP parameters to userspace



On Thu, May 11, 2023 at 10:10 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 11.05.2023 15:49, Jason Andryuk wrote:
> > On Thu, May 11, 2023 at 2:21 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >>
> >> On 10.05.2023 19:49, Jason Andryuk wrote:
> >>> On Mon, May 8, 2023 at 6:26 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >>>>
> >>>> On 01.05.2023 21:30, Jason Andryuk wrote:
> >>>>> Extend xen_get_cpufreq_para to return hwp parameters.  These match the
> >>>>> hardware rather closely.
> >>>>>
> >>>>> We need the features bitmask to indicate the fields supported by the
> >>>>> actual hardware.
> >>>>>
> >>>>> The use of uint8_t parameters matches the hardware size.  uint32_t
> >>>>> entries grow the sysctl_t past the build assertion in setup.c.  The
> >>>>> uint8_t ranges are supported across multiple generations, so hopefully
> >>>>> they won't change.
> >>>>
> >>>> Still it feels a little odd for values to be this narrow. Aiui the
> >>>> scaling_governor[] and scaling_{max,min}_freq fields aren't (really)
> >>>> used by HWP. So you could widen the union in struct
> >>>> xen_get_cpufreq_para (in a binary but not necessarily source compatible
> >>>> manner), gaining you 6 more uint32_t slots. Possibly the somewhat oddly
> >>>> placed scaling_cur_freq could be included as well ...
> >>>
> >>> The values are narrow, but they match the hardware.  It works for HWP,
> >>> so there is no need to change at this time AFAICT.
> >>>
> >>> Do you want me to make this change?
> >>
> >> Well, much depends on what these 8-bit values actually express (I did
> >> raise this question in one of the replies to your patches, as I wasn't
> >> able to find anything in the SDM). That'll then hopefully allow to
> >> make some educated prediction on how likely it is that a future
> >> variant of hwp would want to widen them.
> >
> > Sorry for not providing a reference earlier.  In the SDM,
> > HARDWARE-CONTROLLED PERFORMANCE STATES (HWP) section, there is this
> > second paragraph:
> > """
> > In contrast, HWP is an implementation of the ACPI-defined
> > Collaborative Processor Performance Control (CPPC), which specifies
> > that the platform enumerates a continuous, abstract unit-less,
> > performance value scale that is not tied to a specific performance
> > state / frequency by definition. While the enumerated scale is roughly
> > linear in terms of a delivered integer workload performance result,
> > the OS is required to characterize the performance value range to
> > comprehend the delivered performance for an applied workload.
> > """
> >
> > The numbers are "continuous, abstract unit-less, performance value."
> > So there isn't much to go on there, but generally, smaller numbers
> > mean slower and bigger numbers mean faster.
> >
> > Cross referencing the ACPI spec here:
> > https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#collaborative-processor-performance-control
> >
> > Scrolling down you can find the register entries such as
> >
> > Highest Performance
> > Register or DWORD Attribute:  Read
> > Size:                         8-32 bits
> >
> > AMD has its own pstate implementation that is similar to HWP.  Looking
> > at the Linux support, the AMD hardware also uses 8-bit values for the
> > comparable fields:
> > https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/msr-index.h#L612
> >
> > So Intel and AMD are 8-bit for now at least.  Something could use 32
> > bits according to the ACPI spec.
> >
> > 8 bits of granularity for slow to fast seems like plenty to me.  I'm
> > not sure what one would gain from 16 or 32 bits, but I'm not designing
> > the hardware.  From the earlier xenpm output, "highest" was 49, so
> > still a decent amount of room in an 8 bit range.
>
> Hmm, thanks for the pointers. I'm still somewhat undecided. I guess I'm
> okay with you keeping things as you have them. If and when needed we can
> still rework the structure - it is possible to change it as it's (for
> the time being at least) still an unstable interface.

With an anonymous union and anonymous struct, struct
xen_get_cpufreq_para can be re-arranged and still compile without any
changes to other cpufreq code.  struct xen_hwp_para becomes 10
uint32_t's.  The old scaling arm is 3 * uint32_t + 16 bytes of
CPUFREQ_NAME_LEN + 4 * uint32_t for xen_ondemand = 11 uint32_t.  So
int32_t turbo_enabled doesn't move, and the layout stays binary
compatible.

Anonymous unions and structs aren't allowed in the public header
though, right?  So that would need to change, though it doesn't seem
too bad.  There isn't too much churn.

I have no plans to tackle AMD pstate.  But having glanced at it this
morning, maybe these hwp sysctls should be renamed cppc?  AMD pstate
and HWP are both implementations of CPPC, so that could be more
future-proof?  But, again, I only glanced at the AMD stuff, so there may
other changes needed.

Regards,
Jason



 

