[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86: retrieve and log CPU frequency information



On Fri, May 15, 2020 at 11:08:04AM +0200, Jan Beulich wrote:
> On 15.05.2020 10:32, Roger Pau Monné wrote:
> > On Wed, Apr 15, 2020 at 01:55:24PM +0200, Jan Beulich wrote:
> >> While from just a single Skylake system it is already clear that we
> >> can't base any of our logic on CPUID leaf 15 [1] (leaf 16 is
> >> documented to be used for display purposes only anyway), logging this
> >> information may still give us some reference in case of problems as well
> >> as for future work. Additionally on the AMD side it is unclear whether
> >> the deviation between reported and measured frequencies is because of us
> >> not doing well, or because of nominal and actual frequencies being quite
> >> far apart.
> >>
> >> The chosen variable naming in amd_log_freq() has pointed out a naming
> >> problem in rdmsr_safe(), which is being taken care of at the same time.
> >> Symmetrically wrmsr_safe(), being an inline function, also gets an
> >> unnecessary underscore dropped from one of its local variables.
> >>
> >> [1] With a core crystal clock of 24MHz and a ratio of 216/2, the
> >>     reported frequency nevertheless is 2600MHz, rather than the to be
> >>     expected (and calibrated by both us and Linux) 2592MHz.
> >>
> >> Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> > 
> > Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> Thanks, but please clarify whether this is with or without the
> two suggested changes (breaking out intel_log_freq() and introducing
> local variables for (uint8_t)(msrval >> NN)), or whether you
> mean to leave it to me whether to make them. And if I'm to make the
> change, whether you'd trust me to not screw up things, i.e. whether
> I can keep you R-b in that case.

None of those are mandatory in order to keep the RB, just suggestions.

> >> +#undef PCI_ECS_ADDRESS
> >> +          wrmsrl(MSR_AMD64_NB_CFG, nbcfg);
> >> +  }
> >> +
> >> +  lo = 0; /* gcc may not recognize the loop having at least 5 iterations 
> >> */
> >> +  for (h = c->x86 == 0x10 ? 5 : 8; h--; )
> >> +          if (!rdmsr_safe(0xC0010064 + h, lo) && (lo >> 63))
> >> +                  break;
> >> +  if (!(lo >> 63))
> >> +          return;
> > 
> > Should you also take the P-state limit here into account (from MSR
> > 0xC0010061)?
> > 
> > I assume the firmware could set a minimum P-state higher than the
> > possible ones present in the list of P-states, effectively preventing
> > switching to lowest-performance P-states?
> 
> We're not after permitted P-states here - these would matter only if
> we were meaning to alter the current P-state by direct MSR accesses.
> Here we're only after logging capabilities (and the P-state limits
> can, aiui, in principle also change at runtime).

OK, I have to admit I'm not aware of how this is supposed to work.

One scenario I had in mind was that the firmware would set P-state
control to a value higher than the minimum one in order to prevent the
OS from switching to a lower P-state, in which case reporting the
frequency of the minimum P-state would be kind of incorrect since
switching to it won't be possible anyway. But if this is supposed to
change at runtime then reporting the lowest possible makes sense.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.