[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 7 Oct 2020 18:41:17 +0200
  • Authentication-results: esa4.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • Delivery-date: Wed, 07 Oct 2020 16:41:47 +0000
  • Ironport-sdr: uo32Kg+112jpcabMTHrE6EdCxeX8qLnGWbJqKBHj6pM7d5CPsclwZS9R3bB8VcxuzHlJY7i1lj YuNgZpvbYuRHvKuv/Rl7tQSzsazFnMvfZ0i9l7iqPfoCCs+69ucpVbgxUCqnyXxFbRhZu8XcMO fI6ChVdJMyVXOb5kNMdRNHtd6rO3oI+V/qTlhLiY+OQ3hk/f7nQBQXF3qtOpIrZggyiy2t7Fv5 jyR06tiAh352BPah4BgrgUUCbhGBMS9kqIo+/+j+g/lkHq+xPCK58whYdrNfVPuwmIGGffSkJN 4k0=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Oct 07, 2020 at 01:06:08PM +0100, Andrew Cooper wrote:
> On 06/10/2020 17:23, Roger Pau Monne wrote:
> > Currently a PV hardware domain can also be given control over the CPU
> > frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
> 
> This might be how the current logic "works", but its straight up broken.
> 
> PERF_CTL is thread scope, so unless dom0 is identity pinned and has one
> vcpu for every pcpu, it cannot use the interface correctly.

Selecting cpufreq=dom0-kernel will force vCPU pinning. I'm not able
however to see anywhere that would force dom0 vCPUs == pCPUs.

> > However since commit 322ec7c89f6 the default behavior has been changed
> > to reject accesses to not explicitly handled MSRs, preventing PV
> > guests that manage CPU frequency from reading
> > MSR_IA32_PERF_{STATUS/CTL}.
> >
> > Additionally some HVM guests (Windows at least) will attempt to read
> > MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
> >
> > vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
> > d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
> >
> > Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
> > handling shared between HVM and PV guests, and add an explicit case
> > for reads to MSR_IA32_PERF_{STATUS/CTL}.
> 
> OTOH, PERF_CTL does have a seemingly architectural "please disable turbo
> for me" bit, which is supposed to be for calibration loops.  I wonder if
> anyone uses this, and whether we ought to honour it (probably not).

If we let guests play with this we would have to save/restore the
guest value on context switch. Unless there's a strong case for this,
I would say no.

> > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> > index d8ed83f869..41baa3b7a1 100644
> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -1069,6 +1069,12 @@ extern enum cpufreq_controller {
> >      FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
> >  } cpufreq_controller;
> >  
> > +static inline bool is_cpufreq_controller(const struct domain *d)
> > +{
> > +    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
> > +            is_hardware_domain(d));
> 
> This won't compile on !CONFIG_X86, due to CONFIG_HAS_CPUFREQ

It does seem to build on Arm, because this is only used in x86 code:

https://gitlab.com/xen-project/people/royger/xen/-/jobs/778207412

The extern declaration of cpufreq_controller is just above, so if you
tried to use is_cpufreq_controller on Arm you would get a link time
error, otherwise it builds fine. The compiler removes the function on
Arm as it has the inline attribute and it's not used.

Alternatively I could look into moving cpufreq_controller (and
is_cpufreq_controller) out of sched.h into somewhere else, I haven't
looked at why it needs to live there.

> Honestly - I don't see any point to this code.  Its opt-in via the
> command line only, and doesn't provide adequate checks for enablement. 
> (It's not as if we're lacking complexity or moving parts when it comes
> to power/frequency management).

Right, I could do a pre-patch to remove this, but I also don't think
we should block this fix on removing FREQCTL_dom0_kernel, so I would
rather fix the regression and then remove the feature if we agree it
can be removed.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.