[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 9 Nov 2020 19:31:02 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Tkv/RO8/ivEQBrckg2t8lPlq1vIMiZWaZKDqTuP15TU=; b=TVX5irkqcuB4kZvSf1M7OQoRo+y8l1JsVg33sDJDv1TfuBZwtWfchEuiEWXv0xH5+V4u3vQtLSJy2WHxaF0d+FUfYDYdxKtKexr+NugPNoy20OiyAGZJuUOQAnLIRWXRLLH+6QcfvL6/R1R9LX+tnl9uxmtXz+/tz2IUTNxBVXGGOy6wp2Qd/y+RA3ChFWV7l/Uav5qXQzJbruh2Mg+5PHPTkVAzjJYDUct36lVVrSDkRTn1snv3mri+LmssIVHQpl+qWQnoHScvd5POQsovqr6/hDeKq2PJjjw86OcHrAN7eJO5JbjjiH0S4aIbS6vKgJN6sFub2STJ/4UI7vQhXw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=MQIrBKCeXhL3o7m9xS8VctTS9WTW1KF5OcWs478xxrQEZx8AwnGKZ7r/D6zp4HUA5Eeoua/Uee4XqATTtoC6JggFiJOjXh33xVuUhWgdgE0VrWDHQLOKYr0Jef0bnHHb1LYQzanZriza4tUWI46/pyXDm/3Sof5kPET4uG33JV5ohF0EBu56c4u4ymh+tiGU7LWKjrKCnG1XdmtvXpNS7WodL79/z8VjPfc0fESnGFGlvI9wAKs2DZcCI9929DfvrQ/6EKCVqfWOf5E02FMP8FmkZP05g5eFTk525vQVORWANS58vH5nNMpJ3R+dSMV/IJoHM8qPe/k7We7uj+cN2Q==
  • Authentication-results: esa2.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Mon, 09 Nov 2020 18:31:24 +0000
  • Ironport-sdr: s0+yVDksQT/XRSH0K1kEyjAZEbPfgL9se/j8rA87gtavcL9sdDEwlHPPjNSHnbZf9YfwFF4dtX s3ZrlM8ho0dsyxGpRxqMfiFa9frK/Vmcw2F7fcfISRXednhZ+vfppCEqG9Cn5cnk/6PiFeVznz EDonx88a3saOk8WfDRfFaUMAlADSL1FZlx+T2q5J2fqodhtroZjlKWIJhGGNyh8ylByx6+68/s oP5Ex4KM6fkvve1nYJbUFDMK62WALPI0cz/i+cFbiSIuCB6FLVnStk7MWyqUvRSJSQP2FG4d8J H6g=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Nov 09, 2020 at 05:38:19PM +0000, Andrew Cooper wrote:
> From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> Currently a PV hardware domain can also be given control over the CPU
> frequency, and such guest is allowed to write to MSR_IA32_PERF_CTL.
> However since commit 322ec7c89f6 the default behavior has been changed
> to reject accesses to not explicitly handled MSRs, preventing PV
> guests that manage CPU frequency from reading
> MSR_IA32_PERF_{STATUS/CTL}.
> 
> Additionally some HVM guests (Windows at least) will attempt to read
> MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
> 
>   vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
>   d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
> 
> Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
> handling shared between HVM and PV guests, and add an explicit case
> for reads to MSR_IA32_PERF_{STATUS/CTL}.
> 
> Restore previous behavior and allow PV guests with the required
> permissions to read the contents of the mentioned MSRs. Non privileged
> guests will get 0 when trying to read those registers, as writes to
> MSR_IA32_PERF_CTL by such guest will already be silently dropped.
> 
> Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
> Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

> ---
> CC: Jan Beulich <JBeulich@xxxxxxxx>
> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> CC: Wei Liu <wl@xxxxxxx>
> 
> v2:
>  * fix is_cpufreq_controller() to exclude PVH dom0, and collapse to nothing in
>    !CONFIG_PV builds
>  * Drop the cross-vendor checks.  It isn't possible to configure dom0 as
>    cross-vendor, and anyone using is_cpufreq_controller() without an exact
>    model match has far bigger problems.

I was on the verge of doing this in v1, but wasn't really sure whether
there was any use case to change the vendor for dom0 cpuid.

>  * At least Centaur implements these MSRs.  Add access.
> ---
>  xen/arch/x86/msr.c             | 34 ++++++++++++++++++++++++++++++++++
>  xen/arch/x86/pv/emul-priv-op.c | 14 --------------
>  xen/include/xen/sched.h        | 17 +++++++++++++++++
>  3 files changed, 51 insertions(+), 14 deletions(-)
> 
> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> index 9c69ef8792..0a8ae4d22c 100644
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -242,6 +242,25 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t 
> *val)
>              goto gp_fault;
>          break;
>  
> +        /*
> +         * These MSRs are not enumerated in CPUID.  They have been around
> +         * since the Pentium 4, and implemented by other vendors.
> +         *
> +         * Some versions of Windows try reading these before setting up a #GP
> +         * handler, and Linux has several unguarded reads as well.  Provide
> +         * RAZ semantics, in general, but permit a cpufreq controller dom0 to
> +         * have full access.
> +         */
> +    case MSR_IA32_PERF_STATUS:
> +    case MSR_IA32_PERF_CTL:
> +        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
> +            goto gp_fault;
> +
> +        *val = 0;
> +        if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 
> )
> +            break;
> +        goto gp_fault;
> +
>      case MSR_IA32_THERM_STATUS:
>          if ( cp->x86_vendor != X86_VENDOR_INTEL )
>              goto gp_fault;
> @@ -448,6 +467,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t 
> val)
>              goto gp_fault;
>          break;
>  
> +        /*
> +         * This MSR are not enumerated in CPUID.  It has been around since 
> the
> +         * Pentium 4, and implemented by other vendors.
> +         *
> +         * To match the RAZ semantics, implement as write-discard, except for
> +         * a cpufreq controller dom0 which has full access.
> +         */
> +    case MSR_IA32_PERF_CTL:
> +        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
> +            goto gp_fault;
> +
> +        if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
> +            break;
> +        goto gp_fault;
> +
>      case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
>          if ( !is_hvm_domain(d) || v != curr )
>              goto gp_fault;
> diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
> index 7cc16d6eda..dbceed8a05 100644
> --- a/xen/arch/x86/pv/emul-priv-op.c
> +++ b/xen/arch/x86/pv/emul-priv-op.c
> @@ -849,12 +849,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
>      return val;
>  }
>  
> -static inline bool is_cpufreq_controller(const struct domain *d)
> -{
> -    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
> -            is_hardware_domain(d));
> -}
> -
>  static uint64_t guest_efer(const struct domain *d)
>  {
>      uint64_t val;
> @@ -1121,14 +1115,6 @@ static int write_msr(unsigned int reg, uint64_t val,
>              return X86EMUL_OKAY;
>          break;
>  
> -    case MSR_IA32_PERF_CTL:
> -        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
> -            break;
> -        if ( likely(!is_cpufreq_controller(currd)) ||
> -             wrmsr_safe(reg, val) == 0 )
> -            return X86EMUL_OKAY;
> -        break;
> -
>      case MSR_IA32_THERM_CONTROL:
>      case MSR_IA32_ENERGY_PERF_BIAS:
>          if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index d8ed83f869..b4d3e53310 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -1069,6 +1069,23 @@ extern enum cpufreq_controller {
>      FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
>  } cpufreq_controller;
>  
> +static always_inline bool is_cpufreq_controller(const struct domain *d)
> +{
> +    /*
> +     * A PV dom0 can be nominated as the cpufreq controller, instead of using
> +     * Xen's cpufreq driver, at which point dom0 gets direct access to 
> certain
> +     * MSRs.
> +     *
> +     * This interface only works when dom0 is identity pinned and has the 
> same
> +     * number of vCPUs as pCPUs on the system.
> +     *
> +     * It would be far better to paravirtualise the interface.
> +     */

Would it be useful to add an assert here that the domain cpuid vendor
and the BSP cpuid vendor match?

Is it even possible to fake a different cpuid vendor for dom0?

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.