[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/3] x86/tsx: Deprecate vpmu=rtm-abort and use tsx=<bool> instead


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 27 May 2021 19:24:06 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=8TSa8Bj66WshHEcSwn6Xn7Gvf4WKJGEibxkzR9PXQjg=; b=AnWaBL9SuBcI2VMM4wZfvYcVXErH76xbKQIdhFdDbG0874hfACbfM8jmVY1CXjtzppVDbjgzBUm6YPVkpI1BfPNxO6mEcNeH+Og0yuCBFx6+C/+LIFq7pMtc4qahi0oQdZx92YRw5bozm+9jM29NWhuCh5E0rlJ60UHxZlMitCCbRvsk1zRtjIenYLPD3MHY+Ryu5TmIOVVE5tialvX4yyEspga1+Yl0zOJGe7wYHSHj9uGcHHxfJ3WOsRu3YeTbtvnqS7hga2eHi3qM0GVS+soHCUwbBBCA9q3GvgzZKaVAzgAmYISEXJ+OFvZldtxNCiR+0M8gXgvOjsC+jCckyg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=EGHQPzIMkeUwX12ZCmgIP23T65pDkQzxz3MaibYtlEEw2/ZRdU16fSVxlbGtteTgB3a3cvRndUoYNlmSIPa/S6VmvZay5vO4JUHFGoV7Lt3Hmjs5M9ouYEX6BZoSfabuxSowcGyi6N7tpXICzym5a0y7+f3HS+o7Skyqt962TSsT4zmkaveP/XJ0FilKGbDa3IR/CHZm8Rw3KVkGqvvLeps9kfLsGyIknvfdlrUZXUSlgwcWkPOsQFzo/n3V/Y5tZTSTcqMSGkMgeLHC2J45UdoNiyr/JrF0LiTzwSL162VgI8b46P34d6NWOHeRIjpZj/IB5ueKlzB1SbUuocHRgg==
  • Authentication-results: esa1.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Thu, 27 May 2021 17:24:26 +0000
  • Ironport-hdrordr: A9a23:WfvyuK1fe717L2cq01cvpQqjBEIkLtp133Aq2lEZdPU0SKGlfq GV7ZMmPHrP4gr5N0tOpTntAse9qDbnhP1ICOoqTNOftWvd2FdARbsKheffKn/bak/DH4Zmvp uIGJIObeEYY2IasS77ijPIb+rJwrO8gd+VbTG19QYSceloAZsQnjuQEmygYytLrJEtP+tCKH KbjPA33gaISDAsQemQIGIKZOTHr82jruOaXfZXbyRXkDVnlFmTmcXHLyQ=
  • Ironport-sdr: qH/sxLvl1LkqLzZvQBAmyDzjpakHpLvCPVDj9CggC7WD1Wi3etZDPmZJUy25TeZoMk17EZaIXI T6HygTj5bGJ1xeI6LW64IigMkYXMxYQ0p/lFXUPW7yeSbEkfWIYVzL2JxVnXsR8YLvy+dk6y8G 50ZMCqNqAtNhFA5joyQY/vWuRhqTp4NuXmu1f7lzWfnKZ3XfrECjBl7npgExsxTbZ1dEa55aur w7Fl0NSnseoq0/JaipUAx71DJKr2PqdKAuDc31J22VoUQBwizsBH3WoBfKydTLGiafGOmsLnfc ji4=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, May 27, 2021 at 02:25:19PM +0100, Andrew Cooper wrote:
> This reuses the rtm_disable infrastructure, so CPUID derivation works properly
> when TSX is disabled in favour of working PCR3.
> 
> vpmu= is not a supported feature, and having this functionality under tsx=
> centralises all TSX handling.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

> ---
> CC: Jan Beulich <JBeulich@xxxxxxxx>
> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> CC: Wei Liu <wl@xxxxxxx>
> ---
>  docs/misc/xen-command-line.pandoc | 40 +++++++++++++++---------------
>  xen/arch/x86/cpu/intel.c          |  3 ---
>  xen/arch/x86/cpu/vpmu.c           |  4 +--
>  xen/arch/x86/tsx.c                | 51 
> +++++++++++++++++++++++++++++++++++++--
>  xen/include/asm-x86/vpmu.h        |  1 -
>  5 files changed, 70 insertions(+), 29 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.pandoc 
> b/docs/misc/xen-command-line.pandoc
> index c32a397a12..a6facc33ea 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -2296,14 +2296,21 @@ pages) must also be specified via the tbuf_size 
> parameter.
>  
>  Controls for the use of Transactional Synchronization eXtensions.
>  
> -On Intel parts released in Q3 2019 (with updated microcode), and future 
> parts,
> -a control has been introduced which allows TSX to be turned off.
> +Several microcode updates are relevant:
>  
> -On systems with the ability to turn TSX off, this boolean offers system wide
> -control of whether TSX is enabled or disabled.
> + * March 2019, fixing the TSX memory ordering errata on all TSX-enabled CPUs
> +   to date.  Introduced MSR_TSX_FORCE_ABORT on SKL/SKX/KBL/WHL/CFL parts.  
> The
> +   errata workaround uses Performance Counter 3, so the user can select
> +   between working TSX and working perfcounters.
>  
> -On parts vulnerable to CVE-2019-11135 / TSX Asynchronous Abort, the following
> -logic applies:
> + * November 2019, fixing the TSX Async Abort speculative vulnerability.
> +   Introduced MSR_TSX_CTRL on all TSX-enabled MDS_NO parts to date,
> +   CLX/WHL-R/CFL-R, with the controls becoming architectural moving forward
> +   and formally retiring HLE from the architecture.  The user can disable TSX
> +   to mitigate TAA, and elect to hide the HLE/RTM CPUID bits.
> +
> +On systems with the ability to disable TSX off, this boolean offers system
> +wide control of whether TSX is enabled or disabled.
>  
>   * An explicit `tsx=` choice is honoured, even if it is `true` and would
>     result in a vulnerable system.
> @@ -2311,10 +2318,14 @@ logic applies:
>   * When no explicit `tsx=` choice is given, parts vulnerable to TAA will be
>     mitigated by disabling TSX, as this is the lowest overhead option.
>  
> - * If the use of TSX is important, the more expensive TAA mitigations can be
> +   If the use of TSX is important, the more expensive TAA mitigations can be
>     opted in to with `smt=0 spec-ctrl=md-clear`, at which point TSX will 
> remain
>     active by default.
>  
> + * When no explicit `tsx=` option is given, parts susceptible to the memory
> +   ordering errata default to `true` to enable working TSX.  Alternatively,
> +   selecting `tsx=0` will disable TSX and restore PCR3 to a working state.
> +
>  ### ucode
>  > `= List of [ <integer> | scan=<bool>, nmi=<bool>, allow-same=<bool> ]`
>  
> @@ -2456,20 +2467,7 @@ provide access to a wealth of low level processor 
> information.
>  
>  *   The `arch` option allows access to the pre-defined architectural events.
>  
> -*   The `rtm-abort` boolean controls a trade-off between working Restricted
> -    Transactional Memory, and working performance counters.
> -
> -    All processors released to date (Q1 2019) supporting Transactional Memory
> -    Extensions suffer an erratum which has been addressed in microcode.
> -
> -    Processors based on the Skylake microarchitecture with up-to-date
> -    microcode internally use performance counter 3 to work around the 
> erratum.
> -    A consequence is that the counter gets reprogrammed whenever an `XBEGIN`
> -    instruction is executed.
> -
> -    An alternative mode exists where PCR3 behaves as before, at the cost of
> -    `XBEGIN` unconditionally aborting.  Enabling `rtm-abort` mode will
> -    activate this alternative mode.
> +*   The `rtm-abort` boolean has been superseded.  Use `tsx=0` instead.
>  
>  *Warning:*
>  As the virtualisation is not 100% safe, don't use the vpmu flag on
> diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
> index 37439071d9..abf8e206d7 100644
> --- a/xen/arch/x86/cpu/intel.c
> +++ b/xen/arch/x86/cpu/intel.c
> @@ -356,9 +356,6 @@ static void Intel_errata_workarounds(struct cpuinfo_x86 
> *c)
>           (c->x86_model == 29 || c->x86_model == 46 || c->x86_model == 47))
>               __set_bit(X86_FEATURE_CLFLUSH_MONITOR, c->x86_capability);
>  
> -     if (cpu_has_tsx_force_abort && opt_rtm_abort)
> -             wrmsrl(MSR_TSX_FORCE_ABORT, TSX_FORCE_ABORT_RTM);
> -
>       probe_c3_errata(c);
>  }
>  
> diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
> index d8659c63f8..16e91a3694 100644
> --- a/xen/arch/x86/cpu/vpmu.c
> +++ b/xen/arch/x86/cpu/vpmu.c
> @@ -49,7 +49,6 @@ CHECK_pmu_params;
>  static unsigned int __read_mostly opt_vpmu_enabled;
>  unsigned int __read_mostly vpmu_mode = XENPMU_MODE_OFF;
>  unsigned int __read_mostly vpmu_features = 0;
> -bool __read_mostly opt_rtm_abort;
>  
>  static DEFINE_SPINLOCK(vpmu_lock);
>  static unsigned vpmu_count;
> @@ -79,7 +78,8 @@ static int __init parse_vpmu_params(const char *s)
>          else if ( !cmdline_strcmp(s, "arch") )
>              vpmu_features |= XENPMU_FEATURE_ARCH_ONLY;
>          else if ( (val = parse_boolean("rtm-abort", s, ss)) >= 0 )
> -            opt_rtm_abort = val;
> +            printk(XENLOG_WARNING
> +                   "'rtm-abort=<bool>' superseded.  Use 'tsx=<bool>' 
> instead\n");
>          else
>              rc = -EINVAL;
>  
> diff --git a/xen/arch/x86/tsx.c b/xen/arch/x86/tsx.c
> index 98ecb71a4a..338191df7f 100644
> --- a/xen/arch/x86/tsx.c
> +++ b/xen/arch/x86/tsx.c
> @@ -6,7 +6,9 @@
>   * Valid values:
>   *   1 => Explicit tsx=1
>   *   0 => Explicit tsx=0
> - *  -1 => Default, implicit tsx=1, may change to 0 to mitigate TAA
> + *  -1 => Default, altered to 0/1 (if unspecified) by:
> + *                 - TAA heuristics/settings for speculative safety
> + *                 - "TSX vs PCR3" select for TSX memory ordering safety
>   *  -3 => Implicit tsx=1 (feed-through from spec-ctrl=0)
>   *
>   * This is arranged such that the bottom bit encodes whether TSX is actually
> @@ -50,6 +52,26 @@ void tsx_init(void)
>  
>          cpu_has_tsx_ctrl = !!(caps & ARCH_CAPS_TSX_CTRL);
>  
> +        if ( cpu_has_tsx_force_abort )
> +        {
> +            /*
> +             * On an early TSX-enable Skylake part subject to the memory
> +             * ordering erratum, with at least the March 2019 microcode.
> +             */
> +
> +            /*
> +             * If no explicit tsx= option is provided, pick a default.
> +             *
> +             * This deliberately overrides the implicit opt_tsx=-3 from
> +             * `spec-ctrl=0` because:
> +             * - parse_spec_ctrl() ran before any CPU details where know.
> +             * - We now know we're running on a CPU not affected by TAA (as
> +             *   TSX_FORCE_ABORT is enumerated).
> +             */
> +            if ( opt_tsx < 0 )
> +                opt_tsx = 1;
> +        }
> +
>          /*
>           * The TSX features (HLE/RTM) are handled specially.  They both
>           * enumerate features but, on certain parts, have mechanisms to be
> @@ -75,6 +97,12 @@ void tsx_init(void)
>          }
>      }
>  
> +    /*
> +     * Note: MSR_TSX_CTRL is enumerated on TSX-enabled MDS_NO and later 
> parts.
> +     * MSR_TSX_FORCE_ABORT is enumerated on TSX-enabled pre-MDS_NO Skylake
> +     * parts only.  The two features are on a disjoint set of CPUs, and not
> +     * offered to guests by hypervisors.
> +     */
>      if ( cpu_has_tsx_ctrl )
>      {
>          uint32_t hi, lo;
> @@ -90,9 +118,28 @@ void tsx_init(void)
>  
>          wrmsr(MSR_TSX_CTRL, lo, hi);
>      }
> +    else if ( cpu_has_tsx_force_abort )
> +    {
> +        /*
> +         * On an early TSX-enable Skylake part subject to the memory ordering
> +         * erratum, with at least the March 2019 microcode.
> +         */
> +        uint32_t hi, lo;
> +
> +        rdmsr(MSR_TSX_FORCE_ABORT, lo, hi);
> +
> +        /* Check bottom bit only.  Higher bits are various sentinels. */
> +        rtm_disabled = !(opt_tsx & 1);

I think you also calculate rtm_disabled in the previous if case
(cpu_has_tsx_ctrl), maybe that could be pulled out?

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.