[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/3] x86/tsx: Deprecate vpmu=rtm-abort and use tsx=<bool> instead


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Thu, 27 May 2021 19:33:08 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=w1teCrWfmmiQ8bQ1vlAiJ+AFkxAwdFFR4+rekFGhQ2E=; b=A/OHQrS3QVpuWxctGMiy+IJL8JO2m0X2L++VN4gp3q4g0vakSkvZedbsRxPajwtAR6iPKP9yzwBQuLoKsO+rEMhOaf4oyxaAWR+37ejWOcpx3AX6xVoipy4JUqKfWIsnoXKcTaGF522oPn/obaMTgNjIpgPeCtAKeVhzU4wZ6sZGTBCg5/GuRdFEgukE8vCkih1wUaur7OA1CbeKwfm60sIUD0J99PMbQNVbk6+8mvAg+O5wC3KizSmV4fFaVw2Wd6/SQIyGsVKmy36i5OyaEKFKj5V8WIdVN9WYC6LUOIDS6XSaU7AY9sdj+y4mRfUPu5VI34KBA3DXlNoeM3zvSQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=BlOuFGECeNXXnodsAuPL0ck6u+yJ5A3ncLqLz49CzAtYCDhosO+8qCyfV9atjCZCyeTuLUbfB9X0gFpe3e0HjTvosIFRv5XOHEdzgSGLDdBHZ8v9D7/4HUpLN52hGsnEc/66ouJjCtxJJR1yRWoLaxIvSqTYRmpRLU3N/mjy08ryS0Swr7ETNVdD2tcCIk192T9AOTHEgxl67uYXdfpow1xAaw5QWzXsr5FYyJfwpdwyIS+3K+Y3SVYpq0+YaViEEmXNg5rkK+sOU56J/oDKZbH8VGIec+x/2H6ZFnPg5+OVZEgFM924xqhqRQ+1j+V8rr2PJkzcEInKSjBgN/fCSQ==
  • Authentication-results: esa5.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Thu, 27 May 2021 18:33:28 +0000
  • Ironport-hdrordr: A9a23:wg7wpKxkSUIUooPQYVG+KrPw1r1zdoMgy1knxilNoHxuH/BwWf rPoB17726RtN91YhsdcL+7V5VoLUmzyXcX2/h1AV7BZniEhILAFugLgbcKqweKJ8SUzJ8+6U 4PSclD4N2bNykGsS75ijPIb+rJFrO8gd+VbeS19QYScelzAZsQiDuQkmygYzZLrA8tP+teKL OsovBpihCHYnotYsGyFhA+LpL+T42iruOeXfYebSRXkDWzsQ==
  • Ironport-sdr: zqe7dMuXvp04NXhAim1ACvqh9JQTXOGKuhJSoLdsP83fHClRcQM2y+E+7TO0Lj1pBNDQSAbciS oHF2cHFO8MpybRxSZPCo8IdRHMI1nEtvuDBiWiHZWpmic4c9OKS6WlQWLLOvPR67Y2GokY47Vy lZg5qpIELD5Rd3ccj2v96wNIx1xGiNzuoNTmOJ9kSFWZJCG1uXbnQIN4xuy/Kh4BM0NO/0OfoJ SAmA5Mwkxq63PGtiB6lNArW4q1Du8QTvzdkBCxrH3wZS5fUQJDT9AE814UGJBbFxNIKlnF39kg 8as=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 27/05/2021 18:24, Roger Pau Monné wrote:
> On Thu, May 27, 2021 at 02:25:19PM +0100, Andrew Cooper wrote:
>> This reuses the rtm_disable infrastructure, so CPUID derivation works 
>> properly
>> when TSX is disabled in favour of working PCR3.
>>
>> vpmu= is not a supported feature, and having this functionality under tsx=
>> centralises all TSX handling.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Thanks.

>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> CC: Wei Liu <wl@xxxxxxx>
>> ---
>>  docs/misc/xen-command-line.pandoc | 40 +++++++++++++++---------------
>>  xen/arch/x86/cpu/intel.c          |  3 ---
>>  xen/arch/x86/cpu/vpmu.c           |  4 +--
>>  xen/arch/x86/tsx.c                | 51 
>> +++++++++++++++++++++++++++++++++++++--
>>  xen/include/asm-x86/vpmu.h        |  1 -
>>  5 files changed, 70 insertions(+), 29 deletions(-)
>>
>> diff --git a/docs/misc/xen-command-line.pandoc 
>> b/docs/misc/xen-command-line.pandoc
>> index c32a397a12..a6facc33ea 100644
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -2296,14 +2296,21 @@ pages) must also be specified via the tbuf_size 
>> parameter.
>>  
>>  Controls for the use of Transactional Synchronization eXtensions.
>>  
>> -On Intel parts released in Q3 2019 (with updated microcode), and future 
>> parts,
>> -a control has been introduced which allows TSX to be turned off.
>> +Several microcode updates are relevant:
>>  
>> -On systems with the ability to turn TSX off, this boolean offers system wide
>> -control of whether TSX is enabled or disabled.
>> + * March 2019, fixing the TSX memory ordering errata on all TSX-enabled CPUs
>> +   to date.  Introduced MSR_TSX_FORCE_ABORT on SKL/SKX/KBL/WHL/CFL parts.  
>> The
>> +   errata workaround uses Performance Counter 3, so the user can select
>> +   between working TSX and working perfcounters.
>>  
>> -On parts vulnerable to CVE-2019-11135 / TSX Asynchronous Abort, the 
>> following
>> -logic applies:
>> + * November 2019, fixing the TSX Async Abort speculative vulnerability.
>> +   Introduced MSR_TSX_CTRL on all TSX-enabled MDS_NO parts to date,
>> +   CLX/WHL-R/CFL-R, with the controls becoming architectural moving forward
>> +   and formally retiring HLE from the architecture.  The user can disable 
>> TSX
>> +   to mitigate TAA, and elect to hide the HLE/RTM CPUID bits.
>> +
>> +On systems with the ability to disable TSX off, this boolean offers system
>> +wide control of whether TSX is enabled or disabled.
>>  
>>   * An explicit `tsx=` choice is honoured, even if it is `true` and would
>>     result in a vulnerable system.
>> @@ -2311,10 +2318,14 @@ logic applies:
>>   * When no explicit `tsx=` choice is given, parts vulnerable to TAA will be
>>     mitigated by disabling TSX, as this is the lowest overhead option.
>>  
>> - * If the use of TSX is important, the more expensive TAA mitigations can be
>> +   If the use of TSX is important, the more expensive TAA mitigations can be
>>     opted in to with `smt=0 spec-ctrl=md-clear`, at which point TSX will 
>> remain
>>     active by default.
>>  
>> + * When no explicit `tsx=` option is given, parts susceptible to the memory
>> +   ordering errata default to `true` to enable working TSX.  Alternatively,
>> +   selecting `tsx=0` will disable TSX and restore PCR3 to a working state.
>> +
>>  ### ucode
>>  > `= List of [ <integer> | scan=<bool>, nmi=<bool>, allow-same=<bool> ]`
>>  
>> @@ -2456,20 +2467,7 @@ provide access to a wealth of low level processor 
>> information.
>>  
>>  *   The `arch` option allows access to the pre-defined architectural events.
>>  
>> -*   The `rtm-abort` boolean controls a trade-off between working Restricted
>> -    Transactional Memory, and working performance counters.
>> -
>> -    All processors released to date (Q1 2019) supporting Transactional 
>> Memory
>> -    Extensions suffer an erratum which has been addressed in microcode.
>> -
>> -    Processors based on the Skylake microarchitecture with up-to-date
>> -    microcode internally use performance counter 3 to work around the 
>> erratum.
>> -    A consequence is that the counter gets reprogrammed whenever an `XBEGIN`
>> -    instruction is executed.
>> -
>> -    An alternative mode exists where PCR3 behaves as before, at the cost of
>> -    `XBEGIN` unconditionally aborting.  Enabling `rtm-abort` mode will
>> -    activate this alternative mode.
>> +*   The `rtm-abort` boolean has been superseded.  Use `tsx=0` instead.
>>  
>>  *Warning:*
>>  As the virtualisation is not 100% safe, don't use the vpmu flag on
>> diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
>> index 37439071d9..abf8e206d7 100644
>> --- a/xen/arch/x86/cpu/intel.c
>> +++ b/xen/arch/x86/cpu/intel.c
>> @@ -356,9 +356,6 @@ static void Intel_errata_workarounds(struct cpuinfo_x86 
>> *c)
>>          (c->x86_model == 29 || c->x86_model == 46 || c->x86_model == 47))
>>              __set_bit(X86_FEATURE_CLFLUSH_MONITOR, c->x86_capability);
>>  
>> -    if (cpu_has_tsx_force_abort && opt_rtm_abort)
>> -            wrmsrl(MSR_TSX_FORCE_ABORT, TSX_FORCE_ABORT_RTM);
>> -
>>      probe_c3_errata(c);
>>  }
>>  
>> diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
>> index d8659c63f8..16e91a3694 100644
>> --- a/xen/arch/x86/cpu/vpmu.c
>> +++ b/xen/arch/x86/cpu/vpmu.c
>> @@ -49,7 +49,6 @@ CHECK_pmu_params;
>>  static unsigned int __read_mostly opt_vpmu_enabled;
>>  unsigned int __read_mostly vpmu_mode = XENPMU_MODE_OFF;
>>  unsigned int __read_mostly vpmu_features = 0;
>> -bool __read_mostly opt_rtm_abort;
>>  
>>  static DEFINE_SPINLOCK(vpmu_lock);
>>  static unsigned vpmu_count;
>> @@ -79,7 +78,8 @@ static int __init parse_vpmu_params(const char *s)
>>          else if ( !cmdline_strcmp(s, "arch") )
>>              vpmu_features |= XENPMU_FEATURE_ARCH_ONLY;
>>          else if ( (val = parse_boolean("rtm-abort", s, ss)) >= 0 )
>> -            opt_rtm_abort = val;
>> +            printk(XENLOG_WARNING
>> +                   "'rtm-abort=<bool>' superseded.  Use 'tsx=<bool>' 
>> instead\n");
>>          else
>>              rc = -EINVAL;
>>  
>> diff --git a/xen/arch/x86/tsx.c b/xen/arch/x86/tsx.c
>> index 98ecb71a4a..338191df7f 100644
>> --- a/xen/arch/x86/tsx.c
>> +++ b/xen/arch/x86/tsx.c
>> @@ -6,7 +6,9 @@
>>   * Valid values:
>>   *   1 => Explicit tsx=1
>>   *   0 => Explicit tsx=0
>> - *  -1 => Default, implicit tsx=1, may change to 0 to mitigate TAA
>> + *  -1 => Default, altered to 0/1 (if unspecified) by:
>> + *                 - TAA heuristics/settings for speculative safety
>> + *                 - "TSX vs PCR3" select for TSX memory ordering safety
>>   *  -3 => Implicit tsx=1 (feed-through from spec-ctrl=0)
>>   *
>>   * This is arranged such that the bottom bit encodes whether TSX is actually
>> @@ -50,6 +52,26 @@ void tsx_init(void)
>>  
>>          cpu_has_tsx_ctrl = !!(caps & ARCH_CAPS_TSX_CTRL);
>>  
>> +        if ( cpu_has_tsx_force_abort )
>> +        {
>> +            /*
>> +             * On an early TSX-enable Skylake part subject to the memory
>> +             * ordering erratum, with at least the March 2019 microcode.
>> +             */
>> +
>> +            /*
>> +             * If no explicit tsx= option is provided, pick a default.
>> +             *
>> +             * This deliberately overrides the implicit opt_tsx=-3 from
>> +             * `spec-ctrl=0` because:
>> +             * - parse_spec_ctrl() ran before any CPU details where know.
>> +             * - We now know we're running on a CPU not affected by TAA (as
>> +             *   TSX_FORCE_ABORT is enumerated).
>> +             */
>> +            if ( opt_tsx < 0 )
>> +                opt_tsx = 1;
>> +        }
>> +
>>          /*
>>           * The TSX features (HLE/RTM) are handled specially.  They both
>>           * enumerate features but, on certain parts, have mechanisms to be
>> @@ -75,6 +97,12 @@ void tsx_init(void)
>>          }
>>      }
>>  
>> +    /*
>> +     * Note: MSR_TSX_CTRL is enumerated on TSX-enabled MDS_NO and later 
>> parts.
>> +     * MSR_TSX_FORCE_ABORT is enumerated on TSX-enabled pre-MDS_NO Skylake
>> +     * parts only.  The two features are on a disjoint set of CPUs, and not
>> +     * offered to guests by hypervisors.
>> +     */
>>      if ( cpu_has_tsx_ctrl )
>>      {
>>          uint32_t hi, lo;
>> @@ -90,9 +118,28 @@ void tsx_init(void)
>>  
>>          wrmsr(MSR_TSX_CTRL, lo, hi);
>>      }
>> +    else if ( cpu_has_tsx_force_abort )
>> +    {
>> +        /*
>> +         * On an early TSX-enable Skylake part subject to the memory 
>> ordering
>> +         * erratum, with at least the March 2019 microcode.
>> +         */
>> +        uint32_t hi, lo;
>> +
>> +        rdmsr(MSR_TSX_FORCE_ABORT, lo, hi);
>> +
>> +        /* Check bottom bit only.  Higher bits are various sentinels. */
>> +        rtm_disabled = !(opt_tsx & 1);
> I think you also calculate rtm_disabled in the previous if case
> (cpu_has_tsx_ctrl), maybe that could be pulled out?

rtm_disabled needs to not become true if !cpu_has_tsx_ctrl ||
!cpu_has_tsx_force_abort

Otherwise we'll default to disabling the CPUID bits even systems when we
can't we can't actually control TSX behaviour.

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.