Re: [PATCH 3/3] x86/tsx: Deprecate vpmu=rtm-abort and use tsx=<bool> instead
On 27/05/2021 18:24, Roger Pau Monné wrote:
> On Thu, May 27, 2021 at 02:25:19PM +0100, Andrew Cooper wrote:
>> This reuses the rtm_disable infrastructure, so CPUID derivation works
>> properly when TSX is disabled in favour of working PCR3.
>>
>> vpmu= is not a supported feature, and having this functionality under tsx=
>> centralises all TSX handling.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Thanks.

>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> CC: Wei Liu <wl@xxxxxxx>
>> ---
>>  docs/misc/xen-command-line.pandoc | 40 +++++++++++++++---------------
>>  xen/arch/x86/cpu/intel.c          |  3 ---
>>  xen/arch/x86/cpu/vpmu.c           |  4 +--
>>  xen/arch/x86/tsx.c                | 51 +++++++++++++++++++++++++++++++++++++--
>>  xen/include/asm-x86/vpmu.h        |  1 -
>>  5 files changed, 70 insertions(+), 29 deletions(-)
>>
>> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
>> index c32a397a12..a6facc33ea 100644
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -2296,14 +2296,21 @@ pages) must also be specified via the tbuf_size parameter.
>>  
>>  Controls for the use of Transactional Synchronization eXtensions.
>>  
>> -On Intel parts released in Q3 2019 (with updated microcode), and future parts,
>> -a control has been introduced which allows TSX to be turned off.
>> +Several microcode updates are relevant:
>>  
>> -On systems with the ability to turn TSX off, this boolean offers system wide
>> -control of whether TSX is enabled or disabled.
>> + * March 2019, fixing the TSX memory ordering errata on all TSX-enabled CPUs
>> +   to date.  Introduced MSR_TSX_FORCE_ABORT on SKL/SKX/KBL/WHL/CFL parts.
>> +   The errata workaround uses Performance Counter 3, so the user can select
>> +   between working TSX and working perfcounters.
>>  
>> -On parts vulnerable to CVE-2019-11135 / TSX Asynchronous Abort, the following
>> -logic applies:
>> + * November 2019, fixing the TSX Async Abort speculative vulnerability.
>> +   Introduced MSR_TSX_CTRL on all TSX-enabled MDS_NO parts to date,
>> +   CLX/WHL-R/CFL-R, with the controls becoming architectural moving forward
>> +   and formally retiring HLE from the architecture.  The user can disable
>> +   TSX to mitigate TAA, and elect to hide the HLE/RTM CPUID bits.
>> +
>> +On systems with the ability to disable TSX, this boolean offers system wide
>> +control of whether TSX is enabled or disabled.
>>  
>>   * An explicit `tsx=` choice is honoured, even if it is `true` and would
>>     result in a vulnerable system.
>>  
>> @@ -2311,10 +2318,14 @@ logic applies:
>>   * When no explicit `tsx=` choice is given, parts vulnerable to TAA will be
>>     mitigated by disabling TSX, as this is the lowest overhead option.
>>  
>> - * If the use of TSX is important, the more expensive TAA mitigations can be
>> +   If the use of TSX is important, the more expensive TAA mitigations can be
>>     opted in to with `smt=0 spec-ctrl=md-clear`, at which point TSX will
>>     remain active by default.
>>  
>> + * When no explicit `tsx=` option is given, parts susceptible to the memory
>> +   ordering errata default to `true` to enable working TSX.  Alternatively,
>> +   selecting `tsx=0` will disable TSX and restore PCR3 to a working state.
>> +
>>  ### ucode
>>  > `= List of [ <integer> | scan=<bool>, nmi=<bool>, allow-same=<bool> ]`
>>  
>> @@ -2456,20 +2467,7 @@ provide access to a wealth of low level processor information.
>>  
>>  * The `arch` option allows access to the pre-defined architectural events.
>>  
>> -* The `rtm-abort` boolean controls a trade-off between working Restricted
>> -  Transactional Memory, and working performance counters.
>> -
>> -  All processors released to date (Q1 2019) supporting Transactional Memory
>> -  Extensions suffer an erratum which has been addressed in microcode.
>> -
>> -  Processors based on the Skylake microarchitecture with up-to-date
>> -  microcode internally use performance counter 3 to work around the erratum.
>> -  A consequence is that the counter gets reprogrammed whenever an `XBEGIN`
>> -  instruction is executed.
>> -
>> -  An alternative mode exists where PCR3 behaves as before, at the cost of
>> -  `XBEGIN` unconditionally aborting.  Enabling `rtm-abort` mode will
>> -  activate this alternative mode.
>> +* The `rtm-abort` boolean has been superseded.  Use `tsx=0` instead.
>>  
>>  *Warning:*
>>  As the virtualisation is not 100% safe, don't use the vpmu flag on
>> diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
>> index 37439071d9..abf8e206d7 100644
>> --- a/xen/arch/x86/cpu/intel.c
>> +++ b/xen/arch/x86/cpu/intel.c
>> @@ -356,9 +356,6 @@ static void Intel_errata_workarounds(struct cpuinfo_x86 *c)
>>  	    (c->x86_model == 29 || c->x86_model == 46 || c->x86_model == 47))
>>  		__set_bit(X86_FEATURE_CLFLUSH_MONITOR, c->x86_capability);
>>  
>> -	if (cpu_has_tsx_force_abort && opt_rtm_abort)
>> -		wrmsrl(MSR_TSX_FORCE_ABORT, TSX_FORCE_ABORT_RTM);
>> -
>>  	probe_c3_errata(c);
>>  }
>>  
>> diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
>> index d8659c63f8..16e91a3694 100644
>> --- a/xen/arch/x86/cpu/vpmu.c
>> +++ b/xen/arch/x86/cpu/vpmu.c
>> @@ -49,7 +49,6 @@ CHECK_pmu_params;
>>  static unsigned int __read_mostly opt_vpmu_enabled;
>>  unsigned int __read_mostly vpmu_mode = XENPMU_MODE_OFF;
>>  unsigned int __read_mostly vpmu_features = 0;
>> -bool __read_mostly opt_rtm_abort;
>>  
>>  static DEFINE_SPINLOCK(vpmu_lock);
>>  static unsigned vpmu_count;
>> @@ -79,7 +78,8 @@ static int __init parse_vpmu_params(const char *s)
>>          else if ( !cmdline_strcmp(s, "arch") )
>>              vpmu_features |= XENPMU_FEATURE_ARCH_ONLY;
>>          else if ( (val = parse_boolean("rtm-abort", s, ss)) >= 0 )
>> -            opt_rtm_abort = val;
>> +            printk(XENLOG_WARNING
>> +                   "'rtm-abort=<bool>' superseded.  Use 'tsx=<bool>' instead\n");
>>          else
>>              rc = -EINVAL;
>>  
>> diff --git a/xen/arch/x86/tsx.c b/xen/arch/x86/tsx.c
>> index 98ecb71a4a..338191df7f 100644
>> --- a/xen/arch/x86/tsx.c
>> +++ b/xen/arch/x86/tsx.c
>> @@ -6,7 +6,9 @@
>>   * Valid values:
>>   *   1 => Explicit tsx=1
>>   *   0 => Explicit tsx=0
>> - *  -1 => Default, implicit tsx=1, may change to 0 to mitigate TAA
>> + *  -1 => Default, altered to 0/1 (if unspecified) by:
>> + *        - TAA heuristics/settings for speculative safety
>> + *        - "TSX vs PCR3" select for TSX memory ordering safety
>>   *  -3 => Implicit tsx=1 (feed-through from spec-ctrl=0)
>>   *
>>   * This is arranged such that the bottom bit encodes whether TSX is actually
>> @@ -50,6 +52,26 @@ void tsx_init(void)
>>  
>>      cpu_has_tsx_ctrl = !!(caps & ARCH_CAPS_TSX_CTRL);
>>  
>> +    if ( cpu_has_tsx_force_abort )
>> +    {
>> +        /*
>> +         * On an early TSX-enabled Skylake part subject to the memory
>> +         * ordering erratum, with at least the March 2019 microcode.
>> +         */
>> +
>> +        /*
>> +         * If no explicit tsx= option is provided, pick a default.
>> +         *
>> +         * This deliberately overrides the implicit opt_tsx=-3 from
>> +         * `spec-ctrl=0` because:
>> +         * - parse_spec_ctrl() ran before any CPU details were known.
>> +         * - We now know we're running on a CPU not affected by TAA (as
>> +         *   TSX_FORCE_ABORT is enumerated).
>> +         */
>> +        if ( opt_tsx < 0 )
>> +            opt_tsx = 1;
>> +    }
>> +
>>      /*
>>       * The TSX features (HLE/RTM) are handled specially.  They both
>>       * enumerate features but, on certain parts, have mechanisms to be
>> @@ -75,6 +97,12 @@ void tsx_init(void)
>>          }
>>      }
>>  
>> +    /*
>> +     * Note: MSR_TSX_CTRL is enumerated on TSX-enabled MDS_NO and later parts.
>> +     * MSR_TSX_FORCE_ABORT is enumerated on TSX-enabled pre-MDS_NO Skylake
>> +     * parts only.  The two features are on a disjoint set of CPUs, and not
>> +     * offered to guests by hypervisors.
>> +     */
>>      if ( cpu_has_tsx_ctrl )
>>      {
>>          uint32_t hi, lo;
>> @@ -90,9 +118,28 @@ void tsx_init(void)
>>  
>>          wrmsr(MSR_TSX_CTRL, lo, hi);
>>      }
>> +    else if ( cpu_has_tsx_force_abort )
>> +    {
>> +        /*
>> +         * On an early TSX-enabled Skylake part subject to the memory
>> +         * ordering erratum, with at least the March 2019 microcode.
>> +         */
>> +        uint32_t hi, lo;
>> +
>> +        rdmsr(MSR_TSX_FORCE_ABORT, lo, hi);
>> +
>> +        /* Check bottom bit only.  Higher bits are various sentinels. */
>> +        rtm_disabled = !(opt_tsx & 1);
> I think you also calculate rtm_disabled in the previous if case
> (cpu_has_tsx_ctrl), maybe that could be pulled out?

rtm_disabled needs to not become true when neither cpu_has_tsx_ctrl nor
cpu_has_tsx_force_abort is set.  Otherwise we'll default to disabling the
CPUID bits even on systems where we can't actually control TSX behaviour.

~Andrew