Re: [PATCH for-4.19 v2] x86/spec-ctrl: Support for SRSO_US_NO and SRSO_MSR_FIX
On Wed, Jun 19, 2024 at 08:10:57PM +0100, Andrew Cooper wrote:
> AMD have updated the SRSO whitepaper[1] with further information.
>
> There's a new SRSO_U/S_NO enumeration saying that SRSO attacks can't cross the
> user/supervisor boundary.  i.e. Xen doesn't need to use IBPB-on-entry for PV.
>
> There's also a new SRSO_MSR_FIX identifying that the BP_SPEC_REDUCE bit is
> available in MSR_BP_CFG.  When set, SRSO attacks can't cross the host/guest
> boundary.  i.e. Xen doesn't need to use IBPB-on-entry for HVM.
>
> Extend ibpb_calculations() to account for these when calculating
> opt_ibpb_entry_{pv,hvm} defaults.  Add a bp-spec-reduce option to control the
> use of BP_SPEC_REDUCE, but activate it by default.
>
> Because MSR_BP_CFG is core-scoped with a race condition updating it, repurpose
> amd_check_erratum_1485() into amd_check_bp_cfg() and calculate all updates at
> once.
>
> Advertise SRSO_U/S_NO to guests whenever possible, as it allows the guest
> kernel to skip SRSO protections too.  This is easy for HVM guests, but hard
> for PV guests, as both the guest userspace and kernel operate in CPL3.  After
> discussing with AMD, it is believed to be safe to advertise SRSO_U/S_NO to PV
> guests when BP_SPEC_REDUCE is active.
>
> Fix a typo in SRSO_NO's comment.
>
> [1] https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> ---
> CC: Jan Beulich <JBeulich@xxxxxxxx>
> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> CC: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
>
> v2:
>  * Add "for HVM guests" to xen-command-line.pandoc
>  * Print details on boot
>  * Don't advertise SRSO_US_NO to PV guests if BP_SPEC_REDUCE isn't active.
>
> For 4.19.  This should be no functional change on current hardware.  On
> forthcoming hardware, it avoids the substantial perf hits which were necessary
> to protect against the SRSO speculative vulnerability.
> ---
>  docs/misc/xen-command-line.pandoc           |  9 +++-
>  xen/arch/x86/cpu-policy.c                   | 19 ++++++++
>  xen/arch/x86/cpu/amd.c                      | 29 +++++++++---
>  xen/arch/x86/include/asm/msr-index.h        |  1 +
>  xen/arch/x86/include/asm/spec_ctrl.h        |  1 +
>  xen/arch/x86/spec_ctrl.c                    | 49 ++++++++++++++++-----
>  xen/include/public/arch-x86/cpufeatureset.h |  4 +-
>  7 files changed, 92 insertions(+), 20 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index 1dea7431fab6..88beb64525d5 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -2390,7 +2390,7 @@ By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`).
>  >              {ibrs,ibpb,ssbd,psfd,
>  >              eager-fpu,l1d-flush,branch-harden,srb-lock,
>  >              unpriv-mmio,gds-mit,div-scrub,lock-harden,
> ->              bhi-dis-s}=<bool> ]`
> +>              bhi-dis-s,bp-spec-reduce}=<bool> ]`
>
>  Controls for speculative execution sidechannel mitigations.  By default, Xen
>  will pick the most appropriate mitigations based on compiled in support,
> @@ -2539,6 +2539,13 @@ boolean can be used to force or prevent Xen from using speculation barriers to
>  protect lock critical regions.  This mitigation won't be engaged by default,
>  and needs to be explicitly enabled on the command line.
>
> +On hardware supporting SRSO_MSR_FIX, the `bp-spec-reduce=` option can be used
> +to force or prevent Xen from using MSR_BP_CFG.BP_SPEC_REDUCE to mitigate the
> +SRSO (Speculative Return Stack Overflow) vulnerability.  Xen will use
> +bp-spec-reduce when available, as it is preferable to using `ibpb-entry=hvm`
> +to mitigate SRSO for HVM guests, and because it is a necessary prerequisite in
> +order to advertise SRSO_U/S_NO to PV guests.
> +
>  ### sync_console
>
>  > `= <boolean>`
>
> diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
> index 304dc20cfab8..fd32fe333384 100644
> --- a/xen/arch/x86/cpu-policy.c
> +++ b/xen/arch/x86/cpu-policy.c
> @@ -14,6 +14,7 @@
>  #include <asm/msr-index.h>
>  #include <asm/paging.h>
>  #include <asm/setup.h>
> +#include <asm/spec_ctrl.h>
>  #include <asm/xstate.h>
>
>  struct cpu_policy __read_mostly raw_cpu_policy;
> @@ -605,6 +606,24 @@ static void __init calculate_pv_max_policy(void)
>          __clear_bit(X86_FEATURE_IBRS, fs);
>      }
>
> +    /*
> +     * SRSO_U/S_NO means that the CPU is not vulnerable to SRSO attacks across
> +     * the User (CPL3)/Supervisor (CPL<3) boundary.  However the PV64
> +     * user/kernel boundary is CPL3 on both sides, so it won't convey the
> +     * meaning that a PV kernel expects.
> +     *
> +     * PV32 guests are explicitly unsupported WRT speculative safety, so are
> +     * ignored to avoid complicating the logic.
> +     *
> +     * After discussions with AMD, it is believed to be safe to offer
> +     * SRSO_US_NO to PV guests when BP_SPEC_REDUCE is active.
> +     *
> +     * If BP_SPEC_REDUCE isn't active, remove SRSO_U/S_NO from the PV max
> +     * policy, which will cause it to filter out of PV default too.
> +     */
> +    if ( !boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX) || !opt_bp_spec_reduce )
> +        __clear_bit(X86_FEATURE_SRSO_US_NO, fs);

Do we need to clear X86_FEATURE_SRSO_US_NO unconditionally for PV32
guests in recalculate_cpuid_policy()?  Seems simple enough, and it
should provide a more accurate policy in case we run a 32-bit PV guest
in shim mode, for example (which is supported, IIRC).

> +
>      guest_common_max_feature_adjustments(fs);
>      guest_common_feature_adjustments(fs);
>
> diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
> index ab92333673b9..5213dfff601d 100644
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -1009,16 +1009,33 @@ static void cf_check fam17_disable_c6(void *arg)
>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>  }
>
> -static void amd_check_erratum_1485(void)
> +static void amd_check_bp_cfg(void)
>  {
> -	uint64_t val, chickenbit = (1 << 5);
> +	uint64_t val, new = 0;
>
> -	if (cpu_has_hypervisor || boot_cpu_data.x86 != 0x19 || !is_zen4_uarch())
> +	/*
> +	 * AMD Erratum #1485.  Set bit 5, as instructed.
> +	 */
> +	if (!cpu_has_hypervisor && boot_cpu_data.x86 == 0x19 && is_zen4_uarch())
> +		new |= (1 << 5);
> +
> +	/*
> +	 * On hardware supporting SRSO_MSR_FIX, activate BP_SPEC_REDUCE by
> +	 * default.  This lets us do two things:
> +	 *
> +	 *   1) Avoid IBPB-on-entry to mitigate SRSO attacks from HVM guests.
> +	 *   2) Lets us advertise SRSO_US_NO to PV guests.

The above two lines use white spaces instead of hard tabs.

> +	 */
> +	if (boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX) && opt_bp_spec_reduce)
> +		new |= BP_CFG_SPEC_REDUCE;
> +
> +	/* Avoid reading BP_CFG if we don't intend to change anything. */
> +	if (!new)
>  		return;
>
>  	rdmsrl(MSR_AMD64_BP_CFG, val);
>
> -	if (val & chickenbit)
> +	if ((val & new) == new)
>  		return;
>
>  	/*
> @@ -1027,7 +1044,7 @@ static void amd_check_erratum_1485(void)
>  	 * same time before the chickenbit is set.  It's benign because the
>  	 * value being written is the same on both.
>  	 */
> -	wrmsrl(MSR_AMD64_BP_CFG, val | chickenbit);
> +	wrmsrl(MSR_AMD64_BP_CFG, val | new);
>  }
>
>  static void cf_check init_amd(struct cpuinfo_x86 *c)
> @@ -1297,7 +1314,7 @@ static void cf_check init_amd(struct cpuinfo_x86 *c)
>  		disable_c1_ramping();
>
>  	amd_check_zenbleed();
> -	amd_check_erratum_1485();
> +	amd_check_bp_cfg();
>
>  	if (fam17_c6_disabled)
>  		fam17_disable_c6(NULL);
> diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
> index 9cdb5b262566..83fbf4135c6b 100644
> --- a/xen/arch/x86/include/asm/msr-index.h
> +++ b/xen/arch/x86/include/asm/msr-index.h
> @@ -412,6 +412,7 @@
>  #define AMD64_DE_CFG_LFENCE_SERIALISE      (_AC(1, ULL) << 1)
>  #define MSR_AMD64_EX_CFG                   0xc001102cU
>  #define MSR_AMD64_BP_CFG                   0xc001102eU
> +#define BP_CFG_SPEC_REDUCE                 (_AC(1, ULL) << 4)
>  #define MSR_AMD64_DE_CFG2                  0xc00110e3U
>
>  #define MSR_AMD64_DR0_ADDRESS_MASK         0xc0011027U
> diff --git a/xen/arch/x86/include/asm/spec_ctrl.h b/xen/arch/x86/include/asm/spec_ctrl.h
> index 72347ef2b959..077225418956 100644
> --- a/xen/arch/x86/include/asm/spec_ctrl.h
> +++ b/xen/arch/x86/include/asm/spec_ctrl.h
> @@ -90,6 +90,7 @@ extern int8_t opt_xpti_hwdom, opt_xpti_domu;
>
>  extern bool cpu_has_bug_l1tf;
>  extern int8_t opt_pv_l1tf_hwdom, opt_pv_l1tf_domu;
> +extern bool opt_bp_spec_reduce;
>
>  /*
>   * The L1D address mask, which might be wider than reported in CPUID, and the
> diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
> index 40f6ae017010..7aabb65ba028 100644
> --- a/xen/arch/x86/spec_ctrl.c
> +++ b/xen/arch/x86/spec_ctrl.c
> @@ -83,6 +83,7 @@ static bool __initdata opt_unpriv_mmio;
>  static bool __ro_after_init opt_verw_mmio;
>  static int8_t __initdata opt_gds_mit = -1;
>  static int8_t __initdata opt_div_scrub = -1;
> +bool __ro_after_init opt_bp_spec_reduce = true;
>
>  static int __init cf_check parse_spec_ctrl(const char *s)
>  {
> @@ -143,6 +144,7 @@ static int __init cf_check parse_spec_ctrl(const char *s)
>              opt_unpriv_mmio = false;
>              opt_gds_mit = 0;
>              opt_div_scrub = 0;
> +            opt_bp_spec_reduce = false;
>          }
>          else if ( val > 0 )
>              rc = -EINVAL;
> @@ -363,6 +365,8 @@ static int __init cf_check parse_spec_ctrl(const char *s)
>              opt_gds_mit = val;
>          else if ( (val = parse_boolean("div-scrub", s, ss)) >= 0 )
>              opt_div_scrub = val;
> +        else if ( (val = parse_boolean("bp-spec-reduce", s, ss)) >= 0 )
> +            opt_bp_spec_reduce = val;
>          else
>              rc = -EINVAL;
>
> @@ -505,7 +509,7 @@ static void __init print_details(enum ind_thunk thunk)
>       * Hardware read-only information, stating immunity to certain issues, or
>       * suggestions of which mitigation to use.
>       */
> -    printk("  Hardware hints:%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
> +    printk("  Hardware hints:%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
>             (caps & ARCH_CAPS_RDCL_NO)                        ? " RDCL_NO"        : "",
>             (caps & ARCH_CAPS_EIBRS)                          ? " EIBRS"          : "",
>             (caps & ARCH_CAPS_RSBA)                           ? " RSBA"           : "",
> @@ -529,10 +533,11 @@ static void __init print_details(enum ind_thunk thunk)
>             (e8b  & cpufeat_mask(X86_FEATURE_BTC_NO))         ? " BTC_NO"         : "",
>             (e8b  & cpufeat_mask(X86_FEATURE_IBPB_RET))       ? " IBPB_RET"       : "",
>             (e21a & cpufeat_mask(X86_FEATURE_IBPB_BRTYPE))    ? " IBPB_BRTYPE"    : "",
> -           (e21a & cpufeat_mask(X86_FEATURE_SRSO_NO))        ? " SRSO_NO"        : "");
> +           (e21a & cpufeat_mask(X86_FEATURE_SRSO_NO))        ? " SRSO_NO"        : "",
> +           (e21a & cpufeat_mask(X86_FEATURE_SRSO_US_NO))     ? " SRSO_US_NO"     : "");
>
>      /* Hardware features which need driving to mitigate issues. */
> -    printk("  Hardware features:%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
> +    printk("  Hardware features:%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
>             (e8b  & cpufeat_mask(X86_FEATURE_IBPB)) ||
>             (_7d0 & cpufeat_mask(X86_FEATURE_IBRSB))          ? " IBPB"           : "",
>             (e8b  & cpufeat_mask(X86_FEATURE_IBRS)) ||
> @@ -551,7 +556,8 @@ static void __init print_details(enum ind_thunk thunk)
>             (caps & ARCH_CAPS_FB_CLEAR_CTRL)                  ? " FB_CLEAR_CTRL"  : "",
>             (caps & ARCH_CAPS_GDS_CTRL)                       ? " GDS_CTRL"       : "",
>             (caps & ARCH_CAPS_RFDS_CLEAR)                     ? " RFDS_CLEAR"     : "",
> -           (e21a & cpufeat_mask(X86_FEATURE_SBPB))           ? " SBPB"           : "");
> +           (e21a & cpufeat_mask(X86_FEATURE_SBPB))           ? " SBPB"           : "",
> +           (e21a & cpufeat_mask(X86_FEATURE_SRSO_MSR_FIX))   ? " SRSO_MSR_FIX"   : "");
>
>      /* Compiled-in support which pertains to mitigations. */
>      if ( IS_ENABLED(CONFIG_INDIRECT_THUNK) || IS_ENABLED(CONFIG_SHADOW_PAGING) ||
> @@ -1120,7 +1126,7 @@ static void __init div_calculations(bool hw_smt_enabled)
>
>  static void __init ibpb_calculations(void)
>  {
> -    bool def_ibpb_entry = false;
> +    bool def_ibpb_entry_pv = false, def_ibpb_entry_hvm = false;
>
>      /* Check we have hardware IBPB support before using it... */
>      if ( !boot_cpu_has(X86_FEATURE_IBRSB) && !boot_cpu_has(X86_FEATURE_IBPB) )
> @@ -1145,22 +1151,41 @@ static void __init ibpb_calculations(void)
>           * Confusion.  Mitigate with IBPB-on-entry.
>           */
>          if ( !boot_cpu_has(X86_FEATURE_BTC_NO) )
> -            def_ibpb_entry = true;
> +            def_ibpb_entry_pv = def_ibpb_entry_hvm = true;
>
>          /*
> -         * Further to BTC, Zen3/4 CPUs suffer from Speculative Return Stack
> -         * Overflow in most configurations.  Mitigate with IBPB-on-entry if we
> -         * have the microcode that makes this an effective option.
> +         * Further to BTC, Zen3 and later CPUs suffer from Speculative Return
> +         * Stack Overflow in most configurations.  Mitigate with IBPB-on-entry
> +         * if we have the microcode that makes this an effective option,
> +         * except where there are other mitigating factors available.
>           */
>          if ( !boot_cpu_has(X86_FEATURE_SRSO_NO) &&
>               boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) )
> -            def_ibpb_entry = true;
> +        {
> +            /*
> +             * SRSO_U/S_NO is a subset of SRSO_NO, identifying that SRSO isn't
> +             * possible across the user/supervisor boundary.  We only need to
> +             * use IBPB-on-entry for PV guests on hardware which doesn't
> +             * enumerate SRSO_US_NO.
> +             */
> +            if ( !boot_cpu_has(X86_FEATURE_SRSO_US_NO) )
> +                def_ibpb_entry_pv = true;
> +
> +            /*
> +             * SRSO_MSR_FIX enumerates that we can use MSR_BP_CFG.SPEC_REDUCE
> +             * to mitigate SRSO across the host/guest boundary.  We only need
> +             * to use IBPB-on-entry for HVM guests if we haven't enabled this
> +             * control.
> +             */
> +            if ( !boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX) || !opt_bp_spec_reduce )
> +                def_ibpb_entry_hvm = true;
> +        }
>      }
>
>      if ( opt_ibpb_entry_pv == -1 )
> -        opt_ibpb_entry_pv = IS_ENABLED(CONFIG_PV) && def_ibpb_entry;
> +        opt_ibpb_entry_pv = IS_ENABLED(CONFIG_PV) && def_ibpb_entry_pv;
>      if ( opt_ibpb_entry_hvm == -1 )
> -        opt_ibpb_entry_hvm = IS_ENABLED(CONFIG_HVM) && def_ibpb_entry;
> +        opt_ibpb_entry_hvm = IS_ENABLED(CONFIG_HVM) && def_ibpb_entry_hvm;
>
>      if ( opt_ibpb_entry_pv )
>      {
> diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
> index d9eba5e9a714..9c98e4992861 100644
> --- a/xen/include/public/arch-x86/cpufeatureset.h
> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> @@ -312,7 +312,9 @@ XEN_CPUFEATURE(FSRSC,         11*32+19) /*A  Fast Short REP SCASB */
>  XEN_CPUFEATURE(AMD_PREFETCHI, 11*32+20) /*A  PREFETCHIT{0,1} Instructions */
>  XEN_CPUFEATURE(SBPB,          11*32+27) /*A  Selective Branch Predictor Barrier */
>  XEN_CPUFEATURE(IBPB_BRTYPE,   11*32+28) /*A  IBPB flushes Branch Type predictions too */
> -XEN_CPUFEATURE(SRSO_NO,       11*32+29) /*A  Hardware not vulenrable to Speculative Return Stack Overflow */
> +XEN_CPUFEATURE(SRSO_NO,       11*32+29) /*A  Hardware not vulnerable to Speculative Return Stack Overflow */
> +XEN_CPUFEATURE(SRSO_US_NO,    11*32+30) /*A! Hardware not vulnerable to SRSO across the User/Supervisor boundary */

Might we use SRSO_USER_KERNEL_NO instead of SRSO_US_NO, as that's the
name used in the AMD whitepaper?

With at least the whitespace indentation fixed:

Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Thanks, Roger.
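For readers skimming the archive, below is a condensed, self-contained C
sketch of the amd_check_bp_cfg() flow quoted above. It is illustrative only:
msr_read()/msr_write() and the feature/option booleans are hypothetical
stand-ins for Xen's rdmsrl()/wrmsrl(), boot_cpu_has() and opt_bp_spec_reduce,
backed by fake MSR state so the sketch compiles and runs on its own.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MSR_AMD64_BP_CFG     0xc001102eU
#define BP_CFG_SPEC_REDUCE   (1ULL << 4)   /* Enumerated by SRSO_MSR_FIX */
#define BP_CFG_ERRATUM_1485  (1ULL << 5)   /* Chickenbit for AMD erratum #1485 */

/* Hypothetical stand-ins for hardware state and Xen helpers. */
static uint64_t fake_bp_cfg;
static bool cpu_is_zen4 = true;
static bool has_srso_msr_fix = true;
static bool opt_bp_spec_reduce = true;

static uint64_t msr_read(uint32_t msr) { (void)msr; return fake_bp_cfg; }
static void msr_write(uint32_t msr, uint64_t val) { (void)msr; fake_bp_cfg = val; }

static void check_bp_cfg(void)
{
    uint64_t val, new = 0;

    /* Accumulate every bit we want set, then update the MSR at most once. */
    if (cpu_is_zen4)
        new |= BP_CFG_ERRATUM_1485;

    if (has_srso_msr_fix && opt_bp_spec_reduce)
        new |= BP_CFG_SPEC_REDUCE;

    /* Nothing to change: skip the MSR read entirely. */
    if (!new)
        return;

    val = msr_read(MSR_AMD64_BP_CFG);

    /* All wanted bits already set: skip the write. */
    if ((val & new) == new)
        return;

    /*
     * Both threads of a core may race to this write, but they write the
     * same value, so the race is benign (MSR_BP_CFG is core-scoped).
     */
    msr_write(MSR_AMD64_BP_CFG, val | new);
}

int main(void)
{
    check_bp_cfg();
    printf("BP_CFG = %#llx\n", (unsigned long long)fake_bp_cfg);
    return 0;
}

The accumulate-then-write shape is the point of the repurposing described in
the commit message: a single read-modify-write covers both the erratum #1485
chickenbit and BP_SPEC_REDUCE, rather than separate callers each racing to
update the core-scoped MSR.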