|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v10] x86emul: support LKGS
On 08/04/2026 11:22 am, Jan Beulich wrote:
> Provide support for this insn, which is a prereq to FRED. CPUID-wise,
> while its and FRED's enumerators were already introduced, their dependency
> still needs adding.
>
> While adding a testcase, also add a SWAPGS one. In order to not affect
> the behavior of pre-existing tests, install write_{segment,msr} hooks
> only transiently.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> For PV save_segments() would need adjustment,
Not really. CPL3 must never have a way of modifying GS_KERN, hence ...
> but the insn being restricted to ring 0 means PV guests can't use it anyway
... the CPL0 restriction.
Arguably I should have had this in one of the FRED patches:
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 1151997758c6..3364e774ada7 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1952,7 +1952,7 @@ static void load_segments(struct vcpu *n)
* changes to bases can also be made with the WR{FS,GS}BASE instructions, when
* enabled.
*
- * Guests however cannot use SWAPGS, so there is no mechanism to modify the
+ * Guests cannot use SWAPGS or LKGS, so there is no mechanism to modify the
* inactive GS base behind Xen's back. Therefore, Xen's copy of the inactive
* GS base is still accurate, and doesn't need reading back from hardware.
*
but I don't think it's appropriate to merge into this patch.
> (unless we wanted to emulate it as another privileged insn).
We already have "LKGS" in hypercall form. It's spelt
SEGBASE_GS_USER_SEL and has existed for 20 years or so.
I don't see any reason to extend emul_priv_op().
>
> I've also dropped the test harness read_segment() change. It generally
> would be correct to have, but isn't needed anymore with neither SWAPGS
> nor LKGS handling using the hook.
Dropping read_segment() makes your patch depend on Teddy's, now that
test_x86_emulator is blocking in CI.
This matters for backports. I expect I'll be backporting guest support
in not-too-long.
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -2899,8 +2899,37 @@ x86_emulate(
> break;
> }
> break;
> - default:
> - generate_exception_if(true, X86_EXC_UD);
> +
> + case 6: /* lkgs */
> + generate_exception_if((modrm_reg & 1) || vex.pfx != vex_f2,
> + X86_EXC_UD);
> + generate_exception_if(!mode_64bit() || !mode_ring0(),
> X86_EXC_UD);
> + vcpu_must_have(lkgs);
> + fail_if(!ops->read_msr || !ops->write_segment ||
> !ops->write_msr);
> + if ( (rc = ops->read_msr(MSR_SHADOW_GS_BASE, &msr_val,
> + ctxt)) != X86EMUL_OKAY ||
> + (rc = ops->read_msr(MSR_GS_BASE, &sreg.base,
> + ctxt)) != X86EMUL_OKAY )
> + goto done;
> + dst.orig_val = sreg.base; /* Preserve full GS Base. */
"Preserve current GS Base."
> + if ( (rc = protmode_load_seg(x86_seg_gs, src.val, false, &sreg,
> + ctxt, ops)) != X86EMUL_OKAY )
> + goto done;
> + /* Write (32-bit) base into SHADOW_GS. */
"Write new base into SHADOW_GS. Zero extended from GDT/LDT."
> + if ( (rc = ops->write_msr(MSR_SHADOW_GS_BASE, sreg.base,
> + ctxt, false)) != X86EMUL_OKAY ||
> + (sreg.base = dst.orig_val, /* Reinstate full GS Base. */
"Reinstate original GS base."
> + (rc = ops->write_segment(x86_seg_gs, &sreg,
> + ctxt)) != X86EMUL_OKAY) )
> + {
> + /*
> + * In real hardware, access to the registers cannot fail. It
> + * is an error in Xen if the writes fail.
> + */
> + ASSERT_UNREACHABLE();
> + x86_emul_reset_event(ctxt);
> + generate_exception(X86_EXC_DF, 0);
> + }
> break;
> }
> break;
This patch needs one more hunk:
diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 5273fe0ae435..9d8195e2ae56 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -765,14 +765,25 @@ static void __init calculate_hvm_max_policy(void)
*/
__set_bit(X86_FEATURE_NO_LMSL, fs);
- /*
- * On AMD, PV guests are entirely unable to use SYSENTER as Xen runs in
- * long mode (and init_amd() has cleared it out of host
capabilities), but
- * HVM guests are able if running in protected mode.
- */
- if ( (boot_cpu_data.vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) &&
- raw_cpu_policy.basic.sep )
- __set_bit(X86_FEATURE_SEP, fs);
+ if ( boot_cpu_data.vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON) )
+ {
+ /*
+ * On AMD, PV guests are unable to use SYSENTER as Xen runs in long
+ * mode (and init_amd() has cleared it out of host
capabilities), but
+ * HVM guests are able if running in protected mode.
+ */
+ if ( raw_cpu_policy.basic.sep )
+ __set_bit(X86_FEATURE_SEP, fs);
+
+ /*
+ * NullSelectorClearsBase is really a "hardware doesn't have
this bug
+ * any more" bit. All FRED-capable hardware has NSCB
properties, so
+ * disallow a configuration which suggest/causes behaviour the
OS isn't
+ * expecting.
+ */
+ if ( !test_bit(X86_FEATURE_NSCB, fs) )
+ __clear_bit(X86_FEATURE_LKGS, fs);
+ }
/*
* VIRT_SSBD is exposed in the default policy as a result of
because otherwise a CPU Policy could hide NCSB and LKGS would be have
correctly when executed normally but malfunction in the emulator.
This hunk is in lieu of having vendor-dependent deep-deps calculations,
although it would need duplicating in userspace too.
Because this is only a link between an AMD-only feature and a common
feature, I think I can express it by only having a per-vendor
deep_features bitmap and keeping a shared deep_deps matrix.
Perhaps I should prototype that instead, but it would become another
dependency for this patch.
~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |