Re: [PATCH v10] x86emul: support LKGS
On 08.04.2026 13:34, Andrew Cooper wrote:
> On 08/04/2026 11:22 am, Jan Beulich wrote:
>> ---
>> For PV save_segments() would need adjustment,
>
> Not really. CPL3 must never have a way of modifying GS_KERN, hence ...
>
>> but the insn being restricted to ring 0 means PV guests can't use it anyway
>
> ... the CPL0 restriction.
>
> Arguably I should have had this in one of the FRED patches:
>
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1952,7 +1952,7 @@ static void load_segments(struct vcpu *n)
> * changes to bases can also be made with the WR{FS,GS}BASE instructions, when
> * enabled.
> *
> - * Guests however cannot use SWAPGS, so there is no mechanism to modify the
> + * Guests cannot use SWAPGS or LKGS, so there is no mechanism to modify the
> * inactive GS base behind Xen's back. Therefore, Xen's copy of the inactive
> * GS base is still accurate, and doesn't need reading back from hardware.
> *
>
>
> but I don't think it's appropriate to merge into this patch.
>
>> (unless we wanted to emulate it as another privileged insn).
>
> We already have "LKGS" in hypercall form. It's spelt
> SEGBASE_GS_USER_SEL and has existed for 20 years or so.

Hmm, yes.

> I don't see any reason to extend emul_priv_op().

Nor do I. Nevertheless I wanted to mention the PV aspect.

>> I've also dropped the test harness read_segment() change. It generally
>> would be correct to have, but isn't needed anymore, as neither SWAPGS
>> nor LKGS handling uses the hook.
>
> Dropping read_segment() makes your patch depend on Teddy's, now that
> test_x86_emulator is blocking in CI.

I'm not dropping read_segment() from there. I've dropped a change to
that function that v9 had. That depends on your change (which has gone
in), but not Teddy's. Or else I may not understand what you mean.

>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -2899,8 +2899,37 @@ x86_emulate(
>> break;
>> }
>> break;
>> - default:
>> - generate_exception_if(true, X86_EXC_UD);
>> +
>> + case 6: /* lkgs */
>> + generate_exception_if((modrm_reg & 1) || vex.pfx != vex_f2,
>> + X86_EXC_UD);
>> + generate_exception_if(!mode_64bit() || !mode_ring0(), X86_EXC_UD);
>> + vcpu_must_have(lkgs);
>> + fail_if(!ops->read_msr || !ops->write_segment || !ops->write_msr);
>> + if ( (rc = ops->read_msr(MSR_SHADOW_GS_BASE, &msr_val,
>> + ctxt)) != X86EMUL_OKAY ||
>> + (rc = ops->read_msr(MSR_GS_BASE, &sreg.base,
>> + ctxt)) != X86EMUL_OKAY )
>> + goto done;
>> + dst.orig_val = sreg.base; /* Preserve full GS Base. */
>
> "Preserve current GS Base."
>
>> + if ( (rc = protmode_load_seg(x86_seg_gs, src.val, false, &sreg,
>> + ctxt, ops)) != X86EMUL_OKAY )
>> + goto done;
>> + /* Write (32-bit) base into SHADOW_GS. */
>
> "Write new base into SHADOW_GS. Zero extended from GDT/LDT."
>
>> + if ( (rc = ops->write_msr(MSR_SHADOW_GS_BASE, sreg.base,
>> + ctxt, false)) != X86EMUL_OKAY ||
>> + (sreg.base = dst.orig_val, /* Reinstate full GS Base. */
>
> "Reinstate original GS base."
I can make these adjustments, sure, yet I think my forms were clear enough.
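
For readers of the archive, the sequence those three comments describe can be sketched as a stand-alone toy model. Everything here (struct toy_cpu, toy_lkgs(), toy_demo_lookup()) is hypothetical and is not Xen's emulator code. It only assumes what the hunk shows: the current GS base is saved, the selector is loaded, the descriptor's 32-bit base goes zero-extended into the inactive (shadow) GS base, and the original active base is reinstated. Null selectors are assumed to yield a zero base, i.e. NSCB-like behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy state; these are not Xen's structures. */
struct toy_cpu {
    uint16_t gs_sel;          /* visible GS selector */
    uint64_t gs_base;         /* active GS base */
    uint64_t shadow_gs_base;  /* inactive (shadow) GS base */
};

/* Hypothetical descriptor lookup: selector -> 32-bit descriptor base. */
typedef bool (*toy_lookup_fn)(uint16_t sel, uint32_t *base);

/* Demo table: 16 descriptors, entry N has base N * 0x1000. */
static bool toy_demo_lookup(uint16_t sel, uint32_t *base)
{
    unsigned int idx = sel >> 3;

    if ( idx >= 16 )
        return false;         /* beyond the table limit: load would fault */

    *base = idx * 0x1000u;
    return true;
}

/* Returns false (state untouched) when the selector load would fault. */
static bool toy_lkgs(struct toy_cpu *cpu, uint16_t sel, toy_lookup_fn lookup)
{
    uint64_t saved = cpu->gs_base;  /* Preserve current GS base. */
    uint32_t new_base = 0;          /* Null selector: assumed zero base. */

    if ( (sel & ~3) != 0 && !lookup(sel, &new_base) )
        return false;

    cpu->gs_sel = sel;
    cpu->shadow_gs_base = new_base; /* Zero-extended from the descriptor. */
    cpu->gs_base = saved;           /* Reinstate original GS base. */

    return true;
}
```

The point of the model is only the ordering: the active base never changes, while the selector and the inactive base do.
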
> This patch needs one more hunk:
>
> --- a/xen/arch/x86/cpu-policy.c
> +++ b/xen/arch/x86/cpu-policy.c
> @@ -765,14 +765,25 @@ static void __init calculate_hvm_max_policy(void)
> */
> __set_bit(X86_FEATURE_NO_LMSL, fs);
>
> - /*
> - * On AMD, PV guests are entirely unable to use SYSENTER as Xen runs in
> - * long mode (and init_amd() has cleared it out of host capabilities), but
> - * HVM guests are able if running in protected mode.
> - */
> - if ( (boot_cpu_data.vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) &&
> - raw_cpu_policy.basic.sep )
> - __set_bit(X86_FEATURE_SEP, fs);
> + if ( boot_cpu_data.vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON) )
> + {
> + /*
> + * On AMD, PV guests are unable to use SYSENTER as Xen runs in long
> + * mode (and init_amd() has cleared it out of host capabilities), but
> + * HVM guests are able if running in protected mode.
> + */
> + if ( raw_cpu_policy.basic.sep )
> + __set_bit(X86_FEATURE_SEP, fs);
> +
> + /*
> + * NullSelectorClearsBase is really a "hardware doesn't have this bug
> + * any more" bit. All FRED-capable hardware has NSCB properties, so
> + * disallow a configuration which suggests/causes behaviour the OS isn't
> + * expecting.
> + */
> + if ( !test_bit(X86_FEATURE_NSCB, fs) )
> + __clear_bit(X86_FEATURE_LKGS, fs);
> + }
>
> /*
> * VIRT_SSBD is exposed in the default policy as a result of
>
>
> because otherwise a CPU Policy could hide NSCB and LKGS would behave
> correctly when executed normally but malfunction in the emulator.
A policy cannot validly hide NSCB, as the flag - whichever way it is set -
describes how the underlying hardware works. We'd need to intercept and
emulate all selector loads to allow flag and hardware behavior to be out
of sync. I.e. what you say for LKGS would be true for all selector loads.
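
To illustrate the argument with a toy model (all names hypothetical, nothing here is Xen code): the sketch assumes, as the NullSelectorClearsBase name suggests, that hardware without the property leaves the old base in place on a null selector load, while hardware with it zeroes the base. A selector load that isn't intercepted follows the silicon, whatever the policy advertises.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy segment register; not a Xen structure. */
struct toy_seg {
    uint16_t sel;
    uint64_t base;
};

/* hw_nscb models what the silicon does, not what CPUID reports. */
static void toy_load_null_sel(struct toy_seg *seg, bool hw_nscb)
{
    seg->sel = 0;
    if ( hw_nscb )
        seg->base = 0;
    /* Otherwise (assumed legacy behaviour) the stale base is left alone. */
}

/* Base observed by a guest after an unintercepted null selector load. */
static uint64_t toy_guest_null_load(bool policy_nscb, bool hw_nscb,
                                    uint64_t old_base)
{
    struct toy_seg gs = { 0x2b, old_base };

    (void)policy_nscb;        /* The advertised bit has no effect here. */
    toy_load_null_sel(&gs, hw_nscb);

    return gs.base;
}
```

In this model, toy_guest_null_load(false, true, b) and toy_guest_null_load(true, true, b) return the same value: the result depends on hw_nscb alone.
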
> This hunk is in lieu of having vendor-dependent deep-deps calculations,
> although it would need duplicating in userspace too.
>
> Because this is only a link between an AMD-only feature and a common
> feature, I think I can express it by only having a per-vendor
> deep_features bitmap and keeping a shared deep_deps matrix.
>
> Perhaps I should prototype that instead, but it would become another
> dependency for this patch.

Please do, albeit as per above I don't think it's truly a prereq to the
one here.

Jan