[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v10] x86emul: support LKGS


  • To: Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 8 Apr 2026 12:34:00 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=hDRpph66zub1viTKCSzkRUJG06bJCnyd+BNdnuHGBwM=; b=xrkk+XyPEzgyzpxI01wNw0FKuaWPNu3XwFS1GXwOSx1SyhvMkCq6OHjOBA7NbsciNPkVTw1LQXuNC6ljyXa38o8zyR4/k60pQ5OKEoYeloceXjV4AvcOu6l7mhd9t318anSlIbnnrlul1PCaW0SV6+3UMdAyELPwXDNkbyy75ZA+7bvsDnvdF41c81AUhw9tbIL/aQa3UFoTgtBf2E10Q0Ps4gJoOq3lC8aMBEc7W/+aKao35fAsSCC/LGqkE+PXoD3QiGMH2Iyr+gf+G3r0Ge+8OqGY9EIbC1NNDvR3sv0Yqsk2g+n7PoN4pwuQ0BPGFhNrHg2KuhA4n1qFpbFSIQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=JUVQDh+x+x7lq0K1LUyvkI9BN/CFMeVjzmIyvmpX9D+DtB82pAo3be2nhAL5Zf0KuHJmAebiNnC5ukNbJ9citnXd44mnUFeSN8FNcj2A+BtkOuSZiWSFbulZoOArvjb0tJIzLYD36Mu74N1Zj2aPXLtYC6ZwchElRmDeXkW1l119jOnnf9pOZ51nukIOcfP0qJFsp3LQwY2LOa36fZ4KgwRuL3DMlQ1nE9AvxVQ0tEYkalwyD+Zlg0Is1o7JvGK7qXQNmAv5MBMCFmqvRxsonRUX5EyzaHq7Df7K23AKMyLMzmQy8fFzRj+CTph9kNgG1QDH0W677kCcLT8t0xu8ag==
  • Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=selector1 header.d=citrix.com header.i="@citrix.com" header.h="From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck"
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Teddy Astie <teddy.astie@xxxxxxxxxx>
  • Delivery-date: Wed, 08 Apr 2026 11:34:12 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 08/04/2026 11:22 am, Jan Beulich wrote:
> Provide support for this insn, which is a prereq to FRED. CPUID-wise,
> while its and FRED's enumerators were already introduced, their dependency
> still needs adding.
>
> While adding a testcase, also add a SWAPGS one. In order to not affect
> the behavior of pre-existing tests, install write_{segment,msr} hooks
> only transiently.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> For PV save_segments() would need adjustment,

Not really.  CPL3 must never have a way of modifying GS_KERN, hence ...

> but the insn being restricted to ring 0 means PV guests can't use it anyway

... the CPL0 restriction.

Arguably I should have had this in one of the FRED patches:

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 1151997758c6..3364e774ada7 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1952,7 +1952,7 @@ static void load_segments(struct vcpu *n)
  * changes to bases can also be made with the WR{FS,GS}BASE instructions, when
  * enabled.
  *
- * Guests however cannot use SWAPGS, so there is no mechanism to modify the
+ * Guests cannot use SWAPGS or LKGS, so there is no mechanism to modify the
  * inactive GS base behind Xen's back.  Therefore, Xen's copy of the inactive
  * GS base is still accurate, and doesn't need reading back from hardware.
  *


but I don't think it's appropriate to merge into this patch.

> (unless we wanted to emulate it as another privileged insn).

We already have "LKGS" in hypercall form.  It's spelt
SEGBASE_GS_USER_SEL and has existed for 20 years or so.

I don't see any reason to extend emul_priv_op().

>
> I've also dropped the test harness read_segment() change. It generally
> would be correct to have, but isn't needed anymore with neither SWAPGS
> nor LKGS handling using the hook.

Dropping read_segment() makes your patch depend on Teddy's, now that
test_x86_emulator is blocking in CI.

This matters for backports.  I expect I'll be backporting guest support
in not-too-long.

> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -2899,8 +2899,37 @@ x86_emulate(
>                  break;
>              }
>              break;
> -        default:
> -            generate_exception_if(true, X86_EXC_UD);
> +
> +        case 6: /* lkgs */
> +            generate_exception_if((modrm_reg & 1) || vex.pfx != vex_f2,
> +                                  X86_EXC_UD);
> +            generate_exception_if(!mode_64bit() || !mode_ring0(), 
> X86_EXC_UD);
> +            vcpu_must_have(lkgs);
> +            fail_if(!ops->read_msr || !ops->write_segment || 
> !ops->write_msr);
> +            if ( (rc = ops->read_msr(MSR_SHADOW_GS_BASE, &msr_val,
> +                                     ctxt)) != X86EMUL_OKAY ||
> +                 (rc = ops->read_msr(MSR_GS_BASE, &sreg.base,
> +                                         ctxt)) != X86EMUL_OKAY )
> +                goto done;
> +            dst.orig_val = sreg.base; /* Preserve full GS Base. */

"Preserve current GS Base."

> +            if ( (rc = protmode_load_seg(x86_seg_gs, src.val, false, &sreg,
> +                                         ctxt, ops)) != X86EMUL_OKAY )
> +                goto done;
> +            /* Write (32-bit) base into SHADOW_GS. */

"Write new base into SHADOW_GS.  Zero extended from GDT/LDT."

> +            if ( (rc = ops->write_msr(MSR_SHADOW_GS_BASE, sreg.base,
> +                                      ctxt, false)) != X86EMUL_OKAY ||
> +                 (sreg.base = dst.orig_val, /* Reinstate full GS Base. */

"Reinstate original GS base."

> +                  (rc = ops->write_segment(x86_seg_gs, &sreg,
> +                                           ctxt)) != X86EMUL_OKAY) )
> +            {
> +                /*
> +                 * In real hardware, access to the registers cannot fail.  It
> +                 * is an error in Xen if the writes fail.
> +                 */
> +                ASSERT_UNREACHABLE();
> +                x86_emul_reset_event(ctxt);
> +                generate_exception(X86_EXC_DF, 0);
> +            }
>              break;
>          }
>          break;

This patch needs one more hunk:

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 5273fe0ae435..9d8195e2ae56 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -765,14 +765,25 @@ static void __init calculate_hvm_max_policy(void)
      */
     __set_bit(X86_FEATURE_NO_LMSL, fs);
 
-    /*
-     * On AMD, PV guests are entirely unable to use SYSENTER as Xen runs in
-     * long mode (and init_amd() has cleared it out of host
capabilities), but
-     * HVM guests are able if running in protected mode.
-     */
-    if ( (boot_cpu_data.vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) &&
-         raw_cpu_policy.basic.sep )
-        __set_bit(X86_FEATURE_SEP, fs);
+    if ( boot_cpu_data.vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON) )
+    {
+        /*
+         * On AMD, PV guests are unable to use SYSENTER as Xen runs in long
+         * mode (and init_amd() has cleared it out of host
capabilities), but
+         * HVM guests are able if running in protected mode.
+         */
+        if ( raw_cpu_policy.basic.sep )
+            __set_bit(X86_FEATURE_SEP, fs);
+
+        /*
+         * NullSelectorClearsBase is really a "hardware doesn't have
this bug
+         * any more" bit.  All FRED-capable hardware has NSCB
properties, so
+         * disallow a configuration which suggest/causes behaviour the
OS isn't
+         * expecting.
+         */
+        if ( !test_bit(X86_FEATURE_NSCB, fs) )
+            __clear_bit(X86_FEATURE_LKGS, fs);
+    }
 
     /*
      * VIRT_SSBD is exposed in the default policy as a result of


because otherwise a CPU Policy could hide NCSB and LKGS would be have
correctly when executed normally but malfunction in the emulator.

This hunk is in lieu of having vendor-dependent deep-deps calculations,
although it would need duplicating in userspace too.

Because this is only a link between an AMD-only feature and a common
feature, I think I can express it by only having a per-vendor
deep_features bitmap and keeping a shared deep_deps matrix.

Perhaps I should prototype that instead, but it would become another
dependency for this patch.

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.