Re: [PATCH] x86/amd: do not expose HWCR.TscFreqSel to guests


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 13 Sep 2023 09:50:07 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Jan Beulich <jbeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>, solene@xxxxxxxxxxx, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, Demi Marie Obenour <demi@xxxxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 13 Sep 2023 07:50:38 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Sep 12, 2023 at 05:35:15PM +0100, Andrew Cooper wrote:
> On 12/09/2023 5:23 pm, Roger Pau Monne wrote:
> > OpenBSD will attempt to unconditionally access PSTATE0 if HWCR.TscFreqSel is
> > set, and will also attempt to unconditionally access HWCR if the TSC is
> > reported as Invariant.
> >
> > The reasoning for exposing HWCR.TscFreqSel was to keep Linux from printing a
> > (bogus) warning message, but doing so at the cost of OpenBSD not booting is
> > not a suitable solution.
> >
> > To fix this, expose an empty HWCR.
> 
> At first I was thinking a straight up revert, but AMD's CPUID Faulting
> is an architectural bit in here so it's worth keeping the register around.

A straight up revert won't work, because (as you notice below)
HWCR is architectural, so accesses must not fault.
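
To illustrate the intended read behaviour, here is a minimal standalone
sketch, not the actual Xen code. The vendor flag values, the result enum
and the name model_rdmsr_hwcr() are hypothetical; only the logic (fault
for non-AMD/Hygon policies, otherwise return an empty HWCR) follows the
patch description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vendor flags; the real values live in Xen's headers. */
#define VENDOR_INTEL (1u << 0)
#define VENDOR_AMD   (1u << 1)
#define VENDOR_HYGON (1u << 2)

/* Hypothetical result type standing in for "okay" vs. "inject #GP". */
enum rd_result { RD_OKAY, RD_GP_FAULT };

/*
 * Model of a guest read of HWCR: the register exists on AMD and Hygon
 * policies, so the read succeeds there, but no model specific bits
 * (such as TscFreqSel) are reported.
 */
static enum rd_result model_rdmsr_hwcr(unsigned int vendor, uint64_t *val)
{
    assert(val != NULL);

    if (!(vendor & (VENDOR_AMD | VENDOR_HYGON)))
        return RD_GP_FAULT;

    *val = 0; /* empty HWCR */
    return RD_OKAY;
}
```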

> >
> > Fixes: 14b95b3b8546 ('x86/AMD: expose HWCR.TscFreqSel to guests')
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > ---
> > Not sure whether we want to expose something when is_cpufreq_controller() is
> > true, seeing as there's a special wrmsr handler for the same MSR in that case.
> > Likely should be done for PV only, but also likely quite bogus.
> >
> > Missing a Reported-by tag, as the issue came from the QubesOS tracker.
> 
> Well - we can at least have a:
> 
> Link: https://github.com/QubesOS/qubes-issues/issues/8502

Sure.

> in the commit message, and it's probably worth asking Solène / Marek
> (both CC'd) if they want a Reported-by tag.

I'm happy to add a Reported-by tag, just didn't have an email to use.

> > ---
> >  xen/arch/x86/msr.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> > index 3f0450259cdf..964d500ff8a1 100644
> > --- a/xen/arch/x86/msr.c
> > +++ b/xen/arch/x86/msr.c
> > @@ -240,8 +240,12 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
> >      case MSR_K8_HWCR:
> >          if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> >              goto gp_fault;
> > -        *val = get_cpu_family(cp->basic.raw_fms, NULL, NULL) >= 0x10
> > -               ? K8_HWCR_TSC_FREQ_SEL : 0;
> > +        /*
> > +         * OpenBSD 7.3 accesses HWCR unconditionally if the TSC is reported as
> > +         * Invariant.  Do not set TSC_FREQ_SEL as that would trigger OpenBSD to
> > +         * also poke at PSTATE0.
> > +         */
> 
> While this is true, the justification for removing this is because
> TSC_FREQ_SEL is a model specific bit, not an architectural bit in HWCR.
> 
> Also because its addition without writing into the migration stream was
> bogus irrespective of the specifics of the bit.
> 
> I'm still of the opinion that it's buggy for OpenBSD to be looking at
> model specific bits when virtualised,

Well, I think we can argue that an OS is free to ignore the CPUID HV
bit and still boot on Xen (even if that leads to non-ideal decisions).

> but given my latest reading of the
> AMD manuals, I think OpenBSD *is* well behaved looking at PSTATE0 if it
> can see TSC_FREQ_SEL.

Hm, there's no written statement that TSC_FREQ_SEL implies PSTATE0 is
available (and PSTATE0 is not an architectural MSR), but I can see
how a guest can expect to fetch the P0 frequency if it sees
TSC_FREQ_SEL.  It might have been more fail-safe to check that
PSTATE_LIMIT does not fault before attempting to access PSTATE0.
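
As a rough sketch of such a defensive probe (not OpenBSD's actual code):
the MSR reads are mocked with a lookup table so the example stands
alone, and struct mock_platform, mock_rdmsr_safe() and probe_p0() are
hypothetical names.  The MSR indexes and the TscFreqSel bit position are
the usual AMD ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_K8_HWCR          0xc0010015u
#define K8_HWCR_TSC_FREQ_SEL (1ull << 24)
#define MSR_AMD_PSTATE_LIMIT 0xc0010061u
#define MSR_AMD_PSTATE0      0xc0010064u

/* Hypothetical model of the MSRs a platform exposes to the guest. */
struct mock_msr {
    uint32_t idx;
    uint64_t val;
};

struct mock_platform {
    const struct mock_msr *msrs;
    size_t nr;
};

/*
 * Mock of a fault-tolerant MSR read: returns 0 and fills *val if the
 * MSR is exposed, -1 where real hardware (or Xen) would raise #GP.
 */
static int mock_rdmsr_safe(const struct mock_platform *p, uint32_t idx,
                           uint64_t *val)
{
    assert(p != NULL && val != NULL);

    for (size_t i = 0; i < p->nr; i++) {
        if (p->msrs[i].idx == idx) {
            *val = p->msrs[i].val;
            return 0;
        }
    }

    return -1;
}

/*
 * Only read PSTATE0 when HWCR is readable, TscFreqSel is set, and
 * PSTATE_LIMIT can be read without faulting.
 */
static bool probe_p0(const struct mock_platform *p, uint64_t *pstate0)
{
    uint64_t hwcr, limit;

    if (mock_rdmsr_safe(p, MSR_K8_HWCR, &hwcr) != 0)
        return false;

    if (!(hwcr & K8_HWCR_TSC_FREQ_SEL))
        return false;

    if (mock_rdmsr_safe(p, MSR_AMD_PSTATE_LIMIT, &limit) != 0)
        return false;

    return mock_rdmsr_safe(p, MSR_AMD_PSTATE0, pstate0) == 0;
}
```

With a probe like this, a platform exposing TscFreqSel but no PSTATE
MSRs makes the guest fall back instead of taking a #GP.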

> In some theoretical future where the toolstack better understands MSRs
> and (non)migratable VMs (which is the QubesOS usecase), then it would in
> principle be fine to construct a VM which can see the host TSC_FREQ_SEL
> and PSTATE* values.
> 
> Preferably with an adjusted comment, Reviewed-by: Andrew Cooper
> <andrew.cooper3@xxxxxxxxxx>

Thanks, will reply to other comments before taking the RB and
resending.

Roger.



 

