[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 6/6] x86/boot: Expose MSR_ARCH_CAPS data in guest max policies


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Tue, 16 May 2023 14:51:02 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=10NddWnpMNRWtTxlaOTJyEk/AzGlbQuZ0DkZ/9+OPQU=; b=h3Ghvj9dTqWKufF955a/8MkacTr3hBKpnBg+pjRmc15EoTj8cYUq/ti/RFAgDw3hsQri0AbX1ljGL+Z9qJOIyqO3UgSWG+J2+/2sjLoOjWyzWfjP8lwLrC20Cu26h73R2e9WnLGyDy9Elm6ftFm5E0qQaMXqr9FOr4DOjmEby9KAKQkvsCs0Lg3Q/S+QOYafdD2iiGErB4XBZAVFriozESDzChHOOQdtbaBL407V0LMSUbC/vrAC06wfKnxIZJ8Sg25TLfAU+id0Ey0wWRrSVjNPOKw38b6xBJEbpYF5uXUdKELuSmuCk+pZ1YcqiltXNNff73k+HUxpae9DJi7e2w==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=XIMsdsxsrl0llNVIdVBu0p3dNV7VCyXvYbNnTS/pD1BX+B/ZryKZt6CmLoGeYOZaFwe8mGWi1eJbUjmgwScEp2IkRRRDHLzfKumd+MtHilMkao+mmHarBLXqMrynoQxjDwSc8sXnDxl6/Vp9EXMjNY0DsEntiS6gp+DbEF6+Hf0BEHQ7t8LYWfJdgzl0/OA3IYn9aDHHlSFVPo9mwxczbxFbEDjIXLHSQIp81sPdjXWYgDpdcZRaT88ACA6U3vmCvLHLx+OoRcT9rMBPr+1pC0HiOWByfuYNdJ52gveiLgGNcKhWo1kfvDByKTOEUGqZ2B+VHvdl4A2lE4Kp+wa6lg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 16 May 2023 13:51:37 +0000
  • Ironport-data: A9a23:ti3H36P2vnBVtLbvrR2IlsFynXyQoLVcMsEvi/4bfWQNrUojgTIGx msaWGzQPKvcZWSgeY9+Oou/9UsFscTcx9BhGwto+SlhQUwRpJueD7x1DKtS0wC6dZSfER09v 63yTvGacajYm1eF/k/F3oDJ9CU6jufQAOKnUoYoAwgpLSd8UiAtlBl/rOAwh49skLCRDhiE/ Nj/uKUzAnf8s9JPGj9SuvPrRC9H5qyo42tF5wJmP5ingXeF/5UrJMNHTU2OByOQrrl8RoaSW +vFxbelyWLVlz9F5gSNy+uTnuUiG9Y+DCDW4pZkc/HKbitq/0Te5p0TJvsEAXq7vh3S9zxHJ HehgrTrIeshFvWkdO3wyHC0GQkmVUFN0OevzXRSLaV/ZqAJGpfh66wGMa04AWEX0s9bPkBUy KYjEyEiSAGlq9ntxe68bOY506zPLOGzVG8ekldJ6GiBSNoDH9XESaiM4sJE1jAtgMwIBezZe 8cSdTtoalLHfgFLPVAUTpk5mY9EhFGmK2Ee9A3T+PpxujCLpOBy+OGF3N79U9qGX8hK2G2fo XrL5T/RCRAGLt2PjzGC9xpAg8eWxHqlB9JOT+bQGvhCmXi26WEJU0IqaVr4mOaSqkSTSftGN BlBksYphe1onKCxdfH/VRClpH+PvjYHRsFdVeY97Wml1a788wufQG8eQVZpeNEg8cM7WzEu/ luIhM/yQyxitqWPTnCQ/avSqim9URX5NkcHbC4ACAcAvd/qpdhrigqVF447VqmoktfyBDf8h SiQqzQzjKkSishN0Lin+VfAgHSnoZ2hohMJ2zg7l1mNtmtRDLNJraTygbQHxZ6s9Lqkc2Q=
  • Ironport-hdrordr: A9a23:WmwdDK1mcavSOooVClF7dAqjBHYkLtp133Aq2lEZdPU0SKGlfq GV7ZEmPHrP4gr5N0tOpTntAse9qBDnhPxICOsqXYtKNTOO0AeVxelZhrcKqAeQeBEWmNQ96U 9hGZIOcuEZDzJB/LvHCN/TKadd/DGFmprY+ts31x1WPGVXgzkL1XYANu6ceHcGIzVuNN4CO7 e3wNFInDakcWR/VLXBOpFUN9KzweEijfjdEGc7OyI=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 16/05/2023 2:06 pm, Jan Beulich wrote:
> On 15.05.2023 16:42, Andrew Cooper wrote:
>> --- a/xen/arch/x86/cpu-policy.c
>> +++ b/xen/arch/x86/cpu-policy.c
>> @@ -408,6 +408,25 @@ static void __init calculate_host_policy(void)
>>      p->platform_info.cpuid_faulting = cpu_has_cpuid_faulting;
>>  }
>>  
>> +static void __init guest_common_max_feature_adjustments(uint32_t *fs)
>> +{
>> +    if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
>> +    {
>> +        /*
>> +         * MSR_ARCH_CAPS is just feature data, and we can offer it to guests
>> +         * unconditionally, although limit it to Intel systems as it is 
>> highly
>> +         * uarch-specific.
>> +         *
>> +         * In particular, the RSBA and RRSBA bits mean "you might migrate 
>> to a
>> +         * system where RSB underflow uses alternative predictors (a.k.a
>> +         * Retpoline not safe)", so these need to be visible to a guest in 
>> all
>> +         * cases, even when it's only some other server in the pool which
>> +         * suffers the identified behaviour.
>> +         */
>> +        __set_bit(X86_FEATURE_ARCH_CAPS, fs);
>> +    }
>> +}
> The comment reads as if it wasn't applying to "max" only, but rather to
> "default". Reading this I'm therefore now (and perhaps even more so in
> the future, when coming across it) wondering whether it's misplaced, or
> and hence whether the commented code is also misplaced and/or wrong.

On migrate-in, we (well - toolstacks that understand multiple hosts)
check the cpu policy the VM saw against the appropriate PV/HVM max
policy to determine whether it can safely run.

So this is very intentionally for the max policy.  We need (I think -
still pending an clarification from Intel because there's pending work
still not published) to set RSBA unconditionally, and RRSBA conditional
on EIBRS being available, in max even on pre-Skylake hardware such that
we can migrate-in a VM which previously ran on Skylake or later hardware.

Activating this by default for VMs is just a case of swapping the CPUID
ARCH_CAPS bit from 'a' to 'A', without any adjustment to this logic.

> Further is even just non-default exposure of all the various bits okay
> to other than Dom0? IOW is there indeed no further adjustment necessary
> to guest_rdmsr()?
>
>> @@ -828,7 +845,10 @@ void __init init_dom0_cpuid_policy(struct domain *d)
>>       * domain policy logic gains a better understanding of MSRs.
>>       */
>>      if ( is_hardware_domain(d) && cpu_has_arch_caps )
>> +    {
>>          p->feat.arch_caps = true;
>> +        p->arch_caps.raw = host_cpu_policy.arch_caps.raw;
>> +    }
> Doesn't this expose all the bits, irrespective of their exposure
> annotations in the public header?

No, because of ...

>  I.e. even more than just the two
> bits that become 'A' in patch 4, but weren't ...
>
>> @@ -858,20 +878,6 @@ void __init init_dom0_cpuid_policy(struct domain *d)
>>          p->platform_info.cpuid_faulting = false;
>>  
>>      recalculate_cpuid_policy(d);

... this recalculate_cpuid_policy() (which was moved in patch 1), which
applies the appropriate pv/hvm max mask over the inherited bits.


More generally, this is how *all* opting-into-non-default features needs
to work when it's more than just turning on a single feature bit.  It's
also why doing full-policy levelling in the toolstack is much harder
than it appears on paper.

All domains get the default policy, so zero out all non-default
information.  It has to be recovered from somewhere.  Generally that
would be the appropriate max policy, but the host policy here is fine
because there's nothing to do other than applying the appropriate max mask.

When arch-caps becomes default, the full block feeding arch caps back
into dom0 will be dropped, but there's still a lot of work to do first.

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.