
Re: [PATCH v3 4/4] x86/cpu-policy: Derive RSBA/RRSBA for guest policies


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 16 Jun 2023 14:18:34 +0100
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 16 Jun 2023 13:19:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 16/06/2023 1:12 pm, Jan Beulich wrote:
> On 15.06.2023 20:17, Andrew Cooper wrote:
>> On 15/06/2023 1:13 pm, Jan Beulich wrote:
>>> On 15.06.2023 12:41, Andrew Cooper wrote:
>>>> On 15/06/2023 9:30 am, Jan Beulich wrote:
>>>>> On 14.06.2023 20:12, Andrew Cooper wrote:
>>>>>> On 13/06/2023 10:59 am, Jan Beulich wrote:
>>>>>>> On 12.06.2023 18:13, Andrew Cooper wrote:
>>>>>>>> The RSBA bit, "RSB Alternative", means that the RSB may use alternative
>>>>>>>> predictors when empty.  From a practical point of view, this means
>>>>>>>> "Retpoline not safe".
>>>>>>>>
>>>>>>>> Enhanced IBRS (officially IBRS_ALL in Intel's docs, previously
>>>>>>>> IBRS_ATT) is a statement that IBRS is implemented in hardware (as
>>>>>>>> opposed to the form retrofitted to existing CPUs in microcode).
>>>>>>>>
>>>>>>>> The RRSBA bit, "Restricted-RSBA", is a combination of RSBA, and the
>>>>>>>> eIBRS property that predictions are tagged with the mode in which they
>>>>>>>> were learnt.  Therefore, it means "when eIBRS is active, the RSB may
>>>>>>>> fall back to alternative predictors but restricted to the current
>>>>>>>> prediction mode".  As such, it's a stronger statement than RSBA, but
>>>>>>>> still means "Retpoline not safe".
>>>>>>>>
>>>>>>>> CPUs are not expected to enumerate both RSBA and RRSBA.
>>>>>>>>
>>>>>>>> Add feature dependencies for EIBRS and RRSBA.  While technically
>>>>>>>> they're not linked, absolutely nothing good can come of letting the
>>>>>>>> guest see RRSBA without EIBRS.  Nor a guest seeing EIBRS without IBRSB.
>>>>>>>> Furthermore, we use this dependency to simplify the max derivation
>>>>>>>> logic.
>>>>>>>>
>>>>>>>> The max policies get RSBA and RRSBA unconditionally set (with the
>>>>>>>> EIBRS dependency maybe hiding RRSBA).  We can run any VM, even if it
>>>>>>>> has been told "somewhere you might run, Retpoline isn't safe".
>>>>>>>>
>>>>>>>> The default policies are more complicated.  A guest shouldn't see both
>>>>>>>> bits, but it needs to see one if the current host suffers from any form
>>>>>>>> of RSBA, and which bit it needs to see depends on whether eIBRS is
>>>>>>>> visible or not.  Therefore, the calculation must be performed after
>>>>>>>> sanitise_featureset().
>>>>>>>>
>>>>>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>>>>>>> ---
>>>>>>>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>>>>>>>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>>>>>>> CC: Wei Liu <wl@xxxxxxx>
>>>>>>>>
>>>>>>>> v3:
>>>>>>>>  * Minor commit message adjustment.
>>>>>>>>  * Drop changes to recalculate_cpuid_policy().  Deferred to a later
>>>>>>>>    series.
>>>>>>> With this dropped, with the title not saying "max/default", and with
>>>>>>> the description also not mentioning "live" policies at all, I don't
>>>>>>> think this patch is self-consistent (meaning in particular: leaving
>>>>>>> aside the fact that there's no way right now to request e.g. both
>>>>>>> RSBA and RRSBA for a guest; aiui it is possible for Dom0).
>>>>>>>
>>>>>>> As you may imagine I'm also curious why you decided to drop this.
>>>>>> Because when I tried doing levelling in Xapi, I remembered why I did it
>>>>>> the way I did in v1, and why the v2 way was wrong.
>>>>>>
>>>>>> Xen cannot safely edit what the toolstack provides, so must not. 
>>>>> And this is the part I don't understand: Why can't we correct the
>>>>> (EIBRS,RSBA,RRSBA) tuple to a combination that is "legal"? At least
>>>>> as long as ...
>>>>>
>>>>>> Instead, failing the set_policy() call is an option, and is what we want
>>>>>> to do longterm,
>>>>> ... we aren't there.
>>>>>
>>>>>> but also happens to be wrong too in this case. An admin
>>>>>> may know that a VM isn't using retpoline, and may need to migrate it
>>>>>> anyway for a number of reasons, so any safety checks need to be in the
>>>>>> toolstack, and need to be overrideable with something like --force.
>>>>> Possibly leading to an inconsistent policy exposed to a guest? I
>>>>> guess this may be the only option when we can't really resolve an
>>>>> ambiguity, but that isn't the case here, is it?
>>>> Wrong.  Xen does not have any knowledge of other hosts the VM might
>>>> migrate to.
>>>>
>>>> So while Xen can spot problem combinations *on this host*, which way to
>>>> correct the problem combination depends on where the VM might migrate to.
>>> I actually view this as two different levels: With a flawed policy, the
>>> guest is liable to not work correctly at all. No point thinking about
>>> it being able to migrate. With a fixed up policy it may fail to migrate,
>>> but it'll at least work otherwise.
>> If you get RSBA and/or RRSBA wrong, nothing is going to malfunction in
>> the guest, even if you migrate it.
>>
>> The consequence of getting RSBA and/or RRSBA wrong is the guest *might*
>> think retpoline is safe to use, and *might* end up vulnerable to
>> speculative attacks on this or other hardware.
> Isn't that some sort of "malfunction"?

Perhaps, but there's a difference between "it will likely crash hard" and
"you won't notice the difference".

>
>> And the admin might know that they overrode the default settings and
>> forced the use of some other protection mechanism, so the guest is in
>> fact safe despite having wrong RSBA/RRSBA settings.
> But then the guest would also be safe with adjusted settings, wouldn't it?

It doesn't mean the guest is going to tolerate having features change
underfoot.

>
>> I don't know how to put it any more plainly.  Xen *does not* have the
>> information necessary to make a safety judgement in this matter.  Only
>> the toolstack (as a proxy for the admin) has the necessary information.
> I'm not looking at it as Xen making things safe by adjusting bogus
> settings. I'm merely looking at it as not letting a guest run that way.
> For the safety aspect I agree it needs a wider view than Xen has.
>
> Anyway, I don't think either of us is going to convince the other of
> there being only one way of looking at things vs there being at least
> two possible ways, so in order to allow things to progress
> Acked-by: Jan Beulich <jbeulich@xxxxxxxx>

Thank you.

To be clear, we are planning to put checks in place.  We definitely
don't want the admin to end up in this corner case accidentally.

~Andrew
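
For readers following the thread, the derivation rules described in the
quoted commit message, and the kind of toolstack-side check discussed
above, can be sketched roughly as below.  This is an illustrative
sketch only, not the code from the patch: struct spec_bits and every
function name here are hypothetical, the IBRSB dependency is omitted,
and the "force" flag merely stands in for an admin override such as
--force.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the three bits under discussion. */
struct spec_bits {
    bool eibrs;
    bool rsba;
    bool rrsba;
};

/* Feature dependency from the commit message: no RRSBA without EIBRS. */
static struct spec_bits apply_deps(struct spec_bits p)
{
    if ( !p.eibrs )
        p.rrsba = false;
    return p;
}

/* Max policy: both bits set unconditionally; the dependency may hide RRSBA. */
static struct spec_bits derive_max(bool eibrs_visible)
{
    struct spec_bits p = { .eibrs = eibrs_visible, .rsba = true, .rrsba = true };

    return apply_deps(p);
}

/*
 * Default policy: if the host suffers from any form of RSBA, expose exactly
 * one bit, chosen by whether eIBRS is visible to the guest.
 */
static struct spec_bits derive_default(struct spec_bits host, bool eibrs_visible)
{
    struct spec_bits p = { .eibrs = eibrs_visible };

    if ( host.rsba || host.rrsba )
    {
        if ( p.eibrs )
            p.rrsba = true;
        else
            p.rsba = true;
    }

    assert(!(p.rsba && p.rrsba)); /* A guest shouldn't see both bits. */
    return p;
}

/* Assumed notion of a consistent tuple: not both bits, no RRSBA without EIBRS. */
static bool tuple_is_consistent(struct spec_bits p)
{
    return !(p.rsba && p.rrsba) && (!p.rrsba || p.eibrs);
}

/* Hypothetical toolstack check: reject odd tuples unless the admin overrides. */
static bool toolstack_accepts(struct spec_bits p, bool force)
{
    return force || tuple_is_consistent(p);
}
```

Note that the check lives outside the derivation on purpose: the sketch
never rewrites a tuple it is handed, it only accepts or rejects it,
which mirrors the position that Xen should not edit what the toolstack
provides.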
