
Re: Intended behavior/usage of SSBD setting


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 24 Oct 2022 12:40:58 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Delivery-date: Mon, 24 Oct 2022 10:41:22 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 24.10.2022 11:32, Roger Pau Monné wrote:
> On Mon, Oct 24, 2022 at 08:45:07AM +0200, Jan Beulich wrote:
>> On 21.10.2022 23:54, Andrew Cooper wrote:
>>> On 20/10/2022 12:01, Roger Pau Monné wrote:
>>>> Hello,
>>>>
>>>> As part of some follow up improvements to my VIRT_SPEC_CTRL series we
>>>> have been discussing what the usage of SSBD should be for the
>>>> hypervisor itself.  There's currently a `spec-ctrl=ssbd` option [0],
>>>> that has an out of date description, as now SSBD is always offered to
>>>> guests on AMD hardware, either using SPEC_CTRL or VIRT_SPEC_CTRL.
>>>>
>>>> It has been pointed out by Andrew that toggling SSBD on AMD using
>>>> VIRT_SPEC_CTRL or the non-architectural way (MSR_AMD64_LS_CFG) can
>>>> have a high impact on performance, and hence switching it on every
>>>> guest <-> hypervisor context switch is likely a very high
>>>> performance penalty.
>>>>
>>>> It's been suggested that it could be more appropriate to run Xen with
>>>> the guest SSBD selection on those systems, however that clashes with
>>>> the current intent of the `spec-ctrl=ssbd` option.
>>>>
>>>> I hope I have captured the expressed opinions correctly in the text
>>>> above.
>>>>
>>>> I see two ways to solve this:
>>>>
>>>>  * Keep the current logic for switching SSBD on guest <-> hypervisor
>>>>    context switch, but only use it if `spec-ctrl=ssbd` is set on the
>>>>    command line.
>>>>
>>>>  * Remove the logic for switching SSBD on guest <-> hypervisor context
>>>>    switch, ignore setting of `spec-ctrl=ssbd` on those systems and run
>>>>    hypervisor code with the guest selection of SSBD.
>>>>
>>>> This has raised the question of whether there's a use case
>>>> for always running hypervisor code with SSBD enabled, or whether that's no
>>>> longer relevant if we always offer guests a way for them to toggle the
>>>> setting when required.
>>>>
>>>> I would like to settle on a way forward, so we can get this fixed
>>>> before 4.17.
>>>>
>>>> Thanks, Roger.
>>>>
>>>> [0] 
>>>> https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#spec-ctrl-x86
>>>
>>> There are many issues at play here.  Not least that virt spec ctrl is
>>> technically a leftover task that ought to force a re-issue of XSA-263.
>>>
>>> Accessing MSRs (even reading) is very expensive, typically >1k cycles. 
>>> The core CFG registers are more expensive than most, because they're
>>> intended to be configured once after reset and then left alone.
>>>
>>> Throughout the speculation work, we've seen crippling performance hits
>>> from accessing MSRs in fastpaths.  The fact we're forced to use MSRs in
>>> fastpaths even on new CPUs with built in (rather than retrofitted)
>>> speculation support is an area of concern still being worked on with
>>> the CPU vendors.
>>>
>>> Case in point.  We found for XSA-398 that toggling AMD's
>>> MSR_SPEC_CTRL.IBRS on the PV entrypath was so bad that setting it
>>> unilaterally behind the back of PV guests was the faster option. 
>>> (Another todo is to stop doing this on Intel eIBRS systems, and this
>>> will recover us a decent chunk of performance.)
>>>
>>>
>>> SSBD mitigations are (rightly or wrongly) off by default for performance
>>> reasons.  AMD are less affected than Intel, for microarchitectural
>>> reasons which are discussed in relevant whitepapers, and which are
>>> expected to remain true for future CPUs.
>>>
>>> When Xen doesn't care about protecting itself against SSBD by
>>> default, I guarantee you that it will be faster to omit the MSR accesses
>>> and run in the guest kernel's choice, than to clear the SSBD
>>> protection.  We simply don't spend long enough in the hypervisor for the
>>> hit against memory accesses to dwarf the hit for MSR accesses taken on
>>> entry/exit.
>>>
>>> The reason we put in spec-ctrl=ssbd was as a stopgap, because at the
>>> time we didn't know how bad SSB really was, and it was decided that the
>>> admin should have a big hammer to use if they really needed.
>>>
>>> When Xen does care about protecting itself, the above reasoning bites
>>> back hard.  Because we spend (or should be spending!) >99% of time in
>>> the guest, the hit to memory accesses is far more likely to be able to
>>> dwarf the hit from the MSR accesses, but now, the dominating factor for
>>> performance is the vmexit rate.
>>>
>>> The problem is that if you've got a completely compute bound workload,
>>> there are very few exits, while if you've got an IO bound workload,
>>> there are plenty of exits.  I honestly don't know if it will be more
>>> efficient to leave SSBD active unilaterally (whether or not we hide
>>> this, e.g. synthesizing SSB_NO), or to let the guest run with its kernel's
>>> choice.  I suspect the answer is different with different workloads.
>>>
>>>
>>> But, one other factor helps us.  Given that the default is fast (rather
>>> than secure), anyone opting in to spec-ctrl=ssbd is saying "I care more
>>> about security than performance", at which point we can simplify what we
>>> do because we don't need to cater to everyone.
>>>
>>>
>>> As a slight tangent, there is a cost to having too many options, which
>>> must not be ignored.  Xen's speculation safety is far too complicated
>>> already and needs to get more simple; this has a material impact on how
>>> easy it is to follow, and how easy it is to make changes.
>>>
>>> It is the way it is because we've had 6 years of drip feeding one
>>> problem after another, and haven't had the time to take a step back and
>>> design something more sensible from having 6 years of
>>> knowledge/learnings as a basis.  There are definitely things which I
>>> would have done differently, if 6 years ago, I'd known what I know now,
>>> and part of the reason why the recent speculation security work has
>>> taken so much effort is because it has involved reworking the effort
>>> which came before, to a deadline which never has enough time to plan
>>> properly within.
>>>
>>>
>>> So, first question, do we care about having an "SSBD active while in
>>> Xen" mode?
>>>
>>> Probably yes, because we a) still don't have a working solution for PV
>>> guests on AMD and b) who knows if there's something far worse lurking in
>>> the future.  Sod's law says that if we decide no here, it will be
>>> critical for some future issue.
>>>
>>> But as it's off by default and no one has made any noise about
>>> having it on, we ought to prioritise simplicity.
>>>
>>> Given that off is the default, but we know that kernels do offer it to
>>> userspace, and it does get used by certain processes, we need to
>>> prioritise performance.  And here, this is net system performance, not
>>> "ensure it's off whenever it can be".  Having Xen run in the guest
>>> kernel's choice of value will result in much better overall performance,
>>> than trying to modify the setting in the VMentry/exit path.
>>
>> My takeaway from this reply of yours is: By default run with the guest's
>> choice, while (I'm less certain here) you're undecided about the behavior
>> with "spec-ctrl=ssbd". Please could you make explicit whether this is a
>> correct understanding of mine?
> 
>  * spec-ctrl=ssbd -> SSBD always on, expose VIRT_SSBD
>    (VIRT_SPEC_CTRL.SSBD) but guest setting won't be propagated to
>    platform.  As a future improvement also expose SSB_NO in that
>    case.
> 
>  * spec-ctrl=no-ssbd -> Run hypervisor code with guest SSBD selection
>    depending on hardware support.
> 
> Default to `spec-ctrl=no-ssbd`.
> 
> Would that be accurate?

This matches my view, yes.

Jan
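
For reference, the two modes agreed above can be written down as a small
decision table.  This is only a minimal sketch in C under stated assumptions:
every name in it (ssbd_opt, ssbd_policy, ssbd_policy_for, effective_ssbd) is
hypothetical and is not a Xen identifier, and the "depending on hardware
support" condition and the future SSB_NO exposure are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these are Xen identifiers. */
enum ssbd_opt {
    OPT_NO_SSBD,  /* spec-ctrl=no-ssbd, the proposed default */
    OPT_SSBD      /* spec-ctrl=ssbd */
};

struct ssbd_policy {
    bool force_on;         /* SSBD stays set regardless of the guest */
    bool propagate_guest;  /* guest's selection reaches the platform */
};

/* Map the command line choice onto the behaviour summarised in the thread. */
static struct ssbd_policy ssbd_policy_for(enum ssbd_opt opt)
{
    /* no-ssbd: hypervisor code runs with whatever the guest selected. */
    struct ssbd_policy p = { false, true };

    if (opt == OPT_SSBD) {
        /*
         * ssbd: always on.  VIRT_SSBD is still exposed to the guest, but
         * the guest's setting is not propagated to the platform.
         */
        p.force_on = true;
        p.propagate_guest = false;
    }

    return p;
}

/* SSBD state in effect (in guest and in Xen) for a given guest selection. */
static bool effective_ssbd(const struct ssbd_policy *p, bool guest_ssbd)
{
    assert(p != NULL);
    return p->force_on || (p->propagate_guest && guest_ssbd);
}
```

In this model neither mode changes the setting on the guest <-> hypervisor
switch: with no-ssbd the effective value simply follows the guest, and with
ssbd it is constant, which is the point of avoiding MSR writes on the
entry/exit path.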



 

