[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/2] x86/svm: Keep the RAS balanced for guests


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Date: Fri, 12 Aug 2022 10:04:45 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=9PX1POnGB790/9++cgaZ2LtwGneW0aZMCffnbsD+9gw=; b=TbAWan7pf7e5GiSm2wtj5edAJlzYwB13fA0Up2vFhBr/sPvyHMLl+zSPRtKl5rSVQPN2N5NEQuC3J4sNkgTId4S0T3RiGllNxQFGjhgT2iZUIr/7Omv99pOoKkUKz+9ifBF2QJA6vG2gofgie+wKMBvTcSCGNCTOkuKhVfTvPP/JCr4rw3wD3MYYRm1KUvCpXjC4T1kBio9tcyHdGbyRlgBiSXxUxKb8LgJYHpIAQCvjYLyJnEYf9Xk2t/weUMuOb4rcJXoSCb4t3axDvd7jT/TO6GqgSn9rZp/db4NSWoh0WR2OP6fgLPnsfTyw04sk0iJe90J9B3SPH6JE/3Hv4g==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=cVYJ5bSBi+X2O8T7jsndeO5sGxRkUhOfQEr72+hs2Hewqxi5HyYp2VojXQDWvSgVpDkN7ZbDybYQqFl3i8/QARh69ZnsAeeTylPgbybsgJKCRgBO/iAtDgIcFH5HdUB2N9dArHivb/dM6XOHxA0ANBjfvwwvG696KbRLOjrNXS4LhVJ/GqcKjbjxVyyLWb7qBQwaRnRhQQPHYXaINqFyho0esQOZ2F+zMDwX0F8PHF2KxOaLGxwuH6aNcKZczPCVTmRfOpE6vhNVtDivuWX7QozI7Dd1mjh5Pgqf6JAMUQmg9hl6oZ7p1exk/LO6V22k0NSUjSfzg0OlhcsUlgJQWw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 12 Aug 2022 10:05:01 +0000
  • Ironport-data: A9a23:+N7ibKML9zbkBpzvrR29lsFynXyQoLVcMsEvi/4bfWQNrUoigmACm mVOXj+DM6yMamOnf9gjaojj/U0B68LUn9dkGgto+SlhQUwRpJueD7x1DKtR0wB+jCHnZBg6h ynLQoCYdKjYdleF+lH3dOCJQUBUjcmgXqD7BPPPJhd/TAplTDZJoR94kqsyj5UAbeKRWmthg vuv5ZyEULOZ82QsaDhMu/vZ8EoHUMna41v0gHRvPZing3eG/5UlJMp3Db28KXL+Xr5VEoaSL woU5Ojklo9x105F5uKNyt4XQGVTKlLhFVHmZk5tc7qjmnB/Shkaic7XAha+hXB/0F1ll/gpo DlEWAfZpQ0BZsUgk8xFO/VU/r0X0QSrN9YrLFDm2fF/wXEqfFOx4s9RAWNqercCpN19RkJgz sEnMDYkO0Xra+KemNpXS8FKr+F6dozHGd1avXttizbEEfwhXJbPBb3Q4sNV1ysxgcYIGuvCY 80eanxkaxGojx9nYw9LTs5h2rr5wCChI1W0q3rMzUYzy0HVwBZ8z/7GN93Nd8bRbc5UglyZt iTN+GGR7hQya4POlWTZoijEaunnkiLbSplMP4OC09Uxo3+0nE0+DRcZfA7uyRW+ogvkMz5FE GQW8Cczqak59GSwU8LwGRa/pRasoRo0S9dWVeog52ml2qfSpgqUGGUAZjpAc8A98t87QyQw0 V2ElM+vAiZg2IB5UlqY/7aQ6D+3Zy4cKDZYYTdeFFVVpd7+vIs0kxTDCM55F7K4hcH0Hje2x C2WqC85hPMYistjO7iHwG0rSgmE/vDhJjPZLC2OBApJMisRiFaZWrGV
  • Ironport-hdrordr: A9a23:LqPa1K0ouw/0o0UyxtUWfAqjBZpxeYIsimQD101hICG9Lfb0qy n+pp4mPEHP4wr5AEtQ4uxpOMG7MBDhHQYc2/hdAV7QZnidhILOFvAv0WKC+UyrJ8SazIJgPM hbAs9D4bHLbGSSyPyKmDVQcOxQj+VvkprY49s2pk0FJW4FV0gj1XYBNu/xKDwVeOAyP+tcKH Pq3Lsjm9PPQxQqR/X+IkNAc/nIptXNmp6jSRkaByQ/4A3LoSK05KX8Gx242A5bdz9U278t/U XMjgS8v8yYwrCG4y6Z81WWw4VdmdPnxNcGLMuQivINIjGpphe0aJ9nU7iiuilwhO208l4lnP TFvh9lFcVu7HH6eH2zvHLWqkfd+Qdrz0Wn5U6TgHPlr8C8bik9EdB9iYVQdQacw1Y8vflnuZ g7nF6xht5yN1ftjS7979/HW1VBjUyvu0cvluYVkjh2TZYeUrlMtoYSlXklUqvoXRiKrbzPIt MeS/0018wmN29yqEqp51WH9ebcGkjb2C32GnTq9PbliAS+10oJsnfwjPZv4kvosqhNC6Wsrt 60TJiB3tt1P7ArRLM4C+EbTcStDGvRBRrKLWKJOFziULoKInTXtvfMkfwIDcyRCes1JaEJ6e L8eUIdsXR3d1PlCMWI0pEO+hfRQH+lVTCozs1F/ZB2trD1WbKuaES4ORsTutrlp+9aDtzQWv 61Np4TC/j/LXH2EYIM2wHlQZFdJXQXTcVQsNcmXFCFpN7NN+TRx6TmWeeWIKCoHScvW2v5DH dGVD/vJN9Y5kTuQXP8iAi5YQKYRqU+x+MELEH3xZli9GFWDPw8juE8syXL2uibbTtfr6cxYE xyZLv6j6LTnxjFwVr1
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHYrbzfrT+bJPIlQEuwk4UCdkz8u62q3u4AgAArfAA=
  • Thread-topic: [PATCH 2/2] x86/svm: Keep the RAS balanced for guests

On 12/08/2022 08:29, Jan Beulich wrote:
> On 11.08.2022 21:59, Andrew Cooper wrote:
>> One source of lost performance was that fact that to protect Xen from a
>> malicious guests, we had to flush the RAS.
>>
>> It turns out that CET Shadow Stacks give us enough architectural guarantees 
>> to
>> construct a lower overhead mitigation, which keeps the RAS balanced for the
>> guest so their return performance is still good.
>>
>> To keep the RAS balanced, Xen must execute the same number of CALLs as RETs
>> across one VMexit->VMEntry.  Without CET-SS, we could achieve this fairly
>> easily with a `call; add $8, %rsp` and `push; ret` pair, but this is not 
>> legal
>> under CET-SS.  In fact, CALL is the only shadow stack "push" operation we
>> have, and we can't use it a second time if we intend to keep the RAS 
>> balanced.
>>
>> Instead, we keep a real return address on the stack.  This means that for 
>> some
>> of entry.S, %rsp conditionally doesn't reference CPUINFO.
>>
>> This necessitates swapping the current order of DO_OVERWRITE_RSB and
>> svm_vmexit_spec_ctrl; while they don't have any specific ordering
>> requirements, push_one_ras needs to come after svm_vmexit_spec_ctrl or else 
>> we
>> need some very invasive changes to fix up the %rsp changes.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> CC: Wei Liu <wl@xxxxxxx>
>>
>> RFC for a couple of reasons.  This does function correctly, but I still want
>> to do more perf testing.
> As per further down you mean to say it functions correctly without the
> use of alternatives. And even then (see below) I suppose it doesn't
> function correctly with no (or unused) CET-SS but CONFIG_XEN_SHSTK=y.

I came to realise that after sending that, until we get nested
alternatives, I can't make it function correctly for the
conditionally-not-CET-SS case without doing the full mov/rdsspq/cmp
sequence.

While that's possible, I'm not inclined to add the size overhead for a
case I'm not sure is safe.

So, the alternative to turn this on is going to strictly depend on
CONFIG_XEN_SHSTK.

At some point when nested alternatives appear, we can relax the
requirement, if there is a desperate wish for the perf improvement in
uncertain safety conditions.

>
>> Secondly, X86_FEATURE_ALWAYS is clearly not ok for committing.  I'm still
>> debating whether to make this construct available in !CET-SS cases.
>> Mechanically, its fine, but the safety arguments depend on CET-SS being
>> active.
> I'm afraid it's not entirely clear what you mean here, nor why you've used
> X86_FEATURE_ALWAYS in the first place when we have X86_FEATURE_XEN_SHSTK.

It can't depend on X86_FEATURE_XEN_SHSTK because, like the PBRSB case on
Intel, I want a way for the user to switch back to stuff32 if it
subsequently turns out that I've screwed up in the safety reasoning.

> If "this construct" is push_one_ras, then the mere use of WRSSQ requires
> it to not be used based on a runtime characteristic, not just a build
> time one. Hence afaict you could as well put the entire macro body in the
> #ifdef that currently encloses only the CET-SS insns.

What I mean is that "this" is faster than stuff32, and might be
"acceptably safe" for someone either on BTC-vunerable hardware, or newer
hardware but with CET-SS explicitly turned off.

That said...

>> In principle, on CPUs which do not suffer Branch Type Confusion, you might be
>> able to reason a defence-in-depth argument that if an attacker can't control
>> indirect speculation, then they can't bypass the 1-stuff safety either, but
>> the only AMD CPUs not vulnerable to BTC have CET-SS anyway.
> Yet people may have reasons to turn off its use.

... I view this as a perf improvement only.

As such, I'm entirely happy to say that it's only available in the case
which is easy to do, ought to be the common case, and is the case which
I'm confident is safe.

If people want it in other cases too, they can see about helping nested
alternatives along...

>> Third, I'd like some early feedback on how clear it the logic is given the
>> conditional nature of %rsp not referencing CPUINFO.
> It's assembly code, touching of which needs extra care. Taking together
> the size of the entire file (quite small) and the comments you add, I'd
> say that's fine.

That's good.  It turned out to be far less invasive than I feared.

The major observation which allowed for this was that it doesn't matter
trying to keep the RAS balanced immediately after schedule, so we don't
need to play games with reset_stack_and_jump() to try and execute one
fewer CALL.

>> Fourth, the alternatives logic (I think) needs improving to not fix up a
>> direct CALL/JMP displacement if the destination is within the replacement
>> length.  I did the functional testing before wrapping things in alternatives.
> Yes, unless you want to prefix the CALL with a redundant insn prefix to
> hide it from the displacement adjustment logic.

Hmm, that should work, but is ugly.

Honestly, I think I'd prefer to take the take the time to start doing a
boot time self-test for alternatives, as pert the SMC testing
improvements, because we desperately need this for so many other reasons
too.

~Andrew

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.