
Re: [PATCH v2 2/3] x86/spec-ctrl: Fix up the RSBA/RRSBA bits as appropriate


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 5 Jun 2023 09:48:14 +0200
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 05 Jun 2023 07:48:23 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 02.06.2023 15:57, Andrew Cooper wrote:
> On 02/06/2023 10:56 am, Jan Beulich wrote:
>> On 01.06.2023 16:48, Andrew Cooper wrote:
>>> @@ -593,15 +596,85 @@ static bool __init retpoline_calculations(void)
>>>          return false;
>>>  
>>>      /*
>>> -     * RSBA may be set by a hypervisor to indicate that we may move to a
>>> -     * processor which isn't retpoline-safe.
>>> +     * The meaning of the RSBA and RRSBA bits have evolved over time.  The
>>> +     * agreed upon meaning at the time of writing (May 2023) is thus:
>>> +     *
>>> +     * - RSBA (RSB Alternative) means that an RSB may fall back to an
>>> +     *   alternative predictor on underflow.  Skylake uarch and later all have
>>> +     *   this property.  Broadwell too, when running microcode versions prior
>>> +     *   to Jan 2018.
>>> +     *
>>> +     * - All eIBRS-capable processors suffer RSBA, but eIBRS also introduces
>>> +     *   tagging of predictions with the mode in which they were learned.  So
>>> +     *   when eIBRS is active, RSBA becomes RRSBA (Restricted RSBA).
>>> +     *
>>> +     * - CPUs are not expected to enumerate both RSBA and RRSBA.
>>> +     *
>>> +     * Some parts (Broadwell) are not expected to ever enumerate this
>>> +     * behaviour directly.  Other parts have differing enumeration with
>>> +     * microcode version.  Fix up Xen's idea, so we can advertise them safely
>>> +     * to guests, and so toolstacks can level a VM safety for migration.
>>> +     *
>>> +     * The following states exist:
>>> +     *
>>> +     * |   | RSBA | EIBRS | RRSBA | Notes              | Action        |
>>> +     * |---+------+-------+-------+--------------------+---------------|
>>> +     * | 1 |    0 |     0 |     0 | OK (older parts)   | Maybe +RSBA   |
>>> +     * | 2 |    0 |     0 |     1 | Broken             | +RSBA, -RRSBA |
>>> +     * | 3 |    0 |     1 |     0 | OK (pre-Aug ucode) | +RRSBA        |
>>> +     * | 4 |    0 |     1 |     1 | OK                 |               |
>>> +     * | 5 |    1 |     0 |     0 | OK                 |               |
>>> +     * | 6 |    1 |     0 |     1 | Broken             | -RRSBA        |
>>> +     * | 7 |    1 |     1 |     0 | Broken             | -RSBA, +RRSBA |
>>> +     * | 8 |    1 |     1 |     1 | Broken             | -RSBA         |
>>>       *
>>> +     * However, we doesn't need perfect adherence to the spec.  Identify the
>>> +     * broken cases (so we stand a chance of spotting violated assumptions),
>>> +     * and fix up Rows 1 and 3 so Xen can use RSBA || RRSBA to identify
>>> +     * "alternative predictors potentially in use".
>> Considering that it's rows 2, 6, 7, and 8 which are broken, I find this
>> comment a little misleading. To me it doesn't become clear whether them
>> subsequently being left alone (and merely a log message issued) is
>> intentional.
> 
> It is intentional.
> 
> I don't know if these combinations exist in practice anywhere or not. 
> Intel think they oughtn't to, and it's quite possible that the printk()
> is unreachable, but given the complexity and shifting meanings over time
> here, I think it would be unwise to simply assume this to be true.

I agree.

> But at the same time, if it is an unreachable code, it would be equally
> unwise to have a load of fixup code which we can't test.  I've still got
> the fixup code in a separate patch incase we need to put it back in.

Iirc the fixup code you had wasn't really "a load of code". The thing,
though, is: if such a combination did exist, then going by our history we'd
be at least on the edge of needing to issue an XSA along with adding the
missing fixup code. From all I can tell, that risk would be lower if we had
that fixup code in place: it might well turn out to be correct as-is.

Nevertheless, if you decide to leave out any fixup, may I ask that you
say so very explicitly in the comment?
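
For the record, a minimal sketch of the kind of fixup I have in mind
(purely illustrative, not your dropped patch; it merely follows the Action
column of the table above, using the existing setup_force_cpu_cap() /
setup_clear_cpu_cap() helpers):

    /* Hypothetical fixup of the "Broken" rows (2, 6, 7 and 8). */
    if ( cpu_has_eibrs )
    {
        /* Rows 7 and 8: RSBA isn't supposed to be seen alongside eIBRS. */
        if ( cpu_has_rsba )
            setup_clear_cpu_cap(X86_FEATURE_RSBA);
        /* Row 7's +RRSBA is already covered by the eIBRS path below. */
    }
    else if ( cpu_has_rrsba )
    {
        /* Rows 2 and 6: RRSBA is meaningless without eIBRS. */
        setup_clear_cpu_cap(X86_FEATURE_RRSBA);

        /* Row 2: the part presumably meant to enumerate plain RSBA. */
        if ( !cpu_has_rsba )
            setup_force_cpu_cap(X86_FEATURE_RSBA);
    }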

>>> +     */
>>> +    if ( cpu_has_eibrs ? cpu_has_rsba  /* Rows 7, 8 */
>>> +                       : cpu_has_rrsba /* Rows 2, 6 */ )
>>> +        printk(XENLOG_ERR
>>> +               "FIRMWARE BUG: CPU %02x-%02x-%02x, ucode 0x%08x: RSBA %u, EIBRS %u, RRSBA %u\n",
>>> +               boot_cpu_data.x86, boot_cpu_data.x86_model,
>>> +               boot_cpu_data.x86_mask, ucode_rev,
>>> +               cpu_has_rsba, cpu_has_eibrs, cpu_has_rrsba);
>>> +
>>> +    /*
>>>       * Processors offering Enhanced IBRS are not guarenteed to be
>>>       * repoline-safe.
>>>       */
>>> -    if ( cpu_has_rsba || cpu_has_eibrs )
>>> +    if ( cpu_has_eibrs )
>>> +    {
>>> +        /*
>>> +         * Prior to the August 2023 microcode, many eIBRS-capable parts did
>>> +         * not enumerate RRSBA.
>>> +         */
>>> +        if ( !cpu_has_rrsba )
>>> +            setup_force_cpu_cap(X86_FEATURE_RRSBA);
>>> +
>>> +        return false;
>>> +    }
>> No clearing of RSBA in this case? I fear we may end up with misbehavior if
>> our own records aren't kept consistent with our assumptions. (This then
>> extends to the "just a log message" above as well.)
> 
> Well quite, which is why I've gone to lengths to state what our
> assumptions are.
> 
> Right now, there is nothing in Xen itself where RSBA vs RRSBA matters. 
> Until this patch, we don't even have cpu_has_rrsba, and remember that
> Xen was not vulnerable to CVE-2022-29901 (Intel Retbleed) because we
> chose to use the microcode IBRS implementation on early Skylake, rather
> than hope that Retpoline was safe enough and go with the faster option.
> 
> 
> In v1, having RSBA and RRSBA (working as I thought they were supposed to
> work) *did* matter for the default cpu-policy derivation to work nicely.
> 
> But that was invalidated by the clarification to say that RSBA and RRSBA
> should never be seen together, which in turn completely changed the
> derivation logic.
> 
> In v2, it doesn't matter if Xen ends up seeing both RSBA and RRSBA.  It
> explicitly can cope (by treating them the same WRT Retpoline), and the
> derivation logic now calculates both completely from scratch (and based
> on RSBA || RRSBA).

Like above, may I ask that you say so explicitly in the / a comment right
here?
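
Something along these lines would suffice imo (wording and placement are
merely a suggestion; the condition simply summarises the combined effect of
the eIBRS and RSBA paths in this function):

    /*
     * Xen itself only ever cares about RSBA || RRSBA, i.e. "alternative
     * predictors potentially in use".  Hence, while the two bits aren't
     * supposed to be seen together, nothing breaks if they are: both are
     * treated the same as far as Retpoline is concerned, and the guest
     * policy derivation recalculates them from scratch anyway.
     */
    if ( cpu_has_rsba || cpu_has_rrsba )
        return false;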

>> Also the inner conditional continues to strike me as odd; could you add
>> half a sentence to the comment (or description) as to it being meant to
>> leave is_forced_cpu_cap() functioning correctly (which in turn raises the
>> question whether - down the road - this is actually going to matter)?
> 
> Look at the single user of is_forced_cpu_cap().
> 
> I am not micro-optimising a single branch out of the init section on the
> blind hope that the contradictory behaviour it creates won't matter in
> the future.  Every forced cap is an abnormal case, and it's almost
> certainly my future time which will be spent unravelling the
> contradictory behaviour when it comes back to bite.

My request isn't about optimization at all, but about an apparent pattern
of unnecessary redundancy (the elimination of a branch, and hence some tiny
bit of optimization, being only a side effect of removing it). But if
you're sure this is going to be obvious to everyone but me, I'm not going
to insist.
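
To spell out the redundancy I was referring to (illustration only):

    /* As the patch has it: force the synthetic bit only when not already set. */
    if ( !cpu_has_rrsba )
        setup_force_cpu_cap(X86_FEATURE_RRSBA);

    /* The unconditional form I'd have found more natural: */
    setup_force_cpu_cap(X86_FEATURE_RRSBA);

As I understand it, the observable difference is confined to
is_forced_cpu_cap(): with the unconditional form, a part which already
enumerates RRSBA would also report the bit as forced.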

>>> +    /*
>>> +     * RSBA is explicitly enumerated in some cases, but may also be set by a
>>> +     * hypervisor to indicate that we may move to a processor which isn't
>>> +     * retpoline-safe.
>>> +     */
>>> +    if ( cpu_has_rsba )
>>>          return false;
>>>  
>>> +    /*
>>> +     * At this point, we've filtered all the legal RSBA || RRSBA cases (or the
>>> +     * known non-ideal cases).  If ARCH_CAPS is visible, trust the absence of
>>> +     * RSBA || RRSBA.  There's no known microcode which advertises ARCH_CAPS
>>> +     * without RSBA or EIBRS, and if we're virtualised we can't rely the model
>>> +     * check anyway.
>>> +     */
>> I think "no known" wants further qualification: When IBRSB was first
>> introduced, EIBRS and RSBA weren't even known about. I also didn't
>> think all hardware (i.e. sufficiently old one) did get newer ucode
>> when these started to be known. Possibly you mean "... which wrongly
>> advertises ..."?
> 
> ARCH_CAPS equally didn't exist originally.  ARCH_CAPS, RSBA and EIBRS all
> appeared together - see how they're bits 0 and 1 in the MSR.  RRSBA on
> the other hand is bit 19, which gives you some idea of how recent it is.

Hmm, yes, I see.

Jan



 

