[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 2/3] x86/spec-ctrl: Fix up the RSBA/RRSBA bits as appropriate


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 2 Jun 2023 11:56:08 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=wyZhtGTJFTtSZ8JXB4wcO9Bifjrd/68+N7yM/LGYVp4=; b=aIT1AYFy/LAw8JN0Om65KfZuP3U8HUnZM4zz+x9JP5o/Lws2jfRhRjxx2kN7tSZyAbdgAwDsKiTmg8MEg9UfzY+jjLCpJhMwvN+ue8YVU1w/GfIqjzozBorkzlXOk7EjUD9ZIEZYbT4YRHtcs1ajSid/UmBYwakJMeV0U6+Abwf2kvYkZ2LA6Z00bffsLd8LvG8uZt8PgYOcc5I9Rpf43I0Z9AGHBkgAcdCQW/aH9OhyelUEYPt0spu8FIrvw4mspYETJLVqmTaEILmJ+6Em3y7sBFiGiUsFZP7HN8VWoWB09r9nsAl70DLpz9eNPWzdQydPpqISPJh7a8oXNqlVQQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=RiOATDrUcPNCpTD2XUZbXp2pjkAc2Og05digDQLo6W2CnwKnG790VkqRR32aFYgdxGlpXLg5BmAstxOCJYf3mcbpv+Ur75KeKHn3UmlO2VKIuAsC4tKeerc6YoEe/S2hskTJj48NCKacsCSGWOJWh4kVUHJxcFl9R40+JUFgkBMesyCctPs5Kd4J52gLaPoCdTKLmTkOegoV5C+aMon5BoDI+ZHuxlC7DVAdd2QTq88GoXmlk8D7eil2QcIG83tCNARFOs0THgdEiMNLM39lZ8ckBV+W6ODAk/6y3cw0t6afWJhyVeTX91n2YpY4oC+ZTS8GK1RnaYdhKh5vTAtwrA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 02 Jun 2023 09:56:28 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01.06.2023 16:48, Andrew Cooper wrote:
> @@ -593,15 +596,85 @@ static bool __init retpoline_calculations(void)
>          return false;
>  
>      /*
> -     * RSBA may be set by a hypervisor to indicate that we may move to a
> -     * processor which isn't retpoline-safe.
> +     * The meaning of the RSBA and RRSBA bits have evolved over time.  The
> +     * agreed upon meaning at the time of writing (May 2023) is thus:
> +     *
> +     * - RSBA (RSB Alternative) means that an RSB may fall back to an
> +     *   alternative predictor on underflow.  Skylake uarch and later all 
> have
> +     *   this property.  Broadwell too, when running microcode versions prior
> +     *   to Jan 2018.
> +     *
> +     * - All eIBRS-capable processors suffer RSBA, but eIBRS also introduces
> +     *   tagging of predictions with the mode in which they were learned.  So
> +     *   when eIBRS is active, RSBA becomes RRSBA (Restricted RSBA).
> +     *
> +     * - CPUs are not expected to enumerate both RSBA and RRSBA.
> +     *
> +     * Some parts (Broadwell) are not expected to ever enumerate this
> +     * behaviour directly.  Other parts have differing enumeration with
> +     * microcode version.  Fix up Xen's idea, so we can advertise them safely
> +     * to guests, and so toolstacks can level a VM safety for migration.
> +     *
> +     * The following states exist:
> +     *
> +     * |   | RSBA | EIBRS | RRSBA | Notes              | Action        |
> +     * |---+------+-------+-------+--------------------+---------------|
> +     * | 1 |    0 |     0 |     0 | OK (older parts)   | Maybe +RSBA   |
> +     * | 2 |    0 |     0 |     1 | Broken             | +RSBA, -RRSBA |
> +     * | 3 |    0 |     1 |     0 | OK (pre-Aug ucode) | +RRSBA        |
> +     * | 4 |    0 |     1 |     1 | OK                 |               |
> +     * | 5 |    1 |     0 |     0 | OK                 |               |
> +     * | 6 |    1 |     0 |     1 | Broken             | -RRSBA        |
> +     * | 7 |    1 |     1 |     0 | Broken             | -RSBA, +RRSBA |
> +     * | 8 |    1 |     1 |     1 | Broken             | -RSBA         |
>       *
> +     * However, we doesn't need perfect adherence to the spec.  Identify the

Nit: "don't" or "it"?

> +     * broken cases (so we stand a chance of spotting violated assumptions),
> +     * and fix up Rows 1 and 3 so Xen can use RSBA || RRSBA to identify
> +     * "alternative predictors potentially in use".

Considering that it's rows 2, 6, 7, and 8 which are broken, I find this
comment a little misleading. To me it doesn't become clear whether them
subsequently being left alone (and merely a log message issued) is
intentional.

> +     */
> +    if ( cpu_has_eibrs ? cpu_has_rsba  /* Rows 7, 8 */
> +                       : cpu_has_rrsba /* Rows 2, 6 */ )
> +        printk(XENLOG_ERR
> +               "FIRMWARE BUG: CPU %02x-%02x-%02x, ucode 0x%08x: RSBA %u, 
> EIBRS %u, RRSBA %u\n",
> +               boot_cpu_data.x86, boot_cpu_data.x86_model,
> +               boot_cpu_data.x86_mask, ucode_rev,
> +               cpu_has_rsba, cpu_has_eibrs, cpu_has_rrsba);
> +
> +    /*
>       * Processors offering Enhanced IBRS are not guarenteed to be
>       * repoline-safe.
>       */
> -    if ( cpu_has_rsba || cpu_has_eibrs )
> +    if ( cpu_has_eibrs )
> +    {
> +        /*
> +         * Prior to the August 2023 microcode, many eIBRS-capable parts did
> +         * not enumerate RRSBA.
> +         */
> +        if ( !cpu_has_rrsba )
> +            setup_force_cpu_cap(X86_FEATURE_RRSBA);
> +
> +        return false;
> +    }

No clearing of RSBA in this case? I fear we may end up with misbehavior if
our own records aren't kept consistent with our assumptions. (This then
extends to the "just a log message" above as well.)

Also the inner conditional continues to strike me as odd; could you add
half a sentence to the comment (or description) as to meaning to leave
is_forced_cpu_cap() function correctly (which in turn raises the question
whether - down the road - this is actually going to matter)?

> +    /*
> +     * RSBA is explicitly enumerated in some cases, but may also be set by a
> +     * hypervisor to indicate that we may move to a processor which isn't
> +     * retpoline-safe.
> +     */
> +    if ( cpu_has_rsba )
>          return false;
>  
> +    /*
> +     * At this point, we've filtered all the legal RSBA || RRSBA cases (or 
> the
> +     * known non-ideal cases).  If ARCH_CAPS is visible, trust the absence of
> +     * RSBA || RRSBA.  There's no known microcode which advertises ARCH_CAPS
> +     * without RSBA or EIBRS, and if we're virtualised we can't rely the 
> model
> +     * check anyway.
> +     */

I think "no known" wants further qualification: When IBRSB was first
introduced, EIBRS and RSBA weren't even known about. I also didn't
think all hardware (i.e. sufficiently old one) did get newer ucode
when these started to be known. Possibly you mean "... which wrongly
advertises ..."?

> @@ -689,6 +762,15 @@ static bool __init retpoline_calculations(void)
>          break;
>      }
>  
> +    if ( !safe )
> +    {
> +        /*
> +         * Note: the eIBRS-capable parts are filtered out earlier, so the
> +         * remainder here are the ones which suffer only RSBA behaviour.
> +         */

As I think I had mentioned already, I find "only" here odd, when RSBA
is more severe than RRSBA. Maybe the "only" could move earlier, e.g.
between "are" and "the"? Then again this may be another non-native-
speaker issue of mine ...

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.