[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4] x86: clear RDRAND CPUID bit on AMD family 15h/16h



On 09/03/2020 09:08, Jan Beulich wrote:
> Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24:
>
>     There have been reports of RDRAND issues after resuming from suspend on
>     some AMD family 15h and family 16h systems. This issue stems from a BIOS
>     not performing the proper steps during resume to ensure RDRAND continues
>     to function properly.
>
>     Update the CPU initialization to clear the RDRAND CPUID bit for any family
>     15h and 16h processor that supports RDRAND. If it is known that the family
>     15h or family 16h system does not have an RDRAND resume issue or that the
>     system will not be placed in suspend, the "cpuid=rdrand" kernel parameter

I'm not sure what is best to do here.  The type suggests that this is a
verbatim copy of the Linux commit message, but this tiny detail is Xen
specific.

>     can be used to stop the clearing of the RDRAND CPUID bit.
>
>     Note, that clearing the RDRAND CPUID bit does not prevent a processor
>     that normally supports the RDRAND instruction from executing it. So any
>     code that determined the support based on family and model won't #UD.
>
> Warn if no explicit choice was given on affected hardware.
>
> Check RDRAND functions at boot as well as after S3 resume (the retry
> limit chosen is entirely arbitrary).
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

CC'ing Paul & George for 4.14 things:

tl;dr Some AMD system firmware doesn't reinitialise the RNG
configuration on S3, resulting in the RDRAND instruction returning -1
and "data good".

AMD's recommended mitigation is to disable RDRAND entirely.  We can't
identify the problem before an S3 resume and finding the instruction
broken, and can't repair the damage at that point (The MSR needing
fixing has already been locked), and AMD don't have a list of DMI
quirks/equivalent to use to identify the buggy systems on boot.

Also, there is no public statement from AMD on the matter, but one
obvious fallout on Linux systems is systemd looping forever trying to
create a UUID which doesn't already alias in its database.

The impact of this would be substantially less bad if Xen could identify
(a lack of) platform support for S3 at boot, as AMD servers have never
supported S3.  This would reduce the user-nagging to only client
systems, but there isn't an obvious way to do this.

It's a complete mess, but I don't think we should put this out without
some form of release note.

> ---
> Still slightly RFC, and still in particular because of the change to
> parse_xen_cpuid():

FWIW, that is very similar to XenServer's AVX512 off-by-default bodge
until the default vs max policy work is ready.

I don't have a better suggestion right now, but hopefully something
better might become obvious when we've got more users.  Either way, I'm
expecting it to turn into a "switch ( mid->bit )" expression in due course.

> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -4,6 +4,7 @@
>  #include <xen/param.h>
>  #include <xen/smp.h>
>  #include <xen/pci.h>
> +#include <xen/warning.h>
>  #include <asm/io.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
> @@ -646,6 +647,25 @@ static void init_amd(struct cpuinfo_x86
>               if (acpi_smi_cmd && (acpi_enable_value | acpi_disable_value))
>                       amd_acpi_c1e_quirk = true;
>               break;
> +
> +     case 0x15: case 0x16:
> +             /*
> +              * There are too many Fam15/Fam16 systems where upon resume

"some" systems.

> +              * from S3 firmware fails to re-setup properly functioning
> +              * RDRAND.

I think this needs another sentence of explanation.

By the time we can spot the problem, it is too late to take evasive
action, and there is nothing Xen can do to repair the problem.

>   Clear the feature unless force-enabled on the
> +              * command line.
> +              */
> +             if (c == &boot_cpu_data &&
> +                 cpu_has(c, X86_FEATURE_RDRAND) &&
> +                 !is_forced_cpu_cap(X86_FEATURE_RDRAND)) {
> +                     static const char __initconst text[] =
> +                             "RDRAND may cease to work on this hardware upon 
> resume from S3.\n"
> +                             "Please choose an explicit cpuid={no-}rdrand 
> setting.\n";
> +
> +                     setup_clear_cpu_cap(X86_FEATURE_RDRAND);
> +                     warning_add(text);

What do you think to clobbering RDRAND via the CPUMASK registers in this
case?  We've got full control there, and it would stop PV userspace as well.

> +             }
> +             break;
>       }
>  
>       display_cacheinfo(c);
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -11,6 +11,7 @@
>  #include <asm/io.h>
>  #include <asm/mpspec.h>
>  #include <asm/apic.h>
> +#include <asm/random.h>
>  #include <asm/setup.h>
>  #include <mach_apic.h>
>  #include <public/sysctl.h> /* for XEN_INVALID_{SOCKET,CORE}_ID */
> @@ -98,6 +99,11 @@ void __init setup_force_cpu_cap(unsigned
>       __set_bit(cap, boot_cpu_data.x86_capability);
>  }
>  
> +bool is_forced_cpu_cap(unsigned int cap)
> +{
> +     return test_bit(cap, forced_caps);
> +}
> +
>  static void default_init(struct cpuinfo_x86 * c)
>  {
>       /* Not much we can do here... */
> @@ -498,6 +504,28 @@ void identify_cpu(struct cpuinfo_x86 *c)
>       printk("\n");
>  #endif
>  
> +     /*
> +      * If RDRAND is available, make an attempt to check that it actually
> +      * (still) works.
> +      */

Do you think it would be helpful to test in the opposite case as well. 
If we come back from S3 and find that RDRAND does actually work, then it
is safe to tell the user that they can re-enable.

> +     if (cpu_has(c, X86_FEATURE_RDRAND)) {
> +             unsigned int prev = 0;
> +
> +             for (i = 0; i < 5; ++i)
> +             {
> +                     unsigned int cur = arch_get_random();
> +
> +                     if (prev && cur != prev)
> +                             break;
> +                     prev = cur;
> +                     cpu_relax();

Why the relax?  We're not polling hammering the memory bus waiting for
an unknown period of time until something changes.

We simply need up to 5 random numbers as fast as the RNG can produce
them (which is actually quite slow.  RDRAND has ~350 cycle latency minimum.)

~Andrew

> +             }
> +
> +             if (i >= 5)
> +                     printk(XENLOG_WARNING "CPU%u: RDRAND appears to not 
> work\n",
> +                            smp_processor_id());
> +     }
> +
>       if (system_state == SYS_STATE_resume)
>               return;
>  




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.