[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4] x86: clear RDRAND CPUID bit on AMD family 15h/16h



On Mon, Mar 09, 2020 at 10:08:50AM +0100, Jan Beulich wrote:
> Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24:
> 
>     There have been reports of RDRAND issues after resuming from suspend on
>     some AMD family 15h and family 16h systems. This issue stems from a BIOS
>     not performing the proper steps during resume to ensure RDRAND continues
>     to function properly.
> 
>     Update the CPU initialization to clear the RDRAND CPUID bit for any family
>     15h and 16h processor that supports RDRAND. If it is known that the family
>     15h or family 16h system does not have an RDRAND resume issue or that the
>     system will not be placed in suspend, the "cpuid=rdrand" kernel parameter
>     can be used to stop the clearing of the RDRAND CPUID bit.
> 
>     Note, that clearing the RDRAND CPUID bit does not prevent a processor
>     that normally supports the RDRAND instruction from executing it. So any
>     code that determined the support based on family and model won't #UD.
> 
> Warn if no explicit choice was given on affected hardware.
> 
> Check RDRAND functions at boot as well as after S3 resume (the retry
> limit chosen is entirely arbitrary).
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

> ---
> Still slightly RFC, and still in particular because of the change to
> parse_xen_cpuid(): Alternative approach suggestions are welcome. But now
> also because with many CPUs there may now be a lot of warnings in case
> of issues.
> ---
> v4: Check always, including during boot. Slightly better sanity check,
>     inspired by Linux commit 7879fc4bdc7.
> v3: Add call to warning_add(). If force-enabled, check RDRAND still
>     functioning after S3 resume.
> v2: Re-base.
> 
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -488,6 +488,10 @@ The Speculation Control hardware feature
>  be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
>  won't offer them to guests.
>  
> +`rdrand` can be used to override the default disabling of the feature on 
> certain
> +AMD systems.  Its negative form can of course also be used to suppress use 
> and
> +exposure of the feature.
> +
>  ### cpuid_mask_cpu
>  > `= fam_0f_rev_[cdefg] | fam_10_rev_[bc] | fam_11_rev_b`
>  
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -4,6 +4,7 @@
>  #include <xen/param.h>
>  #include <xen/smp.h>
>  #include <xen/pci.h>
> +#include <xen/warning.h>
>  #include <asm/io.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
> @@ -646,6 +647,25 @@ static void init_amd(struct cpuinfo_x86
>               if (acpi_smi_cmd && (acpi_enable_value | acpi_disable_value))
>                       amd_acpi_c1e_quirk = true;
>               break;
> +
> +     case 0x15: case 0x16:
> +             /*
> +              * There are too many Fam15/Fam16 systems where upon resume
> +              * from S3 firmware fails to re-setup properly functioning
> +              * RDRAND.  Clear the feature unless force-enabled on the
> +              * command line.
> +              */
> +             if (c == &boot_cpu_data &&
> +                 cpu_has(c, X86_FEATURE_RDRAND) &&
> +                 !is_forced_cpu_cap(X86_FEATURE_RDRAND)) {

Given this is the only user of is_forced_cpu_cap...

> +                     static const char __initconst text[] =
> +                             "RDRAND may cease to work on this hardware upon 
> resume from S3.\n"
> +                             "Please choose an explicit cpuid={no-}rdrand 
> setting.\n";
> +
> +                     setup_clear_cpu_cap(X86_FEATURE_RDRAND);
> +                     warning_add(text);
> +             }
> +             break;
>       }
>  
>       display_cacheinfo(c);
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -11,6 +11,7 @@
>  #include <asm/io.h>
>  #include <asm/mpspec.h>
>  #include <asm/apic.h>
> +#include <asm/random.h>
>  #include <asm/setup.h>
>  #include <mach_apic.h>
>  #include <public/sysctl.h> /* for XEN_INVALID_{SOCKET,CORE}_ID */
> @@ -98,6 +99,11 @@ void __init setup_force_cpu_cap(unsigned
>       __set_bit(cap, boot_cpu_data.x86_capability);
>  }
>  
> +bool is_forced_cpu_cap(unsigned int cap)

... I think this could be made __init?

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.