[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 04/12] microcode: introduce a global cache of ucode patch



On Mon, Mar 11, 2019 at 03:57:28PM +0800, Chao Gao wrote:
> to replace the current per-cpu cache 'uci->mc'.
> 
> Compared to the current per-cpu cache, the benefits of the global
> microcode cache are:
> 1. It reduces the work that need to be done on each CPU. Parsing ucode
> file is done once on one CPU. Other CPUs needn't parse ucode file.
> Instead, they can find out and load the newest patch from the global
> cache.
> 2. It reduces the memory consumption on a system with many CPU cores.
> 
> Two functions, microcode_save_patch() and microcode_find_patch() are
> introduced. The former adds one given patch to the global cache. The
> latter gets a newer and matched ucode patch from the global cache.
> All operations on the cache is expected to be done with the
> 'microcode_mutex' hold.
> 
> Note that I deliberately avoid touching 'uci->mc' as I am going to
> remove it completely in the next patch.
> 
> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
> ---
> Changes in v6:
>  - constify local variables and function parameters if possible
>  - comment that the global cache is protected by 'microcode_mutex'.
>    and add assertions to catch violations in microcode_{save/find}_patch()
> 
> Changes in v5:
>  - reword the commit description
>  - find_patch() and save_patch() are abstracted into common functions
>    with some hooks for AMD and Intel
> ---
>  xen/arch/x86/microcode.c        | 60 +++++++++++++++++++++++++++
>  xen/arch/x86/microcode_amd.c    | 91 
> +++++++++++++++++++++++++++++++++++++----
>  xen/arch/x86/microcode_intel.c  | 66 ++++++++++++++++++++++++++----
>  xen/include/asm-x86/microcode.h | 13 ++++++
>  4 files changed, 215 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index 4163f50..e629e6c 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -61,6 +61,8 @@ static struct ucode_mod_blob __initdata ucode_blob;
>   */
>  static bool_t __initdata ucode_scan;
>  
> +static LIST_HEAD(microcode_cache);
> +
>  void __init microcode_set_module(unsigned int idx)
>  {
>      ucode_mod_idx = idx;
> @@ -208,6 +210,64 @@ static void microcode_fini_cpu(unsigned int cpu)
>      spin_unlock(&microcode_mutex);
>  }
>  
> +/*
> + * Save an ucode patch to the global cache list.
> + *
> + * Return true if a patch is added to the global cache or it replaces
> + * one old patch in the cache. Otherwise, return false. According to
> + * the return value, the caller decides not to do an expensive ucode
> + * update for some cases (i.e. when admin provides an old ucode).
> + */
> +bool microcode_save_patch(struct microcode_patch *new)
> +{
> +    struct microcode_patch *old;
> +
> +    ASSERT(spin_is_locked(&microcode_mutex));
> +
> +    list_for_each_entry(old, &microcode_cache, list)
> +    {
> +        enum microcode_match_result result =
> +            microcode_ops->compare_patch(new, old);
> +
> +        if ( result == OLD_UCODE )
> +        {
> +            microcode_ops->free_patch(new);
> +            return false;
> +        }
> +        else if ( result == NEW_UCODE )
> +        {
> +            list_replace(&old->list, &new->list);
> +            microcode_ops->free_patch(old);
> +            return true;
> +        }
> +        else /* result == MIS_UCODE */
> +            continue;

I think this is rather too simplistic, and I'm not sure you really
need a list in order to store the microcodes.

IIRC we agreed that systems with mixed CPU versions are not supported,
hence the same microcode blob should be used to update all the
possible CPUs on the system, so a list should not be needed here.

Also I'm afraid that freeing the old microcode when a new version is
uploaded is not fully correct. For example given the following
scenario:

1. Upload a microcode blob to the hypervisor and apply it to online
CPUs.

2. Upload a microcode blob with a higher version than the previous one,
but which fails to apply. This microcode would replace the
previous one.

3. Online a CPU. This CPU will try to use the last uploaded microcode
and fail, because last uploaded version is broken. Newly onlined CPUs
would then end up with a microcode version different from the
currently running CPUs, likely breaking the system.

I think the best way to solve this is to ditch the list usage and
instead only keep at most two microcode versions cached in the
hypervisor:

 - The microcode version that's currently successfully applied (if any).
 - A microcode version higher than the current version, that has yet
   to be applied.

Once this new microcode version has been applied it will replace the
previously applied version. If the new microcode version fails to
apply it will be discarded, thus keeping a copy of the currently
applied microcode version.

With this approach AFAICT you only need two variables, one to store
the currently applied microcode_patch and another one to save the new
microcode version in order to attempt to apply it.

I think this will also simplify some of the code. Let me know if this
sounds sensible, or if I'm missing something.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.