
Re: [Xen-devel] [PATCH] x86/microcode: Support builtin CPU microcode



On 13.12.19 14:40, Andrew Cooper wrote:
On 09/12/2019 21:49, Eslam Elnikety wrote:
+
+extern const char __builtin_intel_ucode_start[], __builtin_intel_ucode_end[];
+extern const char __builtin_amd_ucode_start[], __builtin_amd_ucode_end[];
+#endif
+
 /* By default, ucode loading is done in NMI handler */
 static bool ucode_in_nmi = true;
@@ -110,9 +118,9 @@ void __init microcode_set_module(unsigned int idx)
 }

 /*
- * The format is '[<integer>|scan=<bool>, nmi=<bool>]'. Both options are
- * optional. If the EFI has forced which of the multiboot payloads is to be
- * used, only nmi=<bool> is parsed.
+ * The format is '[<integer>|scan=<bool>|builtin=<bool>, nmi=<bool>]'. All
+ * options are optional. If the EFI has forced which of the multiboot payloads
+ * is to be used, only nmi=<bool> is parsed.
  */

Please delete this, or I'll do a prereq patch to fix it and the command
line docs.  (Both are in a poor state.)


Unless you are planning to cover that as part of your ongoing
docs/hypervisor-guide/microcode-loading.rst effort, I can pick up this
clean-up/prereq patch myself. What do you have in mind? (Or point me
to a good example and I will figure things out.)

c/s 3c5552954, 53a84f672, 633a40947 or 3136dee9c are good examples.
ucode= is definitely more complicated to explain because of its implicit
EFI behaviour.


Currently massaging a patch to that effect.

+    else if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+        ucode_blob.size = (size_t)(__builtin_intel_ucode_end
+                                   - __builtin_intel_ucode_start);
+    else
+        return;
+
+    if ( !ucode_blob.size )
+    {
+        printk("No builtin ucode! 'ucode=builtin' is nullified.\n");
+        return;
+    }
+    else if ( ucode_blob.size > MAX_EARLY_CPIO_MICROCODE )
+    {
+        printk("Builtin microcode payload too big! (%ld, we can do %d)\n",
+               ucode_blob.size, MAX_EARLY_CPIO_MICROCODE);
+        ucode_blob.size = 0;
+        return;
+    }
+
+    ucode_blob.data = xmalloc_bytes(ucode_blob.size);
+    if ( !ucode_blob.data )
+        return;

Any chance we can reuse the "fits" logic to avoid holding every
inapplicable blob in memory as well?


I think this would be a welcome change. It seems to me that we have
two ways to go about it.

1) We factor the code in the intel-/amd-specific cpu_request_microcode
to extract the logic for finding a match into its own new function,
expose that through microcode_ops, and finally do the xmalloc only for
the matching microcode when early loading is scan or builtin.

2) Can we not just do away with the xmalloc completely? I see that each
individual microcode update gets allocated anyway in
microcode_intel.c/get_next_ucode_from_buffer() and in
microcode_amd.c/cpu_request_microcode(). Unless I am missing
something, the xmalloc_bytes for ucode_blob.data is redundant.

Thoughts?
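
[Editor's sketch] To make option 1 concrete, the following is a minimal illustration of finding a match by walking a buffer in place, with no allocation. The record layout (struct rec_hdr) and the name find_match() are hypothetical and chosen only for illustration; this is not the real Intel or AMD container format, nor the existing Xen code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; NOT the real Intel or AMD container format. */
struct rec_hdr {
    uint32_t sig;        /* CPU signature the update applies to */
    uint32_t rev;        /* update revision */
    uint32_t total_size; /* header plus payload, in bytes */
};

/*
 * Walk concatenated records in [buf, buf + len) and return the offset of
 * the newest record matching sig, or (size_t)-1 if none matches. Nothing
 * is allocated; only a small header is copied onto the stack.
 */
static size_t find_match(const uint8_t *buf, size_t len, uint32_t sig)
{
    size_t off = 0, best = (size_t)-1;
    uint32_t best_rev = 0;

    assert(buf != NULL || len == 0);

    while (len - off >= sizeof(struct rec_hdr)) {
        struct rec_hdr h;

        memcpy(&h, buf + off, sizeof(h)); /* avoids unaligned reads */
        if (h.total_size < sizeof(h) || h.total_size > len - off)
            break; /* malformed or truncated record: stop scanning */
        if (h.sig == sig && (best == (size_t)-1 || h.rev > best_rev)) {
            best = off;
            best_rev = h.rev;
        }
        off += h.total_size;
    }
    return best;
}
```

With a helper of this shape, the caller would only need to allocate (if at all) for the single record at the returned offset, rather than for the whole blob.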

I'm certain the code is more complicated than it needs to be.
Cleanup/simplification would be very welcome.  And if you're up for
that, there is a related area which would be a great improvement.

At the moment, BSP microcode loading is very late because it depends on
this xmalloc() to begin with.  However, no memory allocation is needed
to load microcode from a multiboot module or from the initrd, or from
this future builtin location - all loading can be done from a
directmap/bootmap pointer if needs be.

This would allow moving the BSP microcode to much earlier on boot,
probably somewhere between console setup and E820 handling.

One way or another, the microcode cache which persists past boot has to
be xmalloc()'d, because we will free the module/initrd/builtin.  It
would however be friendlier to the APs to give them only the single
correct piece of ucode, rather than everything to scan through.
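
[Editor's sketch] The point about the persistent cache can be illustrated as follows: only the one matching update is copied into long-lived memory, so the boot-time source (module, initrd or builtin region) can be freed afterwards. The names struct ucode_cache and cache_store() are hypothetical, and plain malloc()/free() stand in for Xen's allocator purely so the example is self-contained.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical persistent cache holding exactly one update. */
struct ucode_cache {
    void *data;
    size_t size;
};

/*
 * Replace the cached update with a private copy of [src, src + size).
 * Returns 0 on success, or -1 on allocation failure, in which case the
 * previously cached update is left untouched.
 */
static int cache_store(struct ucode_cache *c, const void *src, size_t size)
{
    void *copy;

    assert(c != NULL && src != NULL && size > 0);

    copy = malloc(size); /* stand-in for the hypervisor allocator */
    if (copy == NULL)
        return -1;

    memcpy(copy, src, size);
    free(c->data); /* free(NULL) is a no-op on first use */
    c->data = copy;
    c->size = size;
    return 0;
}
```

Because the cache owns its copy, later consumers can be handed c->data directly and never need to rescan the original container.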

(These behaviours and expectations are going to be a chunk of my
intended second microcode.rst doc, including a "be aware that machines
exist which do $X" section to cover some of the weirder corner cases we
have encountered.)


Avoiding the xmalloc/memcpy on the scan for microcode is one of the patches that I will share shortly. In particular, the ucode_blob.data would directly point to the buffer matching the canonical name within the cpio name space.

We are still a bit away from pushing the BSP microcode update earlier though. We will need to surgically remove all the unnecessary xmalloc/memcpy from within microcode_{amd,intel}.c. Also, as you hinted, the challenging bit is the per-cpu microcode cache.

+
+builtin_ucode.o: Makefile $(amd-blobs) $(intel-blobs)
+    # Create AMD microcode blob if there are AMD updates on the build system
+    if [ ! -z "$(amd-blobs)" ]; then \
+        cat $(amd-blobs) > $@.bin ; \
+        $(OBJCOPY) -I binary -O elf64-x86-64 -B i386:x86-64 --rename-section .data=.builtin_amd_ucode,alloc,load,readonly,data,contents $@.bin $@.amd; \
+        rm -f $@.bin; \
+    fi
+    # Create INTEL microcode blob if there are INTEL updates on the build system
+    if [ ! -z "$(intel-blobs)" ]; then \
+        cat $(intel-blobs) > $@.bin; \
+        $(OBJCOPY) -I binary -O elf64-x86-64 -B i386:x86-64 --rename-section .data=.builtin_intel_ucode,alloc,load,readonly,data,contents $@.bin $@.intel; \
+        rm -f $@.bin; \
+    fi
+    # Create fake builtin_ucode.o if no updates were present. Otherwise, builtin_ucode.o carries the available updates
+    if [ -z "$(amd-blobs)" -a -z "$(intel-blobs)" ]; then \
+        $(CC) $(CFLAGS) -c -x c /dev/null -o $@; \
+    else \
+        $(LD) $(LDFLAGS) -r -o $@ $@.*; \
+        rm -f $@.*; \
+    fi

How about using weak symbols, rather than playing games like this?

Just to make sure we are on the same page. You are after a dummy
binary with weak symbols that eventually get overridden when I link
the actual microcode binaries into builtin_ucode.o? If so, possible of
course. Except that I do not particularly see the downside of the
existing approach with dummy builtin_ucode.o.

Actually, you don't even need weak symbols.  Size being 0 means that no
blob was inserted.
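
[Editor's sketch] A minimal illustration of the "size 0 means no blob" idea, assuming the blob is delimited by a pair of start/end pointers such as the __builtin_*_ucode_start/_end symbols in the patch. The names struct blob_view and blob_from_range() are hypothetical; in this standalone form the range is passed in as ordinary pointers rather than linker-provided symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of a blob delimited by a start/end pointer pair. */
struct blob_view {
    const char *data;
    size_t size;
};

/*
 * Build a view over [start, end). An empty range (start == end) means
 * that no blob was linked in, so no separate "present" flag is needed.
 */
static struct blob_view blob_from_range(const char *start, const char *end)
{
    struct blob_view v = { NULL, 0 };

    assert(end >= start);
    if (end > start) {
        v.data = start;
        v.size = (size_t)(end - start);
    }
    return v;
}
```

A caller would then test v.size alone to decide whether builtin microcode is available.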

There doesn't appear to be a need to organise a dummy builtin_ucode.o,
or to manually merge Intel/AMD together.  Simply make obj-y +=
ucode-$VENDOR.o dependent on there being some blob to insert.

I have reworked this part in v2 such that the configuration explicitly specifies the individual microcode blobs to include. I have also adopted the "obj-y += ucode-$VENDOR.o" approach and made it dependent on the corresponding blobs being available. That said, I was not able to get rid of the dummy object. The dummy is still needed in case neither AMD nor Intel ucode blobs were specified. With no microcode blobs, obj-y will not refer to any dependency within xen/arch/x86/microcode/ and there will be no rule to generate microcode/built_in.o (which is required for every subdir in xen/arch/x86/). Of course, we could add logic in xen/arch/x86/Makefile to mark microcode as a subdir iff there are microcode blobs available, but it seems to me that this logic does not belong there. Also, my initial attempt at this quickly showed that the dummy approach is far simpler.

-- Eslam


~Andrew



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

