
[patch V2 00/38] cpu/hotplug, x86: Reworked parallel CPU bringup



Hi!

This is version 2 of the reworked parallel bringup series. Version 1 can be
found here:

   https://lore.kernel.org/lkml/20230414225551.858160935@xxxxxxxxxxxxx

Background
----------

The reason why people are interested in parallel bringup is to shorten the
(kexec) reboot time of cloud servers to reduce the downtime of the VM
tenants.

The current fully serialized bringup does the following per AP:

    1) Prepare callbacks (allocate, initialize, create threads)
    2) Kick the AP alive (e.g. INIT/SIPI on x86)
    3) Wait for the AP to report alive state
    4) Let the AP continue through the atomic bringup
    5) Let the AP run the threaded bringup to full online state

There are two significant delays:

    #3 The time for an AP to report alive state in start_secondary() on x86
       has been measured in the range between 350us and 3.5ms depending on
       vendor and CPU type, BIOS microcode size etc.

    #4 The atomic bringup does the microcode update. This has been measured
       to take up to ~8ms on the primary threads depending on the microcode
       patch size to apply.

On a two socket SKL server with 56 cores (112 threads) the boot CPU spends
on current mainline about 800ms busy waiting for the APs to come up and
apply microcode. That's more than 80% of the actual onlining procedure.

By splitting the actual bringup mechanism into two parts, this can be
reduced to waiting for the first AP to report alive. If the system is
large enough, the first AP is already waiting by the time the boot CPU has
finished the wake-up of the last AP. That reduces the AP bringup time on
that SKL from ~800ms to ~80ms.
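
To make the effect of the split easier to see, here is a toy model in C.
It is not code from the series. The struct, the function names and the
three cost parameters are made up for illustration, and it assumes that
every AP has the same costs and that the early part of the bringup runs
concurrently on all kicked APs.

```c
/* Hypothetical per-AP costs in microseconds. Not measured values. */
struct bringup_cost {
	unsigned long kick_us;   /* boot CPU time to send the wake-up */
	unsigned long early_us;  /* kick until the AP reports alive */
	unsigned long rest_us;   /* remaining per-AP work on the boot CPU */
};

/* Serialized: the boot CPU busy waits for every single AP in turn. */
static unsigned long serial_wait_us(const struct bringup_cost *c,
				    unsigned int nr_aps)
{
	return (unsigned long)nr_aps * c->early_us;
}

/*
 * Split: kick all APs first, then walk them in order and wait only
 * for those which have not reported alive yet.
 */
static unsigned long split_wait_us(const struct bringup_cost *c,
				   unsigned int nr_aps)
{
	/* Point in time at which the last AP has been kicked */
	unsigned long now = (unsigned long)nr_aps * c->kick_us;
	unsigned long waited = 0;
	unsigned int i;

	for (i = 0; i < nr_aps; i++) {
		/* AP i was kicked after i + 1 kicks and runs on its own */
		unsigned long ready = (unsigned long)(i + 1) * c->kick_us +
				      c->early_us;

		if (ready > now) {
			waited += ready - now;
			now = ready;
		}
		now += c->rest_us;
	}
	return waited;
}
```

With assumed costs of 10us kick, 3000us early and 100us rest, 111 APs
give 333000us of busy waiting in the serialized model and 1900us in the
split model. With 1000 APs the split model does not wait at all because
the first AP is ready before the last kick is sent.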

The actual gain varies wildly depending on the system, CPU, microcode patch
size and other factors.

The V1 cover letter has more details and a deep analysis.

Changes vs. V1:

  1) Switch APIC ID retrieval from CPUID to reading the APIC itself.

     This is required because CPUID based APIC ID retrieval can only
     provide the initial APIC ID, which might have been overruled by the
     firmware. Some AMD APUs come up with APIC ID = initial APIC ID + 0x10,
     so the APIC ID to CPU number lookup would fail miserably if based on
     CPUID. The only requirement is that the actual APIC IDs are consistent
     with the ACPI/MADT table.

  2) As a consequence of #1, parallel bootup support for SEV guests has
     been dropped.

     Reading the APIC ID in a SEV guest is done via RDMSR. That RDMSR is
     intercepted and raises #VC which cannot be handled at that point as
     there is no stack and no IDT. There is no GHCB protocol for RDMSR
     like there is for CPUID. Left as an exercise for SEV wizards.

  3) Address review comments from Brian and the fallout reported by the
     kernel robot.

  4) Unbreak i386 which exploded when bringing up the secondary CPUs due to
     the unconditional load_ucode_ap() invocation in start_secondary(). That
     happens because on 32-bit load_ucode_ap() is invoked on the secondary
     CPUs from assembly code before paging is initialized and therefore
     uses physical addresses which are obviously invalid after paging is
     enabled.

  5) Small enhancements and comment updates.

  6) Rebased on Linus tree (1a5304fecee5)
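
To illustrate item 1 of the list above, here is a minimal sketch in C of
why the lookup has to use the APIC ID read from the APIC. It is not the
lookup code of the series. The table contents, NO_CPU and apicid_to_cpu()
are hypothetical, and the table simply stands in for the APIC IDs as
enumerated by ACPI/MADT on a system where the firmware moved them by
0x10.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_CPU	(-1)

/*
 * Hypothetical MADT derived table, indexed by logical CPU number.
 * The firmware assigned APIC ID = initial APIC ID + 0x10.
 */
static const uint32_t madt_apic_ids[] = { 0x10, 0x11, 0x12, 0x13 };

/* Linear search: return the CPU number for an APIC ID or NO_CPU */
static int apicid_to_cpu(const uint32_t *table, size_t nr, uint32_t apic_id)
{
	for (size_t i = 0; i < nr; i++) {
		if (table[i] == apic_id)
			return (int)i;
	}
	return NO_CPU;
}
```

Looking up the ID which the APIC reports (0x11 for the second CPU in this
example) finds CPU 1. Looking up the initial APIC ID from CPUID (0x01)
finds nothing, which is the failure mode described in item 1.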

The series applies on Linus tree and is also available from git:

    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git hotplug

Thanks,

        tglx
---
 Documentation/admin-guide/kernel-parameters.txt |   20 
 Documentation/core-api/cpu_hotplug.rst          |   13 
 arch/Kconfig                                    |   23 +
 arch/arm/Kconfig                                |    1 
 arch/arm/include/asm/smp.h                      |    2 
 arch/arm/kernel/smp.c                           |   18 
 arch/arm64/Kconfig                              |    1 
 arch/arm64/include/asm/smp.h                    |    2 
 arch/arm64/kernel/smp.c                         |   14 
 arch/csky/Kconfig                               |    1 
 arch/csky/include/asm/smp.h                     |    2 
 arch/csky/kernel/smp.c                          |    8 
 arch/mips/Kconfig                               |    1 
 arch/mips/cavium-octeon/smp.c                   |    1 
 arch/mips/include/asm/smp-ops.h                 |    1 
 arch/mips/kernel/smp-bmips.c                    |    1 
 arch/mips/kernel/smp-cps.c                      |   14 
 arch/mips/kernel/smp.c                          |    8 
 arch/mips/loongson64/smp.c                      |    1 
 arch/parisc/Kconfig                             |    1 
 arch/parisc/kernel/process.c                    |    4 
 arch/parisc/kernel/smp.c                        |    7 
 arch/riscv/Kconfig                              |    1 
 arch/riscv/include/asm/smp.h                    |    2 
 arch/riscv/kernel/cpu-hotplug.c                 |   14 
 arch/x86/Kconfig                                |   45 --
 arch/x86/include/asm/apic.h                     |    5 
 arch/x86/include/asm/apicdef.h                  |    5 
 arch/x86/include/asm/cpu.h                      |    5 
 arch/x86/include/asm/cpumask.h                  |    5 
 arch/x86/include/asm/processor.h                |    1 
 arch/x86/include/asm/realmode.h                 |    3 
 arch/x86/include/asm/smp.h                      |   24 -
 arch/x86/include/asm/topology.h                 |   23 -
 arch/x86/include/asm/tsc.h                      |    2 
 arch/x86/kernel/acpi/sleep.c                    |    9 
 arch/x86/kernel/apic/apic.c                     |   26 -
 arch/x86/kernel/callthunks.c                    |    4 
 arch/x86/kernel/cpu/amd.c                       |    2 
 arch/x86/kernel/cpu/cacheinfo.c                 |   21 
 arch/x86/kernel/cpu/common.c                    |   50 --
 arch/x86/kernel/cpu/topology.c                  |    3 
 arch/x86/kernel/head_32.S                       |   14 
 arch/x86/kernel/head_64.S                       |   87 +++
 arch/x86/kernel/sev.c                           |    2 
 arch/x86/kernel/smp.c                           |    3 
 arch/x86/kernel/smpboot.c                       |  526 ++++++++----------------
 arch/x86/kernel/topology.c                      |   98 ----
 arch/x86/kernel/tsc.c                           |   20 
 arch/x86/kernel/tsc_sync.c                      |   36 -
 arch/x86/power/cpu.c                            |   37 -
 arch/x86/realmode/init.c                        |    3 
 arch/x86/realmode/rm/trampoline_64.S            |   27 +
 arch/x86/xen/enlighten_hvm.c                    |   11 
 arch/x86/xen/smp_hvm.c                          |   16 
 arch/x86/xen/smp_pv.c                           |   56 +-
 drivers/acpi/processor_idle.c                   |    4 
 include/linux/cpu.h                             |    4 
 include/linux/cpuhotplug.h                      |   17 
 kernel/cpu.c                                    |  396 +++++++++++++++++-
 kernel/smp.c                                    |    2 
 kernel/smpboot.c                                |  163 -------
 62 files changed, 934 insertions(+), 982 deletions(-)



 

