
[XEN PATCH 0/9] x86: parallelize AP bring-up during boot



Patch series available on this branch:
https://github.com/TrenchBoot/xen/tree/smp_cleanup_upstreaming

This series makes AP bring-up parallel on x86. This significantly
reduces the number of IPIs (and, more importantly, the delays between
them) required to start all logical processors.

To make this possible, some state variables that were global had to be
made per-CPU. In most cases, those variables can be accessed through
helper macros, some of which existed before in a different form.
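
As a rough illustration of the helper-macro approach, the sketch below
shows how a lookup that used to go through a separate global array could
instead read a field of a per-CPU structure behind an accessor. The names
`cpu_physical_id()` and `cpu_data[cpu].apicid` come from the patch titles
in this series; the structure layout, the `NR_CPUS` value and the
`set_apicid()` helper are assumptions made for this example and are not
the actual Xen definitions.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8   /* illustrative value, not Xen's configuration */

/* Simplified stand-in for a per-CPU info structure (assumed layout). */
struct cpuinfo_sketch {
    uint32_t apicid;   /* APIC ID of this logical CPU */
};

static struct cpuinfo_sketch cpu_data[NR_CPUS];

/*
 * Accessor hiding where the APIC ID is stored, so callers never index
 * a separate global array directly.
 */
#define cpu_physical_id(cpu) (cpu_data[cpu].apicid)

/* Hypothetical helper: record the APIC ID for a logical CPU. */
static void set_apicid(unsigned int cpu, uint32_t apicid)
{
    assert(cpu < NR_CPUS);
    cpu_physical_id(cpu) = apicid;
}
```

With an accessor like this, moving the underlying storage only requires
changing the macro, not every call site.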

In addition to the work required for parallel initialization, I've
fixed issues in the error path around `machine_restart()` that were
discovered during implementation and testing.

CPU hotplug should not be affected, but I had no way of testing it.
During wakeup from S3, APs are still started one by one. It should be
possible to enable parallel execution there as well, but I don't have
a way of testing it as of now.

To measure the improvement, I added output lines (identical before and
after the changes, so printing over serial does not skew the
comparison) just above and below the `if ( !pv_shim )` block.
`console_timestamps=raw` was used to get timestamps as accurate as
possible, and each measurement is the average over 3 boots. The final
improvement was calculated as (1 - after/before) * 100%, rounded to
0.01%. These are the results:

* Dell OptiPlex 9010 with Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz
  (4 cores, 4 threads): 48.83%
* MSI PRO Z790-P with 13th Gen Intel(R) Core(TM) i5-13600K (14 cores,
  20 threads, 6P+8E) `smt=on`: 36.16%
* MSI PRO Z790-P with 13th Gen Intel(R) Core(TM) i5-13600K (14 cores,
  20 threads, 6P+8E) `smt=off`: 0.25% (parking takes a lot of additional
  time)
* HP t630 Thin Client with AMD Embedded G-Series GX-420GI Radeon R7E
  (4 cores, 4 threads): 68.00%
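
The calculation described above can be sketched as follows. This is only
an illustration of the formula (1 - after/before) * 100% with rounding to
0.01%; the function names are hypothetical, and the code is not the
tooling that produced the numbers in the list.

```c
#include <assert.h>

/* Average of the three per-boot durations taken for one measurement. */
static double average3(double a, double b, double c)
{
    return (a + b + c) / 3.0;
}

/*
 * Improvement in percent: (1 - after/before) * 100, rounded to two
 * decimal places. 'before' and 'after' are averaged durations in the
 * same unit (e.g. raw timestamp ticks).
 */
static double improvement_percent(double before, double after)
{
    assert(before > 0.0);

    double pct = (1.0 - after / before) * 100.0;
    /* Round half away from zero at the second decimal place. */
    double scaled = pct * 100.0 + (pct >= 0.0 ? 0.5 : -0.5);

    return (double)(long long)scaled / 100.0;
}
```

For instance, hypothetical averages of 200 and 150 time units (made-up
values, not measurements from this series) would give an improvement of
25.00%.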

Krystian Hebel (9):
  x86/boot: choose AP stack based on APIC ID
  x86: don't access x86_cpu_to_apicid[] directly, use
    cpu_physical_id(cpu)
  x86/smp: drop x86_cpu_to_apicid, use cpu_data[cpu].apicid instead
  x86/smp: move stack_base to cpu_data
  x86/smp: call x2apic_ap_setup() earlier
  x86/shutdown: protect against recurrent machine_restart()
  x86/smp: drop booting_cpu variable
  x86/smp: make cpu_state per-CPU
  x86/smp: start APs in parallel during boot

 xen/arch/x86/acpi/cpu_idle.c          |   4 +-
 xen/arch/x86/acpi/lib.c               |   2 +-
 xen/arch/x86/apic.c                   |   2 +-
 xen/arch/x86/boot/trampoline.S        |  20 +++
 xen/arch/x86/boot/x86_64.S            |  34 ++++-
 xen/arch/x86/cpu/mwait-idle.c         |   4 +-
 xen/arch/x86/domain.c                 |   2 +-
 xen/arch/x86/include/asm/asm_defns.h  |   2 +-
 xen/arch/x86/include/asm/cpufeature.h |   2 +
 xen/arch/x86/include/asm/processor.h  |   2 +
 xen/arch/x86/include/asm/smp.h        |   7 +-
 xen/arch/x86/mpparse.c                |   6 +-
 xen/arch/x86/numa.c                   |  17 +--
 xen/arch/x86/platform_hypercall.c     |   2 +-
 xen/arch/x86/setup.c                  |  25 +++-
 xen/arch/x86/shutdown.c               |  20 ++-
 xen/arch/x86/smpboot.c                | 207 ++++++++++++++++----------
 xen/arch/x86/spec_ctrl.c              |   2 +-
 xen/arch/x86/sysctl.c                 |   2 +-
 xen/arch/x86/traps.c                  |   4 +-
 xen/arch/x86/x86_64/asm-offsets.c     |   5 +-
 xen/include/xen/smp.h                 |   2 -
 22 files changed, 248 insertions(+), 125 deletions(-)


base-commit: fb41228ececea948c7953c8c16fe28fd65c6536b
prerequisite-patch-id: 142a87c707411d49e136c3fb76f1b14963ec6dc8
prerequisite-patch-id: f155cb7e2761deec26b76f1b8b587bc56a404c80
-- 
2.41.0




 

