
[Minios-devel] [UNIKRAFT PATCH v3 0/8] Save and restore extended registers on context switch



This is v3 of the context switching fixes that make sure extended registers
get saved on x86_64 if the code was compiled to use them.

The main changes from v2 are:
* Removed the first two patches from the series. (They have been pushed
  already.)
* Added more conditional compilation guards that were missing in
  plat/kvm/Makefile.uk
* Fixed bit for X86_CR0_TS, renamed X86_XCR0_{X,Y}MM to X86_XCR0_{SSE,AVX},
  fixed a mistake in a cpuid call in the assembly init code.
* Added -DxxxPLAT to CXXFLAGS
* Changed cpuid calls in _init_cpu_features to prevent gcc from making wrong
  assumptions about register contents. Simplified function logic.
* Changed sw_ctx.extregs to always point to the beginning of the extregs
  memory area.
* Added a patch that introduces a new Makefile variable
  NO_X86_EXTREGS_FLAGS. This holds compiler flags to make sure code is not
  compiled in a way that uses extended registers. This flag is used to
  compile plat/common/x86/traps.c and should also be used when compiling
  nontrivial interrupt handlers.
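
To illustrate the cpuid point above: one common way to keep gcc from making
wrong assumptions about register contents is to declare all four registers as
outputs of the inline assembly statement. This is only a sketch of that idiom,
not the code from the patch; the names cpuid_regs, cpuid_query and the
CPUID1_* macros are made up for illustration, while the bit positions are the
architectural ones for CPUID leaf 1.

```c
#include <stdint.h>

/* Hypothetical container for the four cpuid result registers. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* CPUID leaf 1 feature bits (architectural values). */
#define CPUID1_EDX_FXSR    (1u << 24)
#define CPUID1_EDX_SSE     (1u << 25)
#define CPUID1_ECX_XSAVE   (1u << 26)
#define CPUID1_ECX_OSXSAVE (1u << 27)
#define CPUID1_ECX_AVX     (1u << 28)

/* Hypothetical helper: run cpuid for the given leaf and subleaf. */
static inline struct cpuid_regs cpuid_query(uint32_t leaf, uint32_t subleaf)
{
    struct cpuid_regs r = { 0, 0, 0, 0 };
#if defined(__x86_64__) || defined(__i386__)
    /*
     * cpuid overwrites eax, ebx, ecx and edx. Listing all four as outputs
     * tells the compiler that none of them keeps its previous value.
     */
    __asm__ __volatile__("cpuid"
                         : "=a"(r.eax), "=b"(r.ebx), "=c"(r.ecx), "=d"(r.edx)
                         : "a"(leaf), "c"(subleaf));
#else
    /* Not an x86 build: report no features. */
    (void)leaf;
    (void)subleaf;
#endif
    return r;
}
```

A caller would then test, for example, cpuid_query(1, 0).ecx against
CPUID1_ECX_AVX before deciding whether AVX state needs to be handled.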

*** v2 text below for reference ***

This is v2 of the context switching fixes that make sure extended registers
get saved on x86_64 if the code was compiled to use them.

The main changes are:
* Rebased onto current staging
* Added two patches to the series that are essentially stand-alone fixes,
  only loosely related to it, which had been lost and never upstreamed.
  ("Make mxcsr_ptr in entry64.S a 32-bit value" and "Clean up Makefile.uk
  conditional build rules")
* Added -mtune to ASFLAGS, which is already set for CFLAGS and CXXFLAGS
* Used this to enable extended registers only when code is compiled with
  support for them. This means the registers won't be saved and restored on
  hardware that supports them if the code wasn't compiled to use them anyway.
* Changed register usage in plat/{kvm,xen}/x86/entry64.S to reduce code size.
  Using edi and esi instead of r8 and r9, and 32-bit instructions instead
  of 64 where applicable, reduces code size of the entry code by a few bytes.
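
The "only when compiled with support" rule above can be sketched as a small
decision function. This is an assumed policy for illustration, not the logic
from the patches: the enum, the BUILD_USES_* macros and pick_method are
hypothetical names. Only __SSE__ and __AVX__ are real, being the macros that
gcc and clang predefine when the corresponding instruction sets are enabled.

```c
#include <stdbool.h>

/* Hypothetical names for the possible save/restore strategies. */
enum extregs_method { EXTREGS_NONE, EXTREGS_FXSAVE, EXTREGS_XSAVE };

/* What the compiler was allowed to emit for this build. */
#if defined(__SSE__)
#define BUILD_USES_SSE true
#else
#define BUILD_USES_SSE false
#endif

#if defined(__AVX__)
#define BUILD_USES_AVX true
#else
#define BUILD_USES_AVX false
#endif

/*
 * Combine build-time knowledge with run-time CPU features. If the code was
 * not compiled to use extended registers, nothing is saved, even on
 * hardware that would support it.
 */
static enum extregs_method pick_method(bool build_sse, bool build_avx,
                                       bool cpu_fxsr, bool cpu_xsave)
{
    if (build_avx && cpu_xsave)
        return EXTREGS_XSAVE;   /* YMM state is only covered by XSAVE */
    if ((build_sse || build_avx) && cpu_fxsr)
        return EXTREGS_FXSAVE;  /* x87 and XMM state */
    return EXTREGS_NONE;
}
```

A build would call pick_method(BUILD_USES_SSE, BUILD_USES_AVX, ...) with the
last two arguments taken from cpuid.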

*** v1 text below for reference ***

Unikraft supports compiling code with support for extended registers.
However, there is no logic in place to save and restore those registers when
switching contexts between threads. This means that multiple threads using
XMM registers will conflict.

This patch series introduces functionality to save and restore those
extended register sets for SSE (XMM) and AVX (YMM) registers. Support for
ZMM (AVX-512) registers is theoretically there, but it is not enabled in the
boot code and, for lack of a test machine, is currently untested.
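
As a rough picture of what saving and restoring means per thread, here is a
minimal sketch using FXSAVE/FXRSTOR, which cover the x87 and XMM state in a
512-byte, 16-byte-aligned area. The struct and function names are
hypothetical and do not match sw_ctx.h; the YMM case would need XSAVE and a
larger, 64-byte-aligned area, which is left out here.

```c
#include <stdint.h>
#include <stdlib.h>

#define FXSAVE_AREA_SIZE  512  /* fixed by the architecture */
#define FXSAVE_AREA_ALIGN 16   /* FXSAVE faults on a misaligned operand */

struct fxsave_area {
    uint8_t bytes[FXSAVE_AREA_SIZE];
} __attribute__((aligned(FXSAVE_AREA_ALIGN)));

/* Hypothetical per-thread context; not the layout used by sw_ctx.h. */
struct thread_ctx {
    struct fxsave_area *extregs; /* start of the aligned save area */
};

/* Allocate the save area. It must be filled by extregs_save() before the
 * first extregs_restore(). Returns 0 on success, -1 on failure. */
static int ctx_init(struct thread_ctx *ctx)
{
    ctx->extregs = aligned_alloc(FXSAVE_AREA_ALIGN,
                                 sizeof(struct fxsave_area));
    return ctx->extregs ? 0 : -1;
}

/* Called for the thread that is being switched out. */
static inline void extregs_save(struct thread_ctx *ctx)
{
#if defined(__x86_64__)
    __asm__ __volatile__("fxsave %0" : "=m"(*ctx->extregs));
#else
    (void)ctx;
#endif
}

/* Called for the thread that is being switched in. */
static inline void extregs_restore(const struct thread_ctx *ctx)
{
#if defined(__x86_64__)
    __asm__ __volatile__("fxrstor %0" : : "m"(*ctx->extregs));
#else
    (void)ctx;
#endif
}
```

A context switch would call extregs_save() on the previous thread's context
and extregs_restore() on the next one, alongside the usual switch of the
general-purpose registers and stack.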

Some remarks:

This patch series moves initialization of FP/SSE/AVX into the entry64.S
early boot code. The way Unikraft is set up, all C code is compiled with the
same flags, and even disabling all those extended instruction sets for
setup.c doesn't solve the problem, because the debug printing routines might
use VMOVAPS, for example. Thus, it is safer to do the enabling in assembly
and not risk #UD faults.
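
For orientation, the enabling step boils down to a handful of control
register bits. The sketch below only computes the values in C so that they
can be checked in isolation; the actual writes to CR0, CR4 and XCR0 are
privileged and happen in the assembly entry code. X86_CR0_TS, X86_XCR0_SSE
and X86_XCR0_AVX are names used in this series; the other macro names and the
helper functions are assumptions, and the bit positions are the architectural
ones rather than copied from cpu_defs.h.

```c
#include <stdint.h>

/* CR0 bits */
#define X86_CR0_MP          (UINT64_C(1) << 1)  /* monitor coprocessor */
#define X86_CR0_EM          (UINT64_C(1) << 2)  /* x87 emulation */
#define X86_CR0_TS          (UINT64_C(1) << 3)  /* task switched */
/* CR4 bits */
#define X86_CR4_OSFXSR      (UINT64_C(1) << 9)  /* OS uses FXSAVE/FXRSTOR */
#define X86_CR4_OSXMMEXCPT  (UINT64_C(1) << 10) /* OS handles SIMD FP faults */
#define X86_CR4_OSXSAVE     (UINT64_C(1) << 18) /* OS uses XSAVE, XCR0 valid */
/* XCR0 bits */
#define X86_XCR0_X87        (UINT64_C(1) << 0)
#define X86_XCR0_SSE        (UINT64_C(1) << 1)
#define X86_XCR0_AVX        (UINT64_C(1) << 2)

/* New CR0 value with the FPU usable: no emulation, no pending TS trap. */
static uint64_t cr0_enable_fpu(uint64_t cr0)
{
    cr0 &= ~(X86_CR0_EM | X86_CR0_TS);
    cr0 |= X86_CR0_MP;
    return cr0;
}

/* New CR4 value with SSE enabled, plus XSAVE if the CPU offers it. */
static uint64_t cr4_enable_sse(uint64_t cr4, int have_xsave)
{
    cr4 |= X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT;
    if (have_xsave)
        cr4 |= X86_CR4_OSXSAVE;
    return cr4;
}

/* XCR0 value to load with xsetbv: x87 and SSE always, AVX if available. */
static uint64_t xcr0_value(int have_avx)
{
    uint64_t xcr0 = X86_XCR0_X87 | X86_XCR0_SSE;

    if (have_avx)
        xcr0 |= X86_XCR0_AVX;
    return xcr0;
}
```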

This patch series only enables support for x86. I remember a discussion
during the first large Arm patch series about using more than just the
generic registers for Arm. Can one of the people with deep knowledge about
the architecture comment how complicated it would be to do something similar
for Arm?
Also, the patch series is a little rough around the edges with regard to
architecture-agnostic support in sw_ctx.h and sw_ctx.c. However, since there is
no threading support for Arm yet, these files aren't used by Arm at all at
the moment, and revisiting them at that point shouldn't be too hard.

Finally, I also invested some time into investigating a lazy switching
routine, with threads only starting to save their extended register context
once they first use instructions from the extended instruction sets. While
lazy switching is not very popular any more, I figured in a unikernel, it
might still be useful, especially since we don't have to worry about
information leaking, which is one of the issues with it on general-purpose
OSs.
However, this requires switching off SSE/AVX/etc. when switching to a fresh
thread, so that the #UD fault can be trapped to find out when a thread
started using extended instructions, and potentially toggling them back and
forth on every thread context switch. Enabling and disabling these options
requires writing to CR0 and CR4, which is excruciatingly slow on KVM
compared to an XSAVE (by about a factor of 20 on my test machine), and while
the difference between the two isn't quite as bad on Xen-PV, it's still not
great. I shelved this for now and decided to go with eating the XSAVE
overhead on every switch instead, which also makes for much more compact
logic.

*** end v1 text ***

Florian Schmidt (8):
  plat/{kvm,xen}: Clean up Makefile.uk conditional build rules
  plat: check for and enable extended CPU features
  plat: Add -DxxxPLAT define for each platform
  plat/common: add include guards to include/x86/cpu.h
  plat: Add global struct to keep x86 CPU information
  plat/common: Add functionality to save and restore extended (x86)
    registers
  arch/x86: Introduce NO_X86_EXTREGS_FLAGS
  plat/common: Add a notice regarding trap handling

 arch/x86/x86_64/Makefile.uk        |   3 +
 plat/common/include/sw_ctx.h       |   8 ++-
 plat/common/include/x86/cpu.h      | 105 +++++++++++++++++++++++++++--
 plat/common/include/x86/cpu_defs.h |  22 ++++++
 plat/common/sw_ctx.c               |  18 ++++-
 plat/common/x86/cpu_features.c     |  37 ++++++++++
 plat/common/x86/traps.c            |  12 ++++
 plat/kvm/Makefile.uk               |  12 ++--
 plat/kvm/x86/entry64.S             |  58 +++++++++++++---
 plat/kvm/x86/setup.c               |  17 +----
 plat/kvm/x86/time.c                |   4 ++
 plat/linuxu/Makefile.uk            |   5 ++
 plat/linuxu/setup.c                |   7 ++
 plat/xen/Makefile.uk               |  41 +++++------
 plat/xen/x86/entry64.S             |  68 +++++++++++++++++--
 plat/xen/x86/setup.c               |  15 +----
 16 files changed, 356 insertions(+), 76 deletions(-)
 create mode 100644 plat/common/x86/cpu_features.c

-- 
2.20.1


_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/minios-devel

 

