[Xen-devel] [PATCH v8 0/9] xen/x86: various XPTI speedups
This patch series aims at reducing the overhead of the XPTI Meltdown
mitigation.

Patch 1 had been posted before; the main changes in this patch are due
to addressing Jan's comments on my first version. The main objective of
that patch is to avoid copying the L4 page table each time the guest is
being activated, as often the contents didn't change while the
hypervisor was active.

Patch 2 adds a new helper for writing cr3 instead of open coding the
inline assembly in multiple places.

Patch 3 sets the stage for being able to activate XPTI per domain. As a
first step it is now possible to switch XPTI off for dom0 via the xpti
boot parameter.

Patch 4 adds support for using the INVPCID instruction for flushing the
TLB.

Patch 5 reduces the costs of TLB flushes even further: as we don't make
any use of global TLB entries with XPTI being active, we can avoid
removing all global TLB entries on TLB flushes by simply deactivating
the global pages in CR4.

Patch 6 prepares for using PCIDs in patch 9. For that purpose it was
necessary to allow CR3 values with bit 63 set in order to avoid flushing
TLB entries when writing CR3. This requires a modification of Jan's
rather clever state machine with positive and negative CR3 values for
the hypervisor by using a dedicated flag byte instead.

Patch 7 converts pv_guest_cr4_to_real_cr4() from a macro to a function
as it was becoming more and more complex.

Patch 8 adds some PCID helper functions for accessing the different
parts of cr3 (address and pcid part).

Patch 9 is the main performance contributor: by making use of the PCID
feature (if available) TLB entries can survive CR3 switches. The TLB
needs to be flushed on context switches only and not when switching
between guest and hypervisor or guest kernel and user mode.
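To illustrate the CR3 layout that patches 6, 8 and 9 rely on, here is a
minimal, standalone sketch. It is not the code from the series: all
names (sketch_*, SKETCH_*) are made up for this example, and the address
mask assumes the architectural maximum of 52 physical address bits. With
CR4.PCIDE set, bits 11:0 of CR3 hold the PCID, and setting bit 63 in the
value written to CR3 asks the CPU not to invalidate the TLB entries
tagged with that PCID. Bit 63 is also the sign bit of a 64-bit value,
which is presumably why a positive/negative encoding of CR3 state no
longer works once that bit is in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR3 layout with CR4.PCIDE set (names are illustrative). */
#define SKETCH_CR3_NOFLUSH   (UINT64_C(1) << 63)           /* bit 63: keep TLB entries   */
#define SKETCH_CR3_PCID_MASK UINT64_C(0xfff)               /* bits 11:0: PCID            */
#define SKETCH_CR3_ADDR_MASK UINT64_C(0x000ffffffffff000)  /* bits 51:12: root page table */

/* Compose a CR3 value from a page-aligned root table address and a PCID. */
static inline uint64_t sketch_make_cr3(uint64_t root_pa, unsigned int pcid,
                                       bool noflush)
{
    assert((root_pa & ~SKETCH_CR3_ADDR_MASK) == 0); /* aligned, within 52 bits */
    assert(pcid <= SKETCH_CR3_PCID_MASK);           /* PCID is 12 bits wide    */
    return root_pa | pcid | (noflush ? SKETCH_CR3_NOFLUSH : 0);
}

/* Extract the physical address of the root page table. */
static inline uint64_t sketch_cr3_addr(uint64_t cr3)
{
    return cr3 & SKETCH_CR3_ADDR_MASK;
}

/* Extract the PCID part. */
static inline unsigned int sketch_cr3_pcid(uint64_t cr3)
{
    return (unsigned int)(cr3 & SKETCH_CR3_PCID_MASK);
}

/* Check whether the value asks the CPU to skip the implicit TLB flush. */
static inline bool sketch_cr3_noflush(uint64_t cr3)
{
    return (cr3 & SKETCH_CR3_NOFLUSH) != 0;
}
```

In this sketch, switching between two address spaces that carry
different PCIDs would use sketch_make_cr3(pa, pcid, true), so the entries
of both can stay in the TLB; passing false requests the flush of the
entries tagged with the new PCID.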
On my machine (Intel i7-4600M) using the PCID feature in the non-XPTI
case showed a slightly worse performance than using global pages
instead (using PCID and global pages is a bad idea as invalidating
global pages in this case would need a complete TLB flush). For this
reason I've decided to use PCID for XPTI only as the default. That can
easily be changed by using the command line parameter "pcid=true".

The complete series has been verified to still mitigate against
Meltdown attacks. A simple performance test (make -j 4 in the Xen
hypervisor directory) showed significant improvements compared to the
state without this series. Numbers are seconds, stddev in braces.

xpti=false    elapsed          system           user
unpatched:     88.42 ( 2.01)    94.49 ( 1.38)   180.40 ( 1.41)
patched  :     89.45 ( 3.10)    96.47 ( 3.22)   181.34 ( 1.98)

xpti=true     elapsed          system           user
unpatched:    113.43 ( 3.68)   165.44 ( 4.41)   183.30 ( 1.72)
patched  :     92.76 ( 2.11)   103.39 ( 1.13)   184.86 ( 0.12)

Changes since last version:
- patch 1: set root_pgt_changed flag on other cpus, too, when changing
  a shadow L4 entry
- patch 3: shadow code needs to check xpti flag now due to change in
  patch 1

Juergen Gross (9):
  x86/xpti: avoid copying L4 page table contents when possible
  xen/x86: add a function for modifying cr3
  xen/x86: support per-domain flag for xpti
  xen/x86: use invpcid for flushing the TLB
  xen/x86: disable global pages for domains with XPTI active
  xen/x86: use flag byte for decision whether xen_cr3 is valid
  xen/x86: convert pv_guest_cr4_to_real_cr4() to a function
  xen/x86: add some cr3 helpers
  xen/x86: use PCID feature

 docs/misc/xen-command-line.markdown |  37 +++++++++++-
 xen/arch/x86/cpu/mtrr/generic.c     |  37 ++++++++----
 xen/arch/x86/debug.c                |   2 +-
 xen/arch/x86/domain.c               |   6 +-
 xen/arch/x86/domain_page.c          |   2 +-
 xen/arch/x86/flushtlb.c             | 111 ++++++++++++++++++++++++++++++------
 xen/arch/x86/mm.c                   |  86 +++++++++++++++++++++++-----
 xen/arch/x86/mm/shadow/multi.c      |  10 ++++
 xen/arch/x86/pv/dom0_build.c        |   8 ++-
 xen/arch/x86/pv/domain.c            |  89 ++++++++++++++++++++++++++++-
 xen/arch/x86/setup.c                |  27 +++------
 xen/arch/x86/smp.c                  |   2 +-
 xen/arch/x86/smpboot.c              |   6 +-
 xen/arch/x86/spec_ctrl.c            |  70 +++++++++++++++++++++++
 xen/arch/x86/x86_64/asm-offsets.c   |   2 +
 xen/arch/x86/x86_64/compat/entry.S  |   5 +-
 xen/arch/x86/x86_64/entry.S         |  78 +++++++++++--------------
 xen/common/efi/runtime.c            |   4 +-
 xen/include/asm-x86/current.h       |  23 ++++++--
 xen/include/asm-x86/domain.h        |  17 +++---
 xen/include/asm-x86/flushtlb.h      |   7 ++-
 xen/include/asm-x86/invpcid.h       |   2 +
 xen/include/asm-x86/processor.h     |  18 ++++++
 xen/include/asm-x86/pv/domain.h     |  31 ++++++++++
 xen/include/asm-x86/spec_ctrl.h     |   4 ++
 xen/include/asm-x86/x86-defns.h     |   4 +-
 26 files changed, 543 insertions(+), 145 deletions(-)

-- 
2.13.6

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel