[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear

To: linux-kernel@xxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, x86@xxxxxxxxxx
From: Juergen Gross <jgross@xxxxxxxx>
Date: Tue, 21 Aug 2018 17:37:55 +0200
Cc: Juergen Gross <jgross@xxxxxxxx>, boris.ostrovsky@xxxxxxxxxx, mingo@xxxxxxxxxx, tglx@xxxxxxxxxxxxx, hpa@xxxxxxxxx
Delivery-date: Tue, 21 Aug 2018 15:38:14 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Using only 32-bit writes for the pte will result in an intermediate
L1TF vulnerable PTE. When running as a Xen PV guest this will at once
switch the guest to shadow mode resulting in a loss of performance.

Use arch_atomic64_xchg() instead which will perform the requested
operation atomically with all 64 bits.

Some performance considerations according to:

https://software.intel.com/sites/default/files/managed/ad/dc/Intel-Xeon-Scalable-Processor-throughput-latency.pdf

The main number should be the latency, as there is no tight loop around
native_ptep_get_and_clear().

"lock cmpxchg8b" has a latency of 20 cycles, while "lock xchg" (with a
memory operand) isn't mentioned in that document. "lock xadd" (with xadd
having 3 cycles less latency than xchg) has a latency of 11, so we can
assume a latency of 14 for "lock xchg".

Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
---
In case adding about 6 cycles for native_ptep_get_and_clear() is believed
to be too bad I can modify the patch to add a paravirt function for that
purpose in order to add the overhead for Xen guests only (in fact the
overhead for Xen guests will be less, as only one instruction writing to
the PTE has to be emulated by the hypervisor).
---
 arch/x86/include/asm/pgtable-3level.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/pgtable-3level.h 
b/arch/x86/include/asm/pgtable-3level.h
index a564084c6141..f8b1ad2c3828 100644
--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_X86_PGTABLE_3LEVEL_H
 #define _ASM_X86_PGTABLE_3LEVEL_H
 
+#include <asm/atomic64_32.h>
+
 /*
  * Intel Physical Address Extension (PAE) Mode - three-level page
  * tables on PPro+ CPUs.
@@ -150,10 +152,7 @@ static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
 {
        pte_t res;
 
-       /* xchg acts as a barrier before the setting of the high bits */
-       res.pte_low = xchg(&ptep->pte_low, 0);
-       res.pte_high = ptep->pte_high;
-       ptep->pte_high = 0;
+       res.pte = (pteval_t)arch_atomic64_xchg((atomic64_t *)ptep, 0);
 
        return res;
 }
-- 
2.13.7


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

Follow-Ups:
- Re: [Xen-devel] [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
  - From: Thomas Gleixner
- Re: [Xen-devel] [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
  - From: Jan Beulich

References:
- [Xen-devel] [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests
  - From: Juergen Gross

Prev by Date: [Xen-devel] [PATCH v2 1/2] x86/xen: don't write ptes directly in 32-bit PV guests
Next by Date: Re: [Xen-devel] [PATCH 34/34] RFC x86: introduce directio virt cap
Previous by thread: [Xen-devel] [PATCH v2 1/2] x86/xen: don't write ptes directly in 32-bit PV guests
Next by thread: Re: [Xen-devel] [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.