[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] [PATCH 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
Using only 32-bit writes for the pte will result in an intermediate L1TF vulnerable PTE. When running as a Xen PV guest this will at once switch the guest to shadow mode resulting in a loss of performance. Use arch_atomic64_xchg() instead which will perform the requested operation atomically with all 64 bits. Some performance considerations according to: https://software.intel.com/sites/default/files/managed/ad/dc/Intel-Xeon-Scalable-Processor-throughput-latency.pdf The main number should be the latency, as there is no tight loop around native_ptep_get_and_clear(). "lock cmpxchg8b" has a latency of 20 cycles, while "lock xchg" (with a memory operand) isn't mentioned in that document. "lock xadd" (with xadd having 3 cycles less latency than xchg) has a latency of 11, so we can assume a latency of 14 for "lock xchg". Signed-off-by: Juergen Gross <jgross@xxxxxxxx> --- In case adding about 6 cycles for native_ptep_get_and_clear() is believed to be too bad I can modify the patch to add a paravirt function for that purpose in order to add the overhead for Xen guests only (in fact the overhead for Xen guests will be less, as only one instruction writing to the PTE has to be emulated by the hypervisor). --- arch/x86/include/asm/pgtable-3level.h | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h index a564084c6141..7919ae4e27d8 100644 --- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -2,6 +2,8 @@ #ifndef _ASM_X86_PGTABLE_3LEVEL_H #define _ASM_X86_PGTABLE_3LEVEL_H +#include <asm/atomic64_32.h> + /* * Intel Physical Address Extension (PAE) Mode - three-level page * tables on PPro+ CPUs. @@ -148,14 +150,14 @@ static inline void pud_clear(pud_t *pudp) #ifdef CONFIG_SMP static inline pte_t native_ptep_get_and_clear(pte_t *ptep) { - pte_t res; + union { + pte_t pte; + long long val; + } res; - /* xchg acts as a barrier before the setting of the high bits */ - res.pte_low = xchg(&ptep->pte_low, 0); - res.pte_high = ptep->pte_high; - ptep->pte_high = 0; + res.val = arch_atomic64_xchg((atomic64_t *)ptep, 0); - return res; + return res.pte; } #else #define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp) -- 2.13.7 _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |