[xen master] x86: Use LOCK ADD instead of MFENCE for smp_mb()



commit de16a8fa0db7f1879442cf9cfe865eb2e9d98e6d
Author:     Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
AuthorDate: Mon Sep 21 13:17:30 2020 +0100
Commit:     Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
CommitDate: Thu Oct 1 11:14:22 2020 +0100

    x86: Use LOCK ADD instead of MFENCE for smp_mb()
    
    MFENCE is overly heavyweight for SMP semantics on WB memory, because it also
    orders weakly-ordered (e.g. non-temporal) writes, and flushes the WC buffers.
    
    This technique was used as an optimisation in Java[1], and later adopted by
    Linux[2], where it was measured to have a 60% performance improvement in
    VirtIO benchmarks.
    
    The stack is used because it is hot in the L1 cache, and a -4 offset is used
    to avoid creating a false data dependency on live data.
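    
    As an illustration only (not part of the patch), the whole technique
    amounts to the following GCC-style inline assembly; the macro name
    my_smp_mb() is hypothetical:
    
        /*
         * Any LOCKed read-modify-write instruction acts as a full barrier
         * on x86.  Adding 0 leaves the target unchanged, and -4(%esp) lies
         * below the stack pointer, so the locked access neither modifies
         * nor serialises against any live local variable.
         */
        #define my_smp_mb() \
            asm volatile ( "lock addl $0, -4(%%esp)" ::: "memory" )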
    
    For 64-bit userspace, the Red Zone needs to be considered.  Use -32 to allow
    for a reasonable quantity of Red Zone data, but still have a 50% chance of
    hitting the same cache line as %rsp.
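    
    For context (again illustrative, not part of the patch), a full barrier
    matters in the store-buffering pattern, where each CPU's store can still
    sit in its store buffer when the other CPU performs its load.  A minimal
    64-bit userspace sketch, with hypothetical names:
    
        /*
         * -32 lands inside the 128-byte System V AMD64 red zone, skipping
         * the bytes nearest %rsp where a compiler is most likely to keep
         * live data, with a 50% chance of staying on the same 64-byte
         * cache line as %rsp.
         */
        #define my_mb() asm volatile ( "lock addl $0, -32(%%rsp)" ::: "memory" )
    
        volatile int x, y;
        int r0, r1;
    
        void cpu0(void) { x = 1; my_mb(); r0 = y; }  /* store x, then load y */
        void cpu1(void) { y = 1; my_mb(); r1 = x; }  /* store y, then load x */
    
        /*
         * Run concurrently, the barriers forbid the outcome r0 == 0 && r1 == 0,
         * which x86's store buffers would otherwise permit.
         */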
    
    Fix up the 32-bit definitions in HVMLoader and libxc to avoid a false data
    dependency.
    
    [1] https://shipilev.net/blog/2014/on-the-fence-with-dependencies/
    [2] https://git.kernel.org/torvalds/c/450cbdd0125cfa5d7bbf9e2a6b6961cc48d29730
    
    Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
    Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
    Acked-by: Wei Liu <wl@xxxxxxx>
---
 tools/firmware/hvmloader/util.h   | 2 +-
 tools/libs/ctrl/include/xenctrl.h | 4 ++--
 xen/include/asm-x86/system.h      | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/firmware/hvmloader/util.h b/tools/firmware/hvmloader/util.h
index 31889de634..4f0baade0e 100644
--- a/tools/firmware/hvmloader/util.h
+++ b/tools/firmware/hvmloader/util.h
@@ -133,7 +133,7 @@ static inline void cpu_relax(void)
 #define barrier() asm volatile ( "" : : : "memory" )
 #define rmb()     barrier()
 #define wmb()     barrier()
-#define mb()      asm volatile ( "lock; addl $0,0(%%esp)" : : : "memory" )
+#define mb()      asm volatile ( "lock addl $0, -4(%%esp)" ::: "memory" )
 
 /*
  * Divide a 64-bit dividend by a 32-bit divisor.
diff --git a/tools/libs/ctrl/include/xenctrl.h b/tools/libs/ctrl/include/xenctrl.h
index ba70bec9c4..3796425e1e 100644
--- a/tools/libs/ctrl/include/xenctrl.h
+++ b/tools/libs/ctrl/include/xenctrl.h
@@ -68,11 +68,11 @@
 #define xen_barrier() asm volatile ( "" : : : "memory")
 
 #if defined(__i386__)
-#define xen_mb()  asm volatile ( "lock; addl $0,0(%%esp)" : : : "memory" )
+#define xen_mb()  asm volatile ( "lock addl $0, -4(%%esp)" ::: "memory" )
 #define xen_rmb() xen_barrier()
 #define xen_wmb() xen_barrier()
 #elif defined(__x86_64__)
-#define xen_mb()  asm volatile ( "mfence" : : : "memory")
+#define xen_mb()  asm volatile ( "lock addl $0, -32(%%rsp)" ::: "memory" )
 #define xen_rmb() xen_barrier()
 #define xen_wmb() xen_barrier()
 #elif defined(__arm__)
diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
index 45c183bd10..630909965e 100644
--- a/xen/include/asm-x86/system.h
+++ b/xen/include/asm-x86/system.h
@@ -220,7 +220,7 @@ static always_inline unsigned long __xadd(
  *
  * Refer to the vendor system programming manuals for further details.
  */
-#define smp_mb()        mb()
+#define smp_mb()        asm volatile ( "lock addl $0, -4(%%rsp)" ::: "memory" )
 #define smp_rmb()       barrier()
 #define smp_wmb()       barrier()
 
--
generated by git-patchbot for /home/xen/git/xen.git#master