
[PATCH v2 2/3] locking/x86: Introduce arch_sync_try_cmpxchg



Introduce the arch_sync_try_cmpxchg() macro to improve code that uses
the sync_try_cmpxchg() locking primitive. The new definition reuses the
existing __raw_try_cmpxchg() macro, but with an explicit "lock; " prefix.

The new macros improve assembly of the cmpxchg loop in
evtchn_fifo_unmask() from drivers/xen/events/events_fifo.c from:

 57a:   85 c0                   test   %eax,%eax
 57c:   78 52                   js     5d0 <...>
 57e:   89 c1                   mov    %eax,%ecx
 580:   25 ff ff ff af          and    $0xafffffff,%eax
 585:   c7 04 24 00 00 00 00    movl   $0x0,(%rsp)
 58c:   81 e1 ff ff ff ef       and    $0xefffffff,%ecx
 592:   89 4c 24 04             mov    %ecx,0x4(%rsp)
 596:   89 44 24 08             mov    %eax,0x8(%rsp)
 59a:   8b 74 24 08             mov    0x8(%rsp),%esi
 59e:   8b 44 24 04             mov    0x4(%rsp),%eax
 5a2:   f0 0f b1 32             lock cmpxchg %esi,(%rdx)
 5a6:   89 04 24                mov    %eax,(%rsp)
 5a9:   8b 04 24                mov    (%rsp),%eax
 5ac:   39 c1                   cmp    %eax,%ecx
 5ae:   74 07                   je     5b7 <...>
 5b0:   a9 00 00 00 40          test   $0x40000000,%eax
 5b5:   75 c3                   jne    57a <...>
 <...>

to:

 578:   a9 00 00 00 40          test   $0x40000000,%eax
 57d:   74 2b                   je     5aa <...>
 57f:   85 c0                   test   %eax,%eax
 581:   78 40                   js     5c3 <...>
 583:   89 c1                   mov    %eax,%ecx
 585:   25 ff ff ff af          and    $0xafffffff,%eax
 58a:   81 e1 ff ff ff ef       and    $0xefffffff,%ecx
 590:   89 4c 24 04             mov    %ecx,0x4(%rsp)
 594:   89 44 24 08             mov    %eax,0x8(%rsp)
 598:   8b 4c 24 08             mov    0x8(%rsp),%ecx
 59c:   8b 44 24 04             mov    0x4(%rsp),%eax
 5a0:   f0 0f b1 0a             lock cmpxchg %ecx,(%rdx)
 5a4:   89 44 24 04             mov    %eax,0x4(%rsp)
 5a8:   75 30                   jne    5da <...>
 <...>
 5da:   8b 44 24 04             mov    0x4(%rsp),%eax
 5de:   eb 98                   jmp    578 <...>

The new code removes the move instructions at 585:, 5a6: and 5a9:,
as well as the compare at 5ac:. Additionally, the compiler assumes that
a cmpxchg success is more probable and optimizes the code flow accordingly.

Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Signed-off-by: Uros Bizjak <ubizjak@xxxxxxxxx>
---
v2: Improve commit description.
---
 arch/x86/include/asm/cmpxchg.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index d53636506134..5612648b0202 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -221,12 +221,18 @@ extern void __add_wrong_size(void)
 #define __try_cmpxchg(ptr, pold, new, size)                            \
        __raw_try_cmpxchg((ptr), (pold), (new), (size), LOCK_PREFIX)
 
+#define __sync_try_cmpxchg(ptr, pold, new, size)                       \
+       __raw_try_cmpxchg((ptr), (pold), (new), (size), "lock; ")
+
 #define __try_cmpxchg_local(ptr, pold, new, size)                      \
        __raw_try_cmpxchg((ptr), (pold), (new), (size), "")
 
 #define arch_try_cmpxchg(ptr, pold, new)                               \
        __try_cmpxchg((ptr), (pold), (new), sizeof(*(ptr)))
 
+#define arch_sync_try_cmpxchg(ptr, pold, new)                          \
+       __sync_try_cmpxchg((ptr), (pold), (new), sizeof(*(ptr)))
+
 #define arch_try_cmpxchg_local(ptr, pold, new)                         \
        __try_cmpxchg_local((ptr), (pold), (new), sizeof(*(ptr)))
 
-- 
2.41.0
