Re: [RFC PATCH v2 00/15] xen/arm: port Linux LL/SC and LSE atomics helpers to Xen
Hi Julien,
Thanks for taking a look at the patches and providing feedback. I've seen your
other comments and will reply to those separately when I get a chance (maybe at
the weekend or over the Christmas break).
RE the differences in ordering semantics between Xen's and Linux's atomics
helpers, please find my notes below.
Thoughts?
Cheers,
Ash.
The tables below use the format AAA/BBB/CCC/DDD/EEE, where:
- AAA is the memory barrier before the operation
- BBB is the acquire semantics of the atomic operation
- CCC is the release semantics of the atomic operation
- DDD is whether the asm() block clobbers memory
- EEE is the memory barrier after the operation
For example, ---/---/rel/mem/dmb would mean:
- No memory barrier before the operation
- The atomic does *not* have acquire semantics
- The atomic *does* have release semantics
- The asm() block clobbers memory
- There is a DMB memory barrier after the atomic operation
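To make the notation concrete, here is a rough sketch of what a ---/---/rel/mem/dmb helper could look like. The function name add_return_rel_mem_dmb() is made up for illustration and this is not copied from either tree; the non-arm64 fallback uses the GCC/Clang __atomic builtins as an approximation so the snippet builds elsewhere.

```c
/* Hypothetical illustration of ---/---/rel/mem/dmb; not Xen's or Linux's code. */
static int add_return_rel_mem_dmb(int i, int *counter)
{
#if defined(__aarch64__)
    unsigned long tmp;
    int result;

    __asm__ __volatile__(
        "1: ldxr    %w0, %2\n"        /* BBB = ---: plain LDXR, no acquire  */
        "   add     %w0, %w0, %w3\n"
        "   stlxr   %w1, %w0, %2\n"   /* CCC = rel: store-release exclusive */
        "   cbnz    %w1, 1b\n"        /* retry if the exclusive store failed */
        "   dmb     ish\n"            /* EEE = dmb: barrier after the op     */
        : "=&r" (result), "=&r" (tmp), "+Q" (*counter)
        : "r" (i)
        : "memory");                  /* DDD = mem: asm() clobbers memory    */

    return result;
#else
    /* Approximation only: release RMW followed by a full fence. */
    int result = __atomic_add_fetch(counter, i, __ATOMIC_RELEASE);

    __atomic_thread_fence(__ATOMIC_SEQ_CST);
    return result;
#endif
}
```

Calling add_return_rel_mem_dmb(2, &c) with c == 40 returns 42 and leaves c at 42 on either path.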
arm64 LL/SC
===========
Xen Function       Xen                  Linux                Inconsistent
============       ===                  =====                ============
atomic_add         ---/---/---/---/---  ---/---/---/---/---  ---
atomic_add_return  ---/---/rel/mem/dmb  ---/---/rel/mem/dmb  --- (1)
atomic_sub         ---/---/---/---/---  ---/---/---/---/---  ---
atomic_sub_return  ---/---/rel/mem/dmb  ---/---/rel/mem/dmb  --- (1)
atomic_and         ---/---/---/---/---  ---/---/---/---/---  ---
atomic_cmpxchg     dmb/---/---/---/dmb  ---/---/rel/mem/---  YES (2)
atomic_xchg        ---/---/rel/mem/dmb  ---/acq/rel/mem/dmb  YES (3)
(1) It's actually interesting to me that Linux does it this way. As with the
LSE atomics below, I'd have expected acq/rel semantics and no DMB.
Unless I'm missing something where there is a concern around taking an IRQ
between the LDAXR and the STLXR, which can't happen in the LSE atomic case
since it's a single instruction. But the exclusive monitor is cleared on
exception return in AArch64 so I'm struggling to see what that potential
issue may be. Regardless, Linux and Xen are consistent so we're OK ;-)
(2) The Linux version uses either STLXR with rel semantics if the comparison
passes, or DMB if the comparison fails. This is weaker than Xen's version,
which is quite blunt in always wrapping the operation between two DMBs. This
may be a holdover from Xen's arm32 versions being ported to arm64, as we
didn't support acq/rel semantics on LDREX and STREX in Armv7-A? Regardless,
this is quite a big discrepancy and I've not yet given it enough thought to
determine whether it would actually cause an issue. My feeling is that the
Linux LL/SC atomic_cmpxchg() should have acq semantics on the LL, but
like you said these helpers are well tested so I'd be surprised if there
is a bug. See (5) below though, where the Linux LSE atomic_cmpxchg() *does*
have acq semantics.
(3) The Linux version just adds acq semantics to the LL, so we're OK here.
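To illustrate the atomic_cmpxchg discrepancy in (2), here is a rough sketch of the two orderings from the table, written with the GCC/Clang __atomic builtins rather than the real asm(). The function names are invented for this example, and the builtins only approximate the instruction sequences either project actually emits.

```c
/* Hypothetical names; __atomic builtins stand in for the real asm(). */

/* dmb/---/---/---/dmb: a relaxed compare-and-swap between two full barriers. */
static int cmpxchg_two_barriers(int *ptr, int old, int new_val)
{
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
    /* On failure the builtin writes the value it observed back into 'old'. */
    __atomic_compare_exchange_n(ptr, &old, new_val, 0,
                                __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
    return old;
}

/* ---/---/rel/mem/---: release ordering on the store, no acquire on the load. */
static int cmpxchg_release_only(int *ptr, int old, int new_val)
{
    __atomic_compare_exchange_n(ptr, &old, new_val, 0,
                                __ATOMIC_RELEASE, __ATOMIC_RELAXED);
    return old;
}
```

Both return the value seen in *ptr, so a caller checks success by comparing the return value against the expected old value. The only difference is how much ordering the surrounding accesses get.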
arm64 LSE (comparison to Xen's LL/SC)
=====================================
Xen Function       Xen                  Linux                Inconsistent
============       ===                  =====                ============
atomic_add         ---/---/---/---/---  ---/---/---/---/---  ---
atomic_add_return  ---/---/rel/mem/dmb  ---/acq/rel/mem/---  YES (4)
atomic_sub         ---/---/---/---/---  ---/---/---/---/---  ---
atomic_sub_return  ---/---/rel/mem/dmb  ---/acq/rel/mem/---  YES (4)
atomic_and         ---/---/---/---/---  ---/---/---/---/---  ---
atomic_cmpxchg     dmb/---/---/---/dmb  ---/acq/rel/mem/---  YES (5)
atomic_xchg        ---/---/rel/mem/dmb  ---/acq/rel/mem/---  YES (4)
(4) As noted in (1), this is how I would have expected Linux's LL/SC atomics to
work too. I don't think this discrepancy will cause any issues.
(5) As with (2) above, this is quite a big discrepancy with Xen. However, at
least this version has acq semantics, unlike the LL/SC version in (2), so I'm
more confident that there won't be regressions going from Xen's LL/SC version
to Linux's LSE version of atomic_cmpxchg().
arm32 LL/SC
===========
Xen Function       Xen                  Linux                Inconsistent
============       ===                  =====                ============
atomic_add         ---/---/---/---/---  ---/---/---/---/---  ---
atomic_add_return  dmb/---/---/---/dmb  XXX/XXX/XXX/XXX/XXX  YES (6)
atomic_sub         ---/---/---/---/---  ---/---/---/---/---  ---
atomic_sub_return  dmb/---/---/---/dmb  XXX/XXX/XXX/XXX/XXX  YES (6)
atomic_and         ---/---/---/---/---  ---/---/---/---/---  ---
atomic_cmpxchg     dmb/---/---/---/dmb  XXX/XXX/XXX/XXX/XXX  YES (6)
atomic_xchg        dmb/---/---/---/dmb  XXX/XXX/XXX/XXX/XXX  YES (6)
(6) Linux only provides relaxed variants of these functions, such as
atomic_add_return_relaxed() and atomic_xchg_relaxed(). Patches #13 and #14
in the series add the stricter versions expected by Xen, wrapping calls to
Linux's relaxed variants between two calls to smp_mb(). This makes them
consistent with Xen's existing helpers, though it is quite blunt. It is worth
noting that Armv8-A AArch32 does support acq/rel semantics on exclusive
accesses, with LDAEX and STLEX, so I could imagine us introducing a new
arm32 hwcap to detect whether we're on actual Armv7-A hardware or Armv8-A
AArch32, then swap to lighter-weight STLEX versions of these helpers rather
than the heavyweight double DMB versions. Whether that would actually give
measurable performance improvements is another story!
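For reference, the wrapping described in (6) has roughly the following shape. This is a simplified sketch and not the code from patches #13 and #14: the atomic_t, smp_mb() and atomic_add_return_relaxed() definitions here are stand-ins built on GCC/Clang builtins so that the snippet is self-contained.

```c
/* Stand-in definitions, assumed for illustration only. */
typedef struct { int counter; } atomic_t;

/* Stand-in for smp_mb(): a full fence (a DMB on Arm). */
#define smp_mb() __atomic_thread_fence(__ATOMIC_SEQ_CST)

/* Stand-in for Linux's relaxed variant: no ordering guarantees at all. */
static inline int atomic_add_return_relaxed(int i, atomic_t *v)
{
    return __atomic_add_fetch(&v->counter, i, __ATOMIC_RELAXED);
}

/* Fully ordered version: relaxed operation between two full barriers,
 * i.e. dmb/---/---/---/dmb in the notation used above. */
static inline int atomic_add_return(int i, atomic_t *v)
{
    int ret;

    smp_mb();
    ret = atomic_add_return_relaxed(i, v);
    smp_mb();

    return ret;
}
```

The same pattern would apply to atomic_sub_return(), atomic_cmpxchg() and atomic_xchg(). An STLEX-based variant would replace the two barriers with acq/rel ordering on the exclusive accesses themselves.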