[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h



On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote:
> On 23.02.2024 13:23, Oleksii wrote:
> > > 
> > > > > > As 1- and 2-byte cases are emulated I decided that is not
> > > > > > to
> > > > > > provide
> > > > > > sfx argument for emulation macros as it will not have to
> > > > > > much
> > > > > > affect on
> > > > > > emulated types and just consume more performance on acquire
> > > > > > and
> > > > > > release
> > > > > > version of sc/ld instructions.
> > > > > 
> > > > > Question is whether the common case (4- and 8-byte accesses)
> > > > > shouldn't
> > > > > be valued higher, with 1- and 2-byte emulation being there
> > > > > just
> > > > > to
> > > > > allow things to not break altogether.
> > > > If I understand you correctly, it would make sense to add the
> > > > 'sfx'
> > > > argument for the 1/2-byte access case, ensuring that all
> > > > options
> > > > are
> > > > available for 1/2-byte access case as well.
> > > 
> > > That's one of the possibilities. As said, I'm not overly worried
> > > about
> > > the emulated cases. For the initial implementation I'd recommend
> > > going
> > > with what is easiest there, yielding the best possible result for
> > > the
> > > 4- and 8-byte cases. If later it turns out repeated
> > > acquire/release
> > > accesses are a problem in the emulation loop, things can be
> > > changed
> > > to explicit barriers, without touching the 4- and 8-byte cases.
> > I am confused then a little bit if emulated case is not an issue.
> > 
> > For 4- and 8-byte cases for xchg .aqrl is used, for relaxed and
> > aqcuire
> > version of xchg barries are used.
> > 
> > The similar is done for cmpxchg.
> > 
> > If something will be needed to change in emulation loop it won't
> > require to change 4- and 8-byte cases.
> 
> I'm afraid I don't understand your reply.
IIUC, emulated cases it is implemented correctly in terms of usage
barriers. And it also OK not to use sfx for lr/sc instructions and use
only barriers.

For 4- and 8-byte cases are used sfx + barrier depending on the
specific case ( relaxed, acquire, release, generic xchg/cmpxchg ).
What also looks to me correct. But you suggested to provide the best
possible result for 4- and 8-byte cases. 

So I don't understand what the best possible result is as the current
one usage of __{cmp}xchg_generic for each specific case  ( relaxed,
acquire, release, generic xchg/cmpxchg ) looks correct to me:
xchg -> (..., ".aqrl", "", "") just suffix .aqrl suffix without
barriers.
xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "" ) use only release
barrier
xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER ), only acquire
barrier
xchg_relaxed ->  (..., "", "", "") - no barries, no sfx

The similar for cmpxchg().

~ Oleksii



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.