|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6 10/20] xen/riscv: introduce atomic.h
On Wed, 2024-03-27 at 08:40 +0100, Jan Beulich wrote:
...
> > > > > > +/* This is required to provide a full barrier on success.
> > > > > > */
> > > > > > +static inline int atomic_add_unless(atomic_t *v, int a,
> > > > > > int u)
> > > > > > +{
> > > > > > + int prev, rc;
> > > > > > +
> > > > > > + asm volatile (
> > > > > > + "0: lr.w %[p], %[c]\n"
> > > > > > + " beq %[p], %[u], 1f\n"
> > > > > > + " add %[rc], %[p], %[a]\n"
> > > > > > + " sc.w.rl %[rc], %[rc], %[c]\n"
> > > > > > + " bnez %[rc], 0b\n"
> > > > > > + RISCV_FULL_BARRIER
> > > > >
> > > > > With this and no .aq on the load, why the .rl on the store?
> > > > It is something that LKMM requires [1].
> > > >
> > > > This is not fully clear to me what is so specific in LKMM, but
> > > > accoring
> > > > to the spec:
> > > > Ordering Annotation Fence-based Equivalent
> > > > l{b|h|w|d|r}.aq l{b|h|w|d|r}; fence r,rw
> > > > l{b|h|w|d|r}.aqrl fence rw,rw; l{b|h|w|d|r}; fence r,rw
> > > > s{b|h|w|d|c}.rl fence rw,w; s{b|h|w|d|c}
> > > > s{b|h|w|d|c}.aqrl fence rw,w; s{b|h|w|d|c}
> > > > amo<op>.aq amo<op>; fence r,rw
> > > > amo<op>.rl fence rw,w; amo<op>
> > > > amo<op>.aqrl fence rw,rw; amo<op>; fence rw,rw
> > > > Table 2.2: Mappings from .aq and/or .rl to fence-based
> > > > equivalents.
> > > > An alternative mapping places a fence rw,rw after the
> > > > existing
> > > > s{b|h|w|d|c} mapping rather than at the front of the
> > > > l{b|h|w|d|r} mapping.
> > >
> > > I'm afraid I can't spot the specific case in this table. None of
> > > the
> > > stores in the right column have a .rl suffix.
> > Yes, it is expected.
> >
> > I am reading this table as (f.e.) amo<op>.rl is an equivalent of
> > fence
> > rw,w; amo<op>. (without .rl)
>
> In which case: How does quoting the table answer my original
> question?
Agree, it is starting to be confusing, so let me give an answer to your
question below.
>
> > > > It is also safe to translate any .aq, .rl, or .aqrl
> > > > annotation
> > > > into
> > > > the fence-based snippets of
> > > > Table 2.2. These can also be used as a legal implementation
> > > > of
> > > > l{b|h|w|d} or s{b|h|w|d} pseu-
> > > > doinstructions for as long as those instructions are not
> > > > added
> > > > to
> > > > the ISA.
> > > >
> > > > So according to the spec, it should be:
> > > > sc.w ...
> > > > RISCV_FULL_BARRIER.
> > > >
> > > > Considering [1] and how this code looks before, it seems to me
> > > > that
> > > > it
> > > > is safe to use lr.w.aq/sc.w.rl here or an fence equivalent.
> > >
> > > Here you say "or". Then why dos the code use sc.?.rl _and_ a
> > > fence?
> > I confused this line with amo<op>.aqrl, so based on the table 2.2
> > above
> > s{b|h|w|d|c}.aqrl is transformed to "fence rw,w; s{b|h|w|d|c}", but
> > Linux kernel decided to strengthen it with full barrier:
> > - "0:\n\t"
> > - "lr.w.aqrl %[p], %[c]\n\t"
> > - "beq %[p], %[u], 1f\n\t"
> > - "add %[rc], %[p], %[a]\n\t"
> > - "sc.w.aqrl %[rc], %[rc], %[c]\n\t"
> > - "bnez %[rc], 0b\n\t"
> > - "1:"
> > + "0: lr.w %[p], %[c]\n"
> > + " beq %[p], %[u], 1f\n"
> > + " add %[rc], %[p], %[a]\n"
> > + " sc.w.rl %[rc], %[rc], %[c]\n"
> > + " bnez %[rc], 0b\n"
> > + " fence rw, rw\n"
> > + "1:\n"
> > As they have the following issue:
> > implementations of atomics such as atomic_cmpxchg() and
> > atomic_add_unless() rely on LR/SC pairs,
> > which do not give full-ordering with .aqrl; for example, current
> > implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
> > below to end up with the state indicated in the "exists" clause.
>
> Is that really "current implementations", not "the abstract model"?
> If so, the use of an explicit fence would be more like a workaround
> (and would hence want commenting to that effect).
>
> In neither case can I see my original question answered: Why the .rl
> on the store when there is a full fence later?
The good explanation for that was provided in the commit addressing a
similar issue for ARM64 [
https://patchwork.kernel.org/project/linux-arm-kernel/patch/1391516953-14541-1-git-send-email-will.deacon@xxxxxxx/
].
The same holds true for RISC-V since ARM also employs WMO.
I would also like to mention another point, as I indicated in another
email thread
[ https://lists.xen.org/archives/html/xen-devel/2024-03/msg01582.html ]
, that now this fence can be omitted and .aqrl can be used instead.
This was confirmed by Dan (the author of the RVWMO spec)
[https://lore.kernel.org/linux-riscv/41e01514-74ca-84f2-f5cc-2645c444fd8e@xxxxxxxxxx/
]
I hope this addresses your original question. Does it?
~ Oleksii
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |