[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 10/34] xen/riscv: introduce bitops.h



On Mon, 2024-01-15 at 17:44 +0100, Jan Beulich wrote:
> On 22.12.2023 16:12, Oleksii Kurochko wrote:
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/bitops.h
> > @@ -0,0 +1,267 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/* Copyright (C) 2012 Regents of the University of California */
> > +
> > +#ifndef _ASM_RISCV_BITOPS_H
> > +#define _ASM_RISCV_BITOPS_H
> > +
> > +#include <asm/system.h>
> > +
> > +#include <asm-generic/bitops/bitops-bits.h>
> > +
> > +/* Based on linux/arch/include/linux/bits.h */
> > +
> > +#define BIT_MASK(nr)        (1UL << ((nr) % BITS_PER_LONG))
> > +#define BIT_WORD(nr)        ((nr) / BITS_PER_LONG)
> > +
> > +#define __set_bit(n,p)      set_bit(n,p)
> > +#define __clear_bit(n,p)    clear_bit(n,p)
> > +
> > +/* Based on linux/arch/include/asm/bitops.h */
> > +
> > +#if ( BITS_PER_LONG == 64 )
> > +#define __AMO(op)   "amo" #op ".d"
> > +#elif ( BITS_PER_LONG == 32 )
> > +#define __AMO(op)   "amo" #op ".w"
> > +#else
> > +#error "Unexpected BITS_PER_LONG"
> > +#endif
> > +
> > +#define __test_and_op_bit_ord(op, mod, nr, addr, ord)              \
> > +({                                                         \
> > +    unsigned long __res, __mask;                           \
> > +    __mask = BIT_MASK(nr);                                 \
> > +    __asm__ __volatile__ (                                 \
> > +        __AMO(op) #ord " %0, %2, %1"                       \
> > +        : "=r" (__res), "+A" (addr[BIT_WORD(nr)])  \
> > +        : "r" (mod(__mask))                                \
> > +        : "memory");                                       \
> > +    ((__res & __mask) != 0);                               \
> > +})
> 
> Despite the taking from Linux I think the overall result wants to be
> consistent formatting-wise: You switched to blank indentation (which
> is fine), but you left tabs as padding for the line continuation
> characters.
I think it is because of my IDE. I will be consistent in next patch
version.

Thanks.

> 
> > +#define __op_bit_ord(op, mod, nr, addr, ord)                       \
> > +    __asm__ __volatile__ (                                 \
> > +        __AMO(op) #ord " zero, %1, %0"                     \
> > +        : "+A" (addr[BIT_WORD(nr)])                        \
> > +        : "r" (mod(BIT_MASK(nr)))                  \
> > +        : "memory");
> > +
> > +#define __test_and_op_bit(op, mod, nr, addr)                       \
> 
> (At least) here you even use a mix.
> 
> > +    __test_and_op_bit_ord(op, mod, nr, addr, .aqrl)
> > +#define __op_bit(op, mod, nr, addr)                                \
> > +    __op_bit_ord(op, mod, nr, addr, )
> > +
> > +/* Bitmask modifiers */
> > +#define __NOP(x)   (x)
> > +#define __NOT(x)   (~(x))
> > +
> > +/**
> > + * __test_and_set_bit - Set a bit and return its old value
> > + * @nr: Bit to set
> > + * @addr: Address to count from
> > + *
> > + * This operation may be reordered on other architectures than
> > x86.
> > + */
> > +static inline int __test_and_set_bit(int nr, volatile void *p)
> > +{
> > +    volatile uint32_t *addr = p;
> > +
> > +    return __test_and_op_bit(or, __NOP, nr, addr);
> > +}
> > +
> > +/**
> > + * __test_and_clear_bit - Clear a bit and return its old value
> > + * @nr: Bit to clear
> > + * @addr: Address to count from
> > + *
> > + * This operation can be reordered on other architectures other
> > than x86.
> > + */
> > +static inline int __test_and_clear_bit(int nr, volatile void *p)
> > +{
> > +    volatile uint32_t *addr = p;
> > +
> > +    return __test_and_op_bit(and, __NOT, nr, addr);
> > +}
> > +
> > +/**
> > + * set_bit - Atomically set a bit in memory
> > + * @nr: the bit to set
> > + * @addr: the address to start counting from
> > + *
> > + * Note: there are no guarantees that this function will not be
> > reordered
> > + * on non x86 architectures, so if you are writing portable code,
> > + * make sure not to rely on its reordering guarantees.
> > + *
> > + * Note that @nr may be almost arbitrarily large; this function is
> > not
> > + * restricted to acting on a single-word quantity.
> > + */
> > +static inline void set_bit(int nr, volatile void *p)
> > +{
> > +    volatile uint32_t *addr = p;
> > +
> > +    __op_bit(or, __NOP, nr, addr);
> > +}
> > +
> > +/**
> > + * clear_bit - Clears a bit in memory
> > + * @nr: Bit to clear
> > + * @addr: Address to start counting from
> > + *
> > + * Note: there are no guarantees that this function will not be
> > reordered
> > + * on non x86 architectures, so if you are writing portable code,
> > + * make sure not to rely on its reordering guarantees.
> > + */
> > +static inline void clear_bit(int nr, volatile void *p)
> > +{
> > +    volatile uint32_t *addr = p;
> > +
> > +    __op_bit(and, __NOT, nr, addr);
> > +}
> > +
> > +#undef __test_and_op_bit
> > +#undef __op_bit
> > +#undef __NOP
> > +#undef __NOT
> > +#undef __AMO
> > +
> > +#define test_and_set_bit   __test_and_set_bit
> > +#define test_and_clear_bit __test_and_clear_bit
> 
> I realize test-and-change have no present users, despite being made
> available by Arm and x86, but I think they would better be provided
> right away, rather than someone introducing a use then needing to
> fiddle with RISC-V (and apparently also PPC) code.
Sure, it makes sense. I'll add test-and-change too.

> 
> I'm also puzzled by this aliasing: Aren't there cheaper non-atomic
> insn forms that could be used for the double-underscore-prefixed
> variants?
It will be cheaper, but I assume that this API should be safe in the
case of SMP where different CPUs can access the same variable or
similar cases with simultaneous access to the variable.

> 
> > +/* Based on linux/include/asm-generic/bitops/find.h */
> > +
> > +#ifndef find_next_bit
> 
> What is this to guard against?
This was copied from Linux, but in case of Xen it can be dropped.

> 
> > +/**
> > + * find_next_bit - find the next set bit in a memory region
> > + * @addr: The address to base the search on
> > + * @offset: The bitnumber to start searching at
> > + * @size: The bitmap size in bits
> > + */
> > +extern unsigned long find_next_bit(const unsigned long *addr,
> > unsigned long
> > +        size, unsigned long offset);
> > +#endif
> > +
> > +#ifndef find_next_zero_bit
> > +/**
> > + * find_next_zero_bit - find the next cleared bit in a memory
> > region
> > + * @addr: The address to base the search on
> > + * @offset: The bitnumber to start searching at
> > + * @size: The bitmap size in bits
> > + */
> > +extern unsigned long find_next_zero_bit(const unsigned long *addr,
> > unsigned
> > +        long size, unsigned long offset);
> > +#endif
> > +
> > +/**
> > + * find_first_bit - find the first set bit in a memory region
> > + * @addr: The address to start the search at
> > + * @size: The maximum size to search
> > + *
> > + * Returns the bit number of the first set bit.
> > + */
> > +extern unsigned long find_first_bit(const unsigned long *addr,
> > +                    unsigned long size);
> > +
> > +/**
> > + * find_first_zero_bit - find the first cleared bit in a memory
> > region
> > + * @addr: The address to start the search at
> > + * @size: The maximum size to search
> > + *
> > + * Returns the bit number of the first cleared bit.
> > + */
> > +extern unsigned long find_first_zero_bit(const unsigned long
> > *addr,
> > +                     unsigned long size);
> > +
> > +/**
> > + * ffs - find first bit in word.
> > + * @word: The word to search
> > + *
> > + * Returns 0 if no bit exists, otherwise returns 1-indexed bit
> > location.
> > + */
> > +static inline unsigned long __ffs(unsigned long word)
> > +{
> > +    int num = 0;
> 
> I understand it's this way in Linux, so there may be reasons to keep
> it
> like that. Generally though we'd prefer unsigned here, and the type
> of
> a variable used for the return value of a function would also better
> be
> in sync with the function's return type (which I don't think needs to
> be "unsigned long" either; "unsigned int" would apparently suffice).
> 
> > +#if BITS_PER_LONG == 64
> > +    if ((word & 0xffffffff) == 0) {
> > +        num += 32;
> > +        word >>= 32;
> > +    }
> 
> You're ending up with neither Xen nor Linux style this way. May I
> suggest to settle on either?

Will it fine to rework header from Linux to Xen style? Does it make
sense?
I think this file can be reworked to Xen style as I don't expect that
it will be changed since it will be merged.

> 
> > +#endif
> > +    if ((word & 0xffff) == 0) {
> > +        num += 16;
> > +        word >>= 16;
> > +    }
> > +    if ((word & 0xff) == 0) {
> > +        num += 8;
> > +        word >>= 8;
> > +    }
> > +    if ((word & 0xf) == 0) {
> > +        num += 4;
> > +        word >>= 4;
> > +    }
> > +    if ((word & 0x3) == 0) {
> > +        num += 2;
> > +        word >>= 2;
> > +    }
> > +    if ((word & 0x1) == 0)
> > +        num += 1;
> > +    return num;
> > +}
> > +
> > +/**
> > + * ffsl - find first bit in long.
> > + * @word: The word to search
> > + *
> > + * Returns 0 if no bit exists, otherwise returns 1-indexed bit
> > location.
> > + */
> > +static inline unsigned int ffsl(unsigned long word)
> > +{
> > +    int num = 1;
> > +
> > +    if (!word)
> > +        return 0;
> > +
> > +#if BITS_PER_LONG == 64
> > +    if ((word & 0xffffffff) == 0) {
> > +        num += 32;
> > +        word >>= 32;
> > +    }
> > +#endif
> > +    if ((word & 0xffff) == 0) {
> > +        num += 16;
> > +        word >>= 16;
> > +    }
> > +    if ((word & 0xff) == 0) {
> > +        num += 8;
> > +        word >>= 8;
> > +    }
> > +    if ((word & 0xf) == 0) {
> > +        num += 4;
> > +        word >>= 4;
> > +    }
> > +    if ((word & 0x3) == 0) {
> > +        num += 2;
> > +        word >>= 2;
> > +    }
> > +    if ((word & 0x1) == 0)
> > +        num += 1;
> > +    return num;
> > +}
> 
> What's RISC-V-specific about the above? IOW why ...
> 
> > +#include <asm-generic/bitops/fls.h>
> > +#include <asm-generic/bitops/flsl.h>
> > +#include <asm-generic/bitops/ffs.h>
> > +#include <asm-generic/bitops/ffz.h>
> 
> ... can't those two also come from respective generic headers?
Sure it can. Originally, I don't introduce it as generic headers
because RISC-V is only one who isn using C generic version.

I found that there is already present such generic function in
xen/bitops.h. I think it makes sense to re-use them.

> 
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/bitops-bits.h
> > @@ -0,0 +1,10 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_BITOPS_BITS_H_
> > +#define _ASM_GENERIC_BITOPS_BITS_H_
> > +
> > +#define BITOP_BITS_PER_WORD     32
> > +#define BITOP_MASK(nr)          (1UL << ((nr) %
> > BITOP_BITS_PER_WORD))
> 
> Why 1UL and not just 1U, when bits per word is 32?
There is no specific reason, should 1U. ( I originally used
BITOPS_BITS_PER_LONG ) and with introduction of asm-generic bitops
decided to follow what other archs provide.

Regarding to the second part of the question, I don't understand it
fully. Considering BITOP_BIT_PER_WORD definition for other archs ( ARM
and PPC ) it is expected that word is 32 bits.

> 
> > +#define BITOP_WORD(nr)          ((nr) / BITOP_BITS_PER_WORD)
> > +#define BITS_PER_BYTE           8
> > +
> > +#endif /* _ASM_GENERIC_BITOPS_BITS_H_ */
> > \ No newline at end of file
> 
> Nit: I did comment on such before (and there's at least one more
> further down).
Thanks.
I'll add a newline.

> 
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/ffs.h
> > @@ -0,0 +1,9 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_BITOPS_FFS_H_
> > +#define _ASM_GENERIC_BITOPS_FFS_H_
> > +
> > +#include <xen/macros.h>
> > +
> > +#define ffs(x) ({ unsigned int t_ = (x); fls(ISOLATE_LSB(t_)); })
> > +
> > +#endif /* _ASM_GENERIC_BITOPS_FFS_H_ */
> > diff --git a/xen/include/asm-generic/bitops/ffz.h
> > b/xen/include/asm-generic/bitops/ffz.h
> > new file mode 100644
> > index 0000000000..92c35455d5
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/ffz.h
> > @@ -0,0 +1,13 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_BITOPS_FFZ_H_
> > +#define _ASM_GENERIC_BITOPS_FFZ_H_
> > +
> > +/*
> > + * ffz - find first zero in word.
> > + * @word: The word to search
> > + *
> > + * Undefined if no zero exists, so code should check against ~0UL
> > first.
> > + */
> > +#define ffz(x)  __ffs(~(x))
> 
> For a generic header I'd like to see consistency: ffz() may expand to
> ffs(), and __ffz() may expand to __ffs(). A mix like you have it
> above
> wants at least explaining in the description. (I don't understand
> anyway why both ffs() and __ffs() would be needed. We don't have any
> __ffs() on x86 afaics.)
Then it looks to me that ffs() should be used instead.
ffz() was defined with __ffs() because Arm and PPC are defined so.
> 
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/find-first-bit-set.h
> 
> This file name, if it really needs to be this long, wants to match
> ...
> 
> > @@ -0,0 +1,17 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_BITOPS_FIND_FIRST_BIT_SET_H_
> > +#define _ASM_GENERIC_BITOPS_FIND_FIRST_BIT_SET_H_
> > +
> > +/**
> > + * find_first_set_bit - find the first set bit in @word
> > + * @word: the word to search
> > + *
> > + * Returns the bit-number of the first set bit (first bit being
> > 0).
> > + * The input must *not* be zero.
> > + */
> > +static inline unsigned int find_first_set_bit(unsigned long word)
> 
> ... the function implemented herein.
Thanks. I'll update the file name.
> 
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/fls.h
> > @@ -0,0 +1,18 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_BITOPS_FLS_H_
> > +#define _ASM_GENERIC_BITOPS_FLS_H_
> > +
> > +/**
> > + * fls - find last (most-significant) bit set
> > + * @x: the word to search
> > + *
> > + * This is defined the same way as ffs.
> > + * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
> > + */
> > +
> > +static inline int fls(unsigned int x)
> > +{
> > +    return generic_fls(x);
> > +}
> 
> Seing this, why would e.g. ffsl() not use generic_ffsl() then?
> 
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/hweight.h
> > @@ -0,0 +1,13 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_BITOPS_HWEIGHT_H_
> > +#define _ASM_GENERIC_BITOPS_HWEIGHT_H_
> > +
> > +/*
> > + * hweightN - returns the hamming weight of a N-bit word
> > + * @x: the word to weigh
> > + *
> > + * The Hamming Weight of a number is the total number of bits set
> > in it.
> > + */
> > +#define hweight64(x) generic_hweight64(x)
> > +
> > +#endif /* _ASM_GENERIC_BITOPS_HWEIGHT_H_ */
> 
> Do we really need this? An arch not having suitable insns (RISC-V
> has,
> iirc) can easily have this one #define (or the ip to four ones when
> also covering the other widths) in its asm header.
> 
> > --- /dev/null
> > +++ b/xen/include/asm-generic/bitops/test-bit.h
> > @@ -0,0 +1,16 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_BITOPS_TESTBIT_H_
> > +#define _ASM_GENERIC_BITOPS_TESTBIT_H_
> > +
> > +/**
> > + * test_bit - Determine whether a bit is set
> > + * @nr: bit number to test
> > + * @addr: Address to start counting from
> > + */
> > +static inline int test_bit(int nr, const volatile void *addr)
> > +{
> > +    const volatile unsigned int *p = addr;
> 
> With BITOP_BITS_PER_WORD I think you really mean uint32_t here.
Isn't it the same: 'unsigned int' and 'uint32_t'?

> Also you want to make sure asm-generic/bitops/bitops-bits.h is
> really in use here, or else an arch overriding / not using that
> header may end up screwed.
I am not really understand what do you mean. Could you please explain a
little bit more.

~ Oleksii
> 
> Jan
> 
> > +    return 1 & (p[BITOP_WORD(nr)] >> (nr & (BITOP_BITS_PER_WORD -
> > 1)));
> > +}
> > +
> > +#endif /* _ASM_GENERIC_BITOPS_TESTBIT_H_ */
> > \ No newline at end of file
> 




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.