[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3 13/34] xen/riscv: introduce cmpxchg.h
On Mon, 2024-01-22 at 17:27 +0100, Jan Beulich wrote: > On 22.12.2023 16:12, Oleksii Kurochko wrote: > > --- /dev/null > > +++ b/xen/arch/riscv/include/asm/cmpxchg.h > > @@ -0,0 +1,496 @@ > > +/* SPDX-License-Identifier: GPL-2.0-only */ > > +/* Copyright (C) 2014 Regents of the University of California */ > > + > > +#ifndef _ASM_RISCV_CMPXCHG_H > > +#define _ASM_RISCV_CMPXCHG_H > > + > > +#include <xen/compiler.h> > > +#include <xen/lib.h> > > + > > +#include <asm/fence.h> > > +#include <asm/io.h> > > +#include <asm/system.h> > > + > > +#define __xchg_relaxed(ptr, new, size) \ > > +({ \ > > + __typeof__(ptr) ptr__ = (ptr); \ > > + __typeof__(new) new__ = (new); \ > > + __typeof__(*(ptr)) ret__; \ > > I expect the types of new and *ptr want to actually match. Which > you then want to enforce, so that issues at use sites would either > be reported by the compiler, or be permitted by a type conversion > of new. I am OK to make the same type for new and ret__, but it looks like __xchg_relaxed() is used only inside xchg_relaxed() where 'new' is declared with the same type as *ptr. > > > + switch (size) \ > > + { \ > > Nit: Hard tab left here. (Also again you want to either stick to > Linux style or fully switch to Xen style.) Thanks. I'll update that. > > > + case 4: \ > > + asm volatile( \ > > + " amoswap.w %0, %2, %1\n" \ > > I don't think a leading tab (or leading whitespace in general) is > needed in single-line-output asm()s. The trailing \n also isn't > needed if I'm not mistaken. I just wanted to align it with "=r", but for sure a leading tab or whitespace can be dropped. > > > + : "=r" (ret__), "+A" (*ptr__) \ > > + : "r" (new__) \ > > + : "memory" ); \ > > + break; \ > > + case 8: \ > > + asm volatile( \ > > + " amoswap.d %0, %2, %1\n" \ > > + : "=r" (ret__), "+A" (*ptr__) \ > > + : "r" (new__) \ > > + : "memory" ); \ > > + break; \ > > + default: \ > > + ASSERT_UNREACHABLE(); \ > > If at all possible this wants to trigger a build failure, not a > runtime > one. I'll update that with BUILD_BUG_ON(size > 8); > > > + } \ > > + ret__; \ > > +}) > > + > > +#define xchg_relaxed(ptr, x) \ > > +({ \ > > + __typeof__(*(ptr)) x_ = (x); \ > > + (__typeof__(*(ptr))) __xchg_relaxed((ptr), x_, > > sizeof(*(ptr))); \ > > Nit: Stray blank after cast. For readability I'd also suggest to > drop parentheses in cases like the first argument passed to > __xchg_relaxed() here. Thanks. I'll take that into account. > > > +}) > > For both: What does "relaxed" describe? I'm asking because it's not > really clear whether the memory clobbers are actually needed. > > > +#define __xchg_acquire(ptr, new, size) \ > > +({ \ > > + __typeof__(ptr) ptr__ = (ptr); \ > > + __typeof__(new) new__ = (new); \ > > + __typeof__(*(ptr)) ret__; \ > > + switch (size) \ > > + { \ > > + case 4: \ > > + asm volatile( \ > > + " amoswap.w %0, %2, %1\n" \ > > + RISCV_ACQUIRE_BARRIER \ > > + : "=r" (ret__), "+A" (*ptr__) \ > > + : "r" (new__) \ > > + : "memory" ); \ > > + break; \ > > + case 8: \ > > + asm volatile( \ > > + " amoswap.d %0, %2, %1\n" \ > > + RISCV_ACQUIRE_BARRIER \ > > + : "=r" (ret__), "+A" (*ptr__) \ > > + : "r" (new__) \ > > + : "memory" ); \ > > + break; \ > > + default: \ > > + ASSERT_UNREACHABLE(); \ > > + } \ > > + ret__; \ > > +}) > > If I'm not mistaken this differs from __xchg_relaxed() only in the > use > of RISCV_ACQUIRE_BARRIER, and ... > > > +#define xchg_acquire(ptr, x) \ > > +({ \ > > + __typeof__(*(ptr)) x_ = (x); \ > > + (__typeof__(*(ptr))) __xchg_acquire((ptr), x_, > > sizeof(*(ptr))); \ > > +}) > > + > > +#define __xchg_release(ptr, new, size) \ > > +({ \ > > + __typeof__(ptr) ptr__ = (ptr); \ > > + __typeof__(new) new__ = (new); \ > > + __typeof__(*(ptr)) ret__; \ > > + switch (size) \ > > + { \ > > + case 4: \ > > + asm volatile ( \ > > + RISCV_RELEASE_BARRIER \ > > + " amoswap.w %0, %2, %1\n" \ > > + : "=r" (ret__), "+A" (*ptr__) \ > > + : "r" (new__) \ > > + : "memory"); \ > > + break; \ > > + case 8: \ > > + asm volatile ( \ > > + RISCV_RELEASE_BARRIER \ > > + " amoswap.d %0, %2, %1\n" \ > > + : "=r" (ret__), "+A" (*ptr__) \ > > + : "r" (new__) \ > > + : "memory"); \ > > + break; \ > > + default: \ > > + ASSERT_UNREACHABLE(); \ > > + } \ > > + ret__; \ > > +}) > > this only in the use of RISCV_RELEASE_BARRIER. If so they likely want > folding, to limit redundancy and make eventual updating easier. (Same > for the cmpxchg helper further down, as it seems.) Yes, you are right. The difference is only in RISCV_RELEASE_BARRIER and it is an absence of RISCV_RELEASE_BARRIER is a reason why we have "relaxed" versions. I am not sure that I understand what you mean by folding here. Do you mean that there is no any sense to have to separate macros and it is needed only one with RISCV_RELEASE_BARRIER? > > > +#define xchg_release(ptr, x) \ > > +({ \ > > + __typeof__(*(ptr)) x_ = (x); \ > > + (__typeof__(*(ptr))) __xchg_release((ptr), x_, > > sizeof(*(ptr))); \ > > +}) > > + > > +static always_inline uint32_t __xchg_case_4(volatile uint32_t > > *ptr, > > + uint32_t new) > > +{ > > + __typeof__(*(ptr)) ret; > > + > > + asm volatile ( > > + " amoswap.w.aqrl %0, %2, %1\n" > > + : "=r" (ret), "+A" (*ptr) > > + : "r" (new) > > + : "memory" ); > > + > > + return ret; > > +} > > + > > +static always_inline uint64_t __xchg_case_8(volatile uint64_t > > *ptr, > > + uint64_t new) > > +{ > > + __typeof__(*(ptr)) ret; > > + > > + asm volatile( \ > > + " amoswap.d.aqrl %0, %2, %1\n" \ > > + : "=r" (ret), "+A" (*ptr) \ > > + : "r" (new) \ > > + : "memory" ); \ > > + > > + return ret; > > +} > > + > > +static always_inline unsigned short __cmpxchg_case_2(volatile > > uint32_t *ptr, > > + uint32_t old, > > + uint32_t > > new); > > Don't you consistently mean uint16_t here (incl the return type) and > ... > > > +static always_inline unsigned short __cmpxchg_case_1(volatile > > uint32_t *ptr, > > + uint32_t old, > > + uint32_t > > new); > > ... uint8_t here? The idea was that we emulate __cmpxchg_case_1 and __cmpxchg_case_2 using 4 bytes cmpxchg and __cmpxchg_case_4 expects uint32_t. > > > +static inline unsigned long __xchg(volatile void *ptr, unsigned > > long x, int size) > > +{ > > + switch (size) { > > + case 1: > > + return __cmpxchg_case_1(ptr, (uint32_t)-1, x); > > + case 2: > > + return __cmpxchg_case_2(ptr, (uint32_t)-1, x); > > How are these going to work? You'll compare against ~0, and if the > value > in memory isn't ~0, memory won't be updated; you will only > (correctly) > return the value found in memory. > > Or wait - looking at __cmpxchg_case_{1,2}() far further down, you > ignore > "old" there. Which apparently means they'll work for the use here, > but > not for the use in __cmpxchg(). Yes, the trick is that old is ignored and is read in __emulate_cmpxchg_case1_2() before __cmpxchg_case_4 is called: do { read_val = read_func(aligned_ptr); swapped_new = read_val & ~mask; swapped_new |= masked_new; ret = cmpxchg_func(aligned_ptr, read_val, swapped_new); } while ( ret != read_val ); read_val it is 'old'. But now I am not 100% sure that it is correct for __cmpxchg... > > > + case 4: > > + return __xchg_case_4(ptr, x); > > + case 8: > > + return __xchg_case_8(ptr, x); > > + default: > > + ASSERT_UNREACHABLE(); > > + } > > + > > + return -1; > > +} > > + > > +#define xchg(ptr,x) \ > > +({ \ > > + __typeof__(*(ptr)) ret__; \ > > + ret__ = (__typeof__(*(ptr))) \ > > + __xchg((ptr), (unsigned long)(x), sizeof(*(ptr))); \ > > + ret__; \ > > +}) > > + > > +#define xchg32(ptr, x) \ > > +({ \ > > + BUILD_BUG_ON(sizeof(*(ptr)) != 4); \ > > + xchg((ptr), (x)); \ > > +}) > > + > > +#define xchg64(ptr, x) \ > > +({ \ > > + BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ > > + xchg((ptr), (x)); \ > > +}) > > What are these two (and their cmpxchg counterparts) needed for? It was copied from Linux, but it looks like they can be really dropped, I can't find where it is used, so I'll drop them. > > > +/* > > + * Atomic compare and exchange. Compare OLD with MEM, if > > identical, > > + * store NEW in MEM. Return the initial value in MEM. Success is > > + * indicated by comparing RETURN with OLD. > > + */ > > +#define __cmpxchg_relaxed(ptr, old, new, size) \ > > +({ \ > > + __typeof__(ptr) ptr__ = (ptr); \ > > + __typeof__(*(ptr)) __old = (old); \ > > Leftover leading underscores? > > > + __typeof__(*(ptr)) new__ = (new); \ > > Related to my earlier comment on types needing to be compatible - see > how here you're using "ptr" throughout. > > > + __typeof__(*(ptr)) ret__; \ > > + register unsigned int __rc; \ > > More leftover leading underscores? Missed to drop underscores. Thanks. I'll drop them in next version. > > > +static always_inline unsigned short __cmpxchg_case_2(volatile > > uint32_t *ptr, > > + uint32_t old, > > + uint32_t new) > > +{ > > + (void) old; > > + > > + if (((unsigned long)ptr & 3) == 3) > > + { > > +#ifdef CONFIG_64BIT > > + return __emulate_cmpxchg_case1_2((uint64_t *)ptr, new, > > + readq, __cmpxchg_case_8, > > 0xffffU); > > What if ((unsigned long)ptr & 7) == 7 (which is a sub-case of what > the > if() above checks for? Isn't it more reasonable to require aligned > 16-bit quantities here? Or if mis-aligned addresses are okay, you > could > as well emulate using __cmpxchg_case_4(). Yes, it will be more reasonable. I'll use IS_ALIGNED instead. > > Also you shouldn't be casting away volatile (here and below). > Avoiding > the casts (by suitable using volatile void * parameter types) would > likely be best. Thanks. I'll update the function prototype. ~ Oleksii
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |