
Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h



On Sun, 2024-02-18 at 19:00 +0000, Julien Grall wrote:
> 
> 
> On 05/02/2024 15:32, Oleksii Kurochko wrote:
> > The header was taken from Linux kernel 6.4.0-rc1.
> > 
> > Additionally, the following were updated:
> > * add emulation of {cmp}xchg for 1/2 byte types
> 
> This explanation is a little bit light. IIUC, you are implementing
> them using 32-bit atomic access. Is that correct? If so, please spell
> it out.
Sure, I'll update the commit message.
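
To spell it out here as well: the idea is that the 1/2-byte operand is
accessed through the naturally aligned 32-bit word containing it, and
the bits outside the operand are preserved on each retry. A rough
equivalent in plain C (illustration only: the patch itself uses LR/SC
assembly, the __atomic builtin below merely stands in for it, and the
shift computation relies on RISC-V being little-endian):

    #include <stdint.h>

    static uint8_t xchg_u8_emulated(volatile uint8_t *ptr, uint8_t new)
    {
        /* The naturally aligned 32-bit word containing *ptr. */
        volatile uint32_t *word =
            (volatile uint32_t *)((uintptr_t)ptr & ~(uintptr_t)3);
        unsigned int shift = ((uintptr_t)ptr & 3) * 8;
        uint32_t mask = 0xffU << shift;
        uint32_t old = *word, val;

        do {
            /* Replace only our byte; keep the neighbouring bytes. */
            val = (old & ~mask) | ((uint32_t)new << shift);
            /* On failure, 'old' is refreshed with the current value. */
        } while ( !__atomic_compare_exchange_n(word, &old, val, false,
                                               __ATOMIC_SEQ_CST,
                                               __ATOMIC_SEQ_CST) );

        return (old & mask) >> shift;
    }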

> 
> Also, I wonder whether it would be better to try to get rid of the
> 1/2-byte accesses. Do you know where they are used?
Right now, the issue is with test_and_clear_bool(), which is used in
common/sched/core.c:840
[https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/sched/core.c?ref_type=heads#L840]

I don't remember the details, but in the xen-devel chat someone told me
that the grant table requires 1- and 2-byte accesses.
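
(For reference: IIRC test_and_clear_bool() boils down to an xchg() on a
one-byte bool, roughly

    /* Simplified, from memory; the bool operand is what forces the
       1-byte xchg on RISC-V. */
    #define test_and_clear_bool(b) xchg(&(b), false)

so as long as callers apply it to bool fields, a 1-byte xchg is needed
there.)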

> 
> > * replace tabs with spaces
> Does this mean you are not planning to backport any Linux fixes?
If there are any fixes, I'll backport them for sure, but this code
looks stable enough that not many fixes should land there, so it
shouldn't be hard to backport them and switch to spaces.

> 
> > * replace __* varialbed with *__
> 
> s/varialbed/variable/
> 
> > * introduce generic version of xchg_* and cmpxchg_*.
> > 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
> > ---
> > Changes in V4:
> >   - Code style fixes.
> >   - enforce that new and *ptr have the same type in __xchg_*();
> >     also the "\n" at the end of the asm instruction was removed.
> >   - dependency from
> > https://lore.kernel.org/xen-devel/cover.1706259490.git.federico.serafini@xxxxxxxxxxx/
> >   - switch from ASSERT_UNREACHABLE to STATIC_ASSERT_UNREACHABLE().
> >   - drop xchg32(ptr, x) and xchg64(ptr, x) as they aren't used.
> >   - drop cmpxchg{32,64}_{local} as they aren't used.
> >   - introduce generic version of xchg_* and cmpxchg_*.
> >   - update the commit message.
> > ---
> > Changes in V3:
> >   - update the commit message
> >   - add emulation of {cmp}xchg_... for 1- and 2-byte types
> > ---
> > Changes in V2:
> >   - update the comment at the top of the header.
> >   - change xen/lib.h to xen/bug.h.
> >   - sort inclusion of headers properly.
> > ---
> >   xen/arch/riscv/include/asm/cmpxchg.h | 237 +++++++++++++++++++++++++++
> >   1 file changed, 237 insertions(+)
> >   create mode 100644 xen/arch/riscv/include/asm/cmpxchg.h
> > 
> > diff --git a/xen/arch/riscv/include/asm/cmpxchg.h b/xen/arch/riscv/include/asm/cmpxchg.h
> > new file mode 100644
> > index 0000000000..b751a50cbf
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/cmpxchg.h
> > @@ -0,0 +1,237 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/* Copyright (C) 2014 Regents of the University of California */
> > +
> > +#ifndef _ASM_RISCV_CMPXCHG_H
> > +#define _ASM_RISCV_CMPXCHG_H
> > +
> > +#include <xen/compiler.h>
> > +#include <xen/lib.h>
> > +
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +#define ALIGN_DOWN(addr, size)  ((addr) & (~((size) - 1)))
> > +
> > +#define __amoswap_generic(ptr, new, ret, sfx, release_barrier, acquire_barrier) \
> > +({ \
> > +    asm volatile( \
> > +        release_barrier \
> > +        " amoswap" sfx " %0, %2, %1\n" \
> > +        acquire_barrier \
> > +        : "=r" (ret), "+A" (*ptr) \
> > +        : "r" (new) \
> > +        : "memory" ); \
> > +})
> > +
> > +#define emulate_xchg_1_2(ptr, new, ret, release_barrier, acquire_barrier) \
> > +({ \
> > +    uint32_t *ptr_32b_aligned = (uint32_t *)ALIGN_DOWN((unsigned long)ptr, 4); \
> > +    uint8_t mask_l = ((unsigned long)(ptr) & (0x8 - sizeof(*ptr))) * BITS_PER_BYTE; \
> > +    uint8_t mask_size = sizeof(*ptr) * BITS_PER_BYTE; \
> > +    uint8_t mask_h = mask_l + mask_size - 1; \
> > +    unsigned long mask = GENMASK(mask_h, mask_l); \
> > +    unsigned long new_ = (unsigned long)(new) << mask_l; \
> > +    unsigned long ret_; \
> > +    unsigned long rc; \
> > +    \
> > +    asm volatile( \
> > +        release_barrier \
> > +        "0: lr.d %0, %2\n" \
> 
> I was going to ask why this is lr.d rather than lr.w. But I see Jan 
> already asked. I agree with him that it should probably be a lr.w and
> ...
> 
> > +        "   and  %1, %0, %z4\n" \
> > +        "   or   %1, %1, %z3\n" \
> > +        "   sc.d %1, %1, %2\n" \
> 
> ... respectively sc.w. The same applies for cmpxchg.

I agree that it would be better, and my initial attempt was to handle a
2-byte access crossing a 4-byte (or 8-byte) boundary, e.g. a 2-byte
value occupying bytes 3 and 4:

byte:  0 1 2 3 4 5 6 7 8
value: X X X 1 1 X X X X

In this case, if I align address 3 down to address 0 and then read 4
bytes instead of 8, I will not process the byte at address 4. This was
the reason why I started to read 8 bytes.

I also acknowledge that there could be an issue in the following case:

byte:   ... 4094 4095 4096 ...
value:   X    1    1    X

In this situation, when I read 8 bytes, the new page (which starts at
4096) may not be mapped. It seems correct in this case to check that
the variable is within one page and read 4 bytes instead of 8.

One more thing I am uncertain about: if we change everything to read 4
bytes with 4-byte alignment, what should be done with the first case?
Should we panic? (I am not sure this is an option.) Or should we
perform the operation twice, for the words at addresses 0x0 and 0x4?
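
For what it is worth, here is how I would picture the setup for the
lr.w/sc.w variant (a sketch only; the helper name is made up). Unless I
am missing something, a naturally aligned 1- or 2-byte operand can
never cross a 4-byte boundary, so the first case above can only arise
for a misaligned pointer, which the sketch simply rejects:

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define BITS_PER_BYTE 8

    /*
     * Compute the naturally aligned 32-bit word containing the
     * 1/2-byte operand and the operand's bit offset within that word.
     */
    static inline volatile uint32_t *setup_xchg_small(void *ptr, size_t size,
                                                      unsigned int *shift)
    {
        uintptr_t addr = (uintptr_t)ptr;

        /*
         * A 2-byte operand starting at the last byte of a word (e.g.
         * bytes 3-4 above) would straddle two words; reject it instead
         * of doing two LR/SC sequences.
         */
        assert( ((addr & 3) + size) <= 4 );

        /* E.g. a 1-byte operand at an address ending in ...2 gives
         * shift = 16, i.e. a mask of 0xff0000. */
        *shift = (addr & 3) * BITS_PER_BYTE;

        return (volatile uint32_t *)(addr & ~(uintptr_t)3);
    }

With that guarantee in place, the lr.d/sc.d in the patch could become
lr.w/sc.w on ptr_32b_aligned without the crossing problem arising.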

~ Oleksii