|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6 3/5] lib/arm: Add I/O memory copy helpers
On 01.11.2025 12:56, Oleksii Moisieiev wrote:
> --- /dev/null
> +++ b/xen/include/xen/lib/io.h
> @@ -0,0 +1,83 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Generic I/O memory copy function prototypes.
> + *
> + * These functions provide low-level implementation for copying data between
> + * regular memory and I/O memory regions. Each architecture must provide its
> + * own implementation based on the specific requirements of the
> architecture's
> + * memory model and I/O access patterns.
> + *
> + * Architecture-specific implementations:
> + * =====================================
> + * Each architecture should implement these functions in xen/lib/<arch>/io.c
> + * based on their hardware requirements:
> + *
> + * - ARM/ARM64: Requires special I/O accessors (readl_relaxed,
> writel_relaxed)
> + * with proper memory barriers and alignment handling.
> + * See xen/lib/arm/io.c for implementation.
> + *
> + * - x86/x86_64: I/O memory is directly accessible, so typically uses:
> + * #define memcpy_fromio memcpy
> + * #define memcpy_toio memcpy
> + * See xen/arch/x86/dmi_scan.c for example usage.
I'm not quite sure this is true, especially when now we prefer to use REP MOVSB
there. You don't provide the x86 implementation anyway, so it's not clear
whether
it's a good idea so say anything here about (unclear) future code. (What may be
appropriate in dmi_scan.c, where it's really RAM that is being accessed, may not
be appropriate for actual MMIO.)
> + * - Other architectures (RISC-V, PowerPC, MIPS, etc.): Should provide their
> + * own implementations following the function signatures defined below.
> + *
> + * Naming Convention:
> + * ==================
> + * The double underscore (__) prefix indicates these are low-level "raw"
> + * implementation functions, following the Linux kernel convention for
> + * architecture-specific primitives. This warns developers that these
> + * functions have specific requirements and should not be confused with
> + * regular memcpy().
I disagree here. We really should stop violating name spacing rules set forth
by the spec and/or Misra.
> + */
> +
> +#ifndef _XEN_LIB_IO_H
> +#define _XEN_LIB_IO_H
> +
> +#include <xen/types.h>
> +
> +/*
> + * __memcpy_fromio - Copy data from I/O memory space to regular memory
> + * @to: Destination buffer in regular memory
> + * @from: Source address in I/O memory space (must be marked __iomem)
> + * @count: Number of bytes to copy
> + *
> + * This function handles copying from memory-mapped I/O regions using
> + * architecture-appropriate I/O accessor functions. It ensures proper:
> + * - Memory ordering and barriers
> + * - Alignment requirements
> + * - Hardware-specific access semantics
> + *
> + * Each architecture provides its own implementation that may:
> + * - Use special I/O accessor functions (ARM: readl_relaxed, readb_relaxed)
> + * - Implement alignment handling for devices requiring specific access sizes
> + * - Add memory barriers to ensure ordering with other I/O operations
> + * - Or simply map to memcpy() if the architecture allows direct I/O access
> + */
> +extern void __memcpy_fromio(void *to, const volatile void __iomem *from,
> + size_t count);
> +
> +/*
> + * __memcpy_toio - Copy data from regular memory to I/O memory space
> + * @to: Destination address in I/O memory space (must be marked __iomem)
> + * @from: Source buffer in regular memory
> + * @count: Number of bytes to copy
> + *
> + * This function handles copying to memory-mapped I/O regions using
> + * architecture-appropriate I/O accessor functions. It ensures proper:
> + * - Memory ordering and barriers
> + * - Alignment requirements
> + * - Hardware-specific access semantics
> + *
> + * Each architecture provides its own implementation that may:
> + * - Use special I/O accessor functions (ARM: writel_relaxed, writeb_relaxed)
> + * - Implement alignment handling for devices requiring specific access sizes
> + * - Add memory barriers to ensure ordering with other I/O operations
> + * - Or simply map to memcpy() if the architecture allows direct I/O access
> + */
> +extern void __memcpy_toio(volatile void __iomem *to, const void *from,
> + size_t count);
> +
> +#endif /* _XEN_LIB_IO_H */
There are no provisions here for an arch using macro aliasing, as you suggest in
the comment further up for x86.
> --- /dev/null
> +++ b/xen/lib/arm/io.c
> @@ -0,0 +1,102 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#include <asm/io.h>
> +#include <xen/lib/io.h>
> +
> +/*
> + * These functions use 32-bit (uint32_t) IO operations rather than 64-bit for
> + * the following reasons:
> + *
> + * 1. ARM32/ARM64 compatibility: On ARM32, there is no atomic 64-bit IO
> accessor
> + * (readq_relaxed). Only readq_relaxed_non_atomic() exists, which
> internally
> + * performs two separate 32-bit reads. Using it would not provide any
> + * performance benefit and could introduce ordering issues.
> + *
> + * 2. Hardware compatibility: Many IO devices only support 32-bit aligned
> accesses.
> + * 64-bit accesses might not be supported or could cause bus errors on
> some
> + * hardware.
> + *
> + * 3. Simplicity: Using 32-bit operations keeps the code simple,
> maintainable,
> + * and consistent across both ARM32 and ARM64 architectures without
> + * architecture-specific conditionals.
> + *
> + * The performance difference between 32-bit and 64-bit operations in this
> + * context is negligible compared to the IO access latency itself.
> + */
> +
> +/*
> + * memcpy_fromio - Copy data from IO memory space to "real" memory space.
> + * @to: Where to copy to
> + * @from: Where to copy from
> + * @count: The size of the area.
> + */
> +void __memcpy_fromio(void *to, const volatile void __iomem *from,
> + size_t count)
> +{
> + while ( count && !IS_ALIGNED((unsigned long)from, 4) )
> + {
> + *(uint8_t *)to = readb_relaxed(from);
> + from++;
> + to++;
> + count--;
> + }
> +
> + while ( count >= 4 )
> + {
> + *(uint32_t *)to = readl_relaxed(from);
Is this going to be fine on Arm32, when "to" doesn't point at a 4-byte-aligned
location?
> + from += 4;
> + to += 4;
> + count -= 4;
> + }
> +
> + while ( count )
> + {
> + *(uint8_t *)to = readb_relaxed(from);
> + from++;
> + to++;
> + count--;
> + }
> +}
> +
> +/*
> + * memcpy_toio - Copy data from "real" memory space to IO memory space.
> + * @to: Where to copy to
> + * @from: Where to copy from
> + * @count: The size of the area.
> + */
> +void __memcpy_toio(volatile void __iomem *to, const void *from,
> + size_t count)
> +{
> + while ( count && !IS_ALIGNED((unsigned long)to, 4) )
> + {
> + writeb_relaxed(*(const uint8_t *)from, to);
> + from++;
> + to++;
> + count--;
> + }
> +
> + while ( count >= 4 )
> + {
> + writel_relaxed(*(const uint32_t *)from, to);
> + from += 4;
> + to += 4;
> + count -= 4;
> + }
> +
> + while ( count )
> + {
> + writeb_relaxed(*(const uint8_t *)from, to);
> + from++;
> + to++;
> + count--;
> + }
> +}
Is use of {read,write}l_relaxed() appropriate here anyway, when those
functions do endian-ness conversions (which memcpy()-like functions clearly
shouldn't do)?
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |