[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 3/5] lib/arm: Add I/O memory copy helpers


  • To: Oleksii Moisieiev <Oleksii_Moisieiev@xxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 6 Nov 2025 11:25:00 +0100
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 06 Nov 2025 10:25:09 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01.11.2025 12:56, Oleksii Moisieiev wrote:
> --- /dev/null
> +++ b/xen/include/xen/lib/io.h
> @@ -0,0 +1,83 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Generic I/O memory copy function prototypes.
> + *
> + * These functions provide low-level implementation for copying data between
> + * regular memory and I/O memory regions. Each architecture must provide its
> + * own implementation based on the specific requirements of the 
> architecture's
> + * memory model and I/O access patterns.
> + *
> + * Architecture-specific implementations:
> + * =====================================
> + * Each architecture should implement these functions in xen/lib/<arch>/io.c
> + * based on their hardware requirements:
> + *
> + * - ARM/ARM64: Requires special I/O accessors (readl_relaxed, 
> writel_relaxed)
> + *              with proper memory barriers and alignment handling.
> + *              See xen/lib/arm/io.c for implementation.
> + *
> + * - x86/x86_64: I/O memory is directly accessible, so typically uses:
> + *               #define memcpy_fromio memcpy
> + *               #define memcpy_toio   memcpy
> + *               See xen/arch/x86/dmi_scan.c for example usage.

I'm not quite sure this is true, especially when now we prefer to use REP MOVSB
there. You don't provide the x86 implementation anyway, so it's not clear 
whether
it's a good idea so say anything here about (unclear) future code. (What may be
appropriate in dmi_scan.c, where it's really RAM that is being accessed, may not
be appropriate for actual MMIO.)

> + * - Other architectures (RISC-V, PowerPC, MIPS, etc.): Should provide their
> + *   own implementations following the function signatures defined below.
> + *
> + * Naming Convention:
> + * ==================
> + * The double underscore (__) prefix indicates these are low-level "raw"
> + * implementation functions, following the Linux kernel convention for
> + * architecture-specific primitives. This warns developers that these
> + * functions have specific requirements and should not be confused with
> + * regular memcpy().

I disagree here. We really should stop violating name spacing rules set forth
by the spec and/or Misra.

> + */
> +
> +#ifndef _XEN_LIB_IO_H
> +#define _XEN_LIB_IO_H
> +
> +#include <xen/types.h>
> +
> +/*
> + * __memcpy_fromio - Copy data from I/O memory space to regular memory
> + * @to: Destination buffer in regular memory
> + * @from: Source address in I/O memory space (must be marked __iomem)
> + * @count: Number of bytes to copy
> + *
> + * This function handles copying from memory-mapped I/O regions using
> + * architecture-appropriate I/O accessor functions. It ensures proper:
> + * - Memory ordering and barriers
> + * - Alignment requirements
> + * - Hardware-specific access semantics
> + *
> + * Each architecture provides its own implementation that may:
> + * - Use special I/O accessor functions (ARM: readl_relaxed, readb_relaxed)
> + * - Implement alignment handling for devices requiring specific access sizes
> + * - Add memory barriers to ensure ordering with other I/O operations
> + * - Or simply map to memcpy() if the architecture allows direct I/O access
> + */
> +extern void __memcpy_fromio(void *to, const volatile void __iomem *from,
> +                            size_t count);
> +
> +/*
> + * __memcpy_toio - Copy data from regular memory to I/O memory space
> + * @to: Destination address in I/O memory space (must be marked __iomem)
> + * @from: Source buffer in regular memory
> + * @count: Number of bytes to copy
> + *
> + * This function handles copying to memory-mapped I/O regions using
> + * architecture-appropriate I/O accessor functions. It ensures proper:
> + * - Memory ordering and barriers
> + * - Alignment requirements
> + * - Hardware-specific access semantics
> + *
> + * Each architecture provides its own implementation that may:
> + * - Use special I/O accessor functions (ARM: writel_relaxed, writeb_relaxed)
> + * - Implement alignment handling for devices requiring specific access sizes
> + * - Add memory barriers to ensure ordering with other I/O operations
> + * - Or simply map to memcpy() if the architecture allows direct I/O access
> + */
> +extern void __memcpy_toio(volatile void __iomem *to, const void *from,
> +                          size_t count);
> +
> +#endif /* _XEN_LIB_IO_H */

There are no provisions here for an arch using macro aliasing, as you suggest in
the comment further up for x86.

> --- /dev/null
> +++ b/xen/lib/arm/io.c
> @@ -0,0 +1,102 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#include <asm/io.h>
> +#include <xen/lib/io.h>
> +
> +/*
> + * These functions use 32-bit (uint32_t) IO operations rather than 64-bit for
> + * the following reasons:
> + *
> + * 1. ARM32/ARM64 compatibility: On ARM32, there is no atomic 64-bit IO 
> accessor
> + *    (readq_relaxed). Only readq_relaxed_non_atomic() exists, which 
> internally
> + *    performs two separate 32-bit reads. Using it would not provide any
> + *    performance benefit and could introduce ordering issues.
> + *
> + * 2. Hardware compatibility: Many IO devices only support 32-bit aligned 
> accesses.
> + *    64-bit accesses might not be supported or could cause bus errors on 
> some
> + *    hardware.
> + *
> + * 3. Simplicity: Using 32-bit operations keeps the code simple, 
> maintainable,
> + *    and consistent across both ARM32 and ARM64 architectures without
> + *    architecture-specific conditionals.
> + *
> + * The performance difference between 32-bit and 64-bit operations in this
> + * context is negligible compared to the IO access latency itself.
> + */
> +
> +/*
> + * memcpy_fromio - Copy data from IO memory space to "real" memory space.
> + * @to: Where to copy to
> + * @from: Where to copy from
> + * @count: The size of the area.
> + */
> +void __memcpy_fromio(void *to, const volatile void __iomem *from,
> +                     size_t count)
> +{
> +    while ( count && !IS_ALIGNED((unsigned long)from, 4) )
> +    {
> +        *(uint8_t *)to = readb_relaxed(from);
> +        from++;
> +        to++;
> +        count--;
> +    }
> +
> +    while ( count >= 4 )
> +    {
> +        *(uint32_t *)to = readl_relaxed(from);

Is this going to be fine on Arm32, when "to" doesn't point at a 4-byte-aligned
location?

> +        from += 4;
> +        to += 4;
> +        count -= 4;
> +    }
> +
> +    while ( count )
> +    {
> +        *(uint8_t *)to = readb_relaxed(from);
> +        from++;
> +        to++;
> +        count--;
> +    }
> +}
> +
> +/*
> + * memcpy_toio - Copy data from "real" memory space to IO memory space.
> + * @to: Where to copy to
> + * @from: Where to copy from
> + * @count: The size of the area.
> + */
> +void __memcpy_toio(volatile void __iomem *to, const void *from,
> +                   size_t count)
> +{
> +    while ( count && !IS_ALIGNED((unsigned long)to, 4) )
> +    {
> +        writeb_relaxed(*(const uint8_t *)from, to);
> +        from++;
> +        to++;
> +        count--;
> +    }
> +
> +    while ( count >= 4 )
> +    {
> +        writel_relaxed(*(const uint32_t *)from, to);
> +        from += 4;
> +        to += 4;
> +        count -= 4;
> +    }
> +
> +    while ( count )
> +    {
> +        writeb_relaxed(*(const uint8_t *)from, to);
> +        from++;
> +        to++;
> +        count--;
> +    }
> +}

Is use of {read,write}l_relaxed() appropriate here anyway, when those
functions do endian-ness conversions (which memcpy()-like functions clearly
shouldn't do)?

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.