[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: GICv3: Aarch32: Need guidance on the atomic access of "union host_lpi" or if ITS is supported on R52



Hi,

On 28/10/2022 13:49, Andre Przywara wrote:
On Fri, 28 Oct 2022 12:44:08 +0100
Ayan Kumar Halder <ayankuma@xxxxxxx> wrote:

On 27/10/2022 15:36, Andre Przywara wrote:
On Thu, 27 Oct 2022 14:38:52 +0100
Ayan Kumar Halder <ayankuma@xxxxxxx> wrote:

Hi Ayan,
Hi Andre / Julien,
On 25/10/2022 14:55, Andre Przywara wrote:
On Tue, 25 Oct 2022 13:25:52 +0100
Ayan Kumar Halder <ayankuma@xxxxxxx> wrote:

Hi,
Hi Andre,

Many thanks for the explanation.

I need a clarification on the issue of atomic access to 64bit normal
memory on R52.
Hi Andre/All,

This came up while porting Xen on R52.

Refer "ARM DDI 0568A.cID110520", B1.3.1

"A Generic Interrupt Controller (GIC) implemented with an Armv8-R PE
must not implement LPI support. "

Does this mean ITS is not supported on R52 ? I am assuming yes, please
correct me if mistaken.
An ITS relies on LPIs, so yes: no ITS on a v8-R32 system. I cannot find
this restriction anymore in the v8-R64 supplement, so it would only apply
to the R52/AArch32.
If the answer is no, then my next query is follows :-
Answering to that anyway ...
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 9ca74bc321..dea60aac0d 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -423,7 +423,7 @@ int gicv3_lpi_init_host_lpis(unsigned int host_lpi_bits)
         int rc;

         /* We rely on the data structure being atomically accessible. */
-    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(unsigned long));
+    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(uint64_t));

"unsigned long" on Aarch32 is 32 bits. So this bug gets triggered.

Is it fine to change it as above ?

Reading"ARM DDI 0487G.bID07202", E2.2.1, "Requirements for single-copy
atomicity".

"LDRD and STRD accesses to 64-bit aligned locations are 64-bit
single-copy atomic as seen by translation table walks and accesses to
translation tables"
This (and the explaining paragraph) indeed suggests that this should
work architecturally, if you use normal system memory (where you would
hold page tables). It would be confined to ARMv8 AArch32 and ARMv7 w/
LPAE, which matches Xen's requirements.
Does it mean that ldrd/strd will not work atomically on AArch32-v8R as
it uses MPU (not MMU, so no page tables) ?
No, this mentioning of page tables is more an example or a rationale, than
a requirement.
What this means (in the ARMv7-A/ARMv8-A AArch32 context) it:
Because on v7A-LPAE and v8-AArch32 PTEs are 64 bits wide, it's too painful
to use explicit locking to make sure just writing one PTE is atomic. So
the architecture demands that 64-bit aligned accesses using ldrd/strd
are single-copy atomic, so software can update just one PTE easily. But
this is only required for locations where page tables typically reside, so
system memory. This avoids this 64-bit atomicity requirement for *every*
part of the system, for instance separate buses, SRAM or flash on smaller
buses, or MMIO in general.

I don't find anything in the v8-R32 supplement that would step back from
this requirement, although indeed the original reason (atomic PTE writes)
would not apply to v8-R32. Both the LDRD/STRD description and the section
listing differences in the system memory architecture do not mention
anything, so I'd say that the ldrd atomicity requirement still holds.

Please note that this only applies to ARMv7 *LPAE* systems, but Xen
requires LPAE, and R52 is v8, so we are good, and the Xen code can rely on
this.

So for Xen on ARMv8-R32:
*LDRD/STRD* accesses to *64-bit aligned* addresses in *RAM* would be
atomic. You need to satisfy all three requirements:
- You must use ldrd/strd. Just dereferencing a uint64_t pointer in C does
not guarantee that, but read_atomic()/write_atomic() does.
- It must be 64-bit aligned. Shouldn't be a problem if the data type is
64 bits wide. Please note the slight nastiness that ldrd would silently
work on non-aligned addresses, but would lose the atomicity guarantee.
ldrexd would always fault if the address is not aligned.
We might want to check the alignment of data we access (assert?), if not
done already.

For Arm32, we have configured the processor to enforce the alignment for all store/load. So I believe, it means a ldrd/strd on an unaligned address would crash at least on arm32.

We should ensure this will also be the case for R52 (assuming that strd/ldrd are considered atomic).

- It must be in system RAM, so not MMIO. Also I think TCM might be a
different story, but I would hope Xen would not use that directly.
Many thanks for the nice explanation.

I am trying to compare this with the atomicity requirement for AArch64
(ARM DDI 0487G.b ID072021, B2.2.1 Requirements for single-copy atomicit )

I seethat the alignment requirement is the same as for ARMv8-R32.

"-A read that is generated by a load instruction that loads a single
general-purpose register and is aligned to the size of the read in the
instruction is single-copy atomic.

-A write that is generated by a store instruction that stores a single
general-purpose register and is aligned to the size of the write in the
instruction is single-copy atomic"

I think the following code change should help us to confirm the correct
behavior of atomic read/write on both AArch64 and AArch32 (including R52).

diff --git a/xen/arch/arm/include/asm/atomic.h
b/xen/arch/arm/include/asm/atomic.h
index ac2798d095..f22c65a853 100644
--- a/xen/arch/arm/include/asm/atomic.h
+++ b/xen/arch/arm/include/asm/atomic.h
@@ -78,6 +78,7 @@ static always_inline void read_atomic_size(const
volatile void *p,
                                              void *res,
                                              unsigned int size)
   {
+    ASSERT(IS_ALIGNED((unsigned long)p, size));

Please use vaddr_t instead of "unsigned long".

       switch ( size )
       {
       case 1:
@@ -102,6 +103,7 @@ static always_inline void write_atomic_size(volatile
void *p,
                                               void *val,
                                               unsigned int size)
   {
+    ASSERT(IS_ALIGNED((unsigned long)p, size));
       switch ( size )
       {
       case 1:

Please let me know if I misunderstand something.

Yes, that looks correct. Even the more "simple" atomic accesses always
require alignment, so if you do an unaligned 32-bit read in AArch64, it
wouldn't be single-copy atomic either.

This patch looks fine as a way to enforce the address is always aligned. And that would be a nice improvement for arm64. On arm32 it should not matter (see above why).

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.