[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen
> On 19 May 2023, at 15:46, Julien Grall <julien@xxxxxxx> wrote: > > Hi Luca, > > On 19/05/2023 15:26, Luca Fancellu wrote: >>> On 18 May 2023, at 10:35, Julien Grall <julien@xxxxxxx> wrote: >>>> /* >>>> * Comment from Linux: >>>> * Userspace may perform DC ZVA instructions. Mismatched block sizes >>>> diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S >>>> new file mode 100644 >>>> index 000000000000..4d1549344733 >>>> --- /dev/null >>>> +++ b/xen/arch/arm/arm64/sve-asm.S >>>> @@ -0,0 +1,48 @@ >>>> +/* SPDX-License-Identifier: GPL-2.0-only */ >>>> +/* >>>> + * Arm SVE assembly routines >>>> + * >>>> + * Copyright (C) 2022 ARM Ltd. >>>> + * >>>> + * Some macros and instruction encoding in this file are taken from linux >>>> 6.1.1, >>>> + * file arch/arm64/include/asm/fpsimdmacros.h, some of them are a modified >>>> + * version. >>> AFAICT, the only modified version is _sve_rdvl, but it is not clear to me >>> why we would want to have a modified version? >>> >>> I am asking this because without an explanation, it would be difficult to >>> know how to re-sync the code with Linux. >> In this patch the macros are exactly equal to Linux, apart from the coding >> style that uses spaces instead of tabs, >> I was not expecting to keep them in sync as they seems to be not prone to >> change soon, let me know if I need to >> use also tabs and be 100% equal to Linux. > > The file is small enough, so I think it would be OK if this is converted to > the Xen coding style. > >> The following macros that are coming in patch 5 are equal apart from >> sve_save/sve_load, that are different because >> of the construction differences between the storage buffers here and in >> Linux, if you want I can put a comment on them >> to explain this difference in patch 5 > > That would be good. Also, can you update arch/arm/README.LinuxPrimitives? The > file is listing primitives imported from Linux and when. Sure I will > >>>> >>>> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c >>>> new file mode 100644 >>>> index 000000000000..6f3fb368c59b >>>> --- /dev/null >>>> +++ b/xen/arch/arm/arm64/sve.c >>>> @@ -0,0 +1,50 @@ >>>> +/* SPDX-License-Identifier: GPL-2.0 */ >>> >>> Above, you are using GPL-2.0-only, but here GPL-2.0. We favor the former >>> now. Happy to deal it on commit if there is nothing else to address. >> No problem, I will fix in the next push >>> >>>> +/* >>>> + * Arm SVE feature code >>>> + * >>>> + * Copyright (C) 2022 ARM Ltd. >>>> + */ >>>> + >>>> +#include <xen/types.h> >>>> +#include <asm/arm64/sve.h> >>>> +#include <asm/arm64/sysregs.h> >>>> +#include <asm/processor.h> >>>> +#include <asm/system.h> >>>> + >>>> +extern unsigned int sve_get_hw_vl(void); >>>> + >>>> +register_t compute_max_zcr(void) >>>> +{ >>>> + register_t cptr_bits = get_default_cptr_flags(); >>>> + register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS); >>>> + unsigned int hw_vl; >>>> + >>>> + /* Remove trap for SVE resources */ >>>> + WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2); >>>> + isb(); >>>> + >>>> + /* >>>> + * Set the maximum SVE vector length, doing that we will know the VL >>>> + * supported by the platform, calling sve_get_hw_vl() >>>> + */ >>>> + WRITE_SYSREG(zcr, ZCR_EL2); >>> >>> From my reading of the Arm (D19-6331, ARM DDI 0487J.a), a direct write to a >>> system register would need to be followed by an context synchronization >>> event (e.g. isb()) before the software can rely on the value. >>> >>> In this situation, AFAICT, the instruciton in sve_get_hw_vl() will use the >>> content of ZCR_EL2. So don't we need an ISB() here? >> From what I’ve read in the manual for ZCR_ELx: >> An indirect read of ZCR_EL2.LEN appears to occur in program order relative >> to a direct write of >> the same register, without the need for explicit synchronization >> I’ve interpreted it as “there is no need to sync before write” and I’ve >> looked into Linux and it does not >> Appear any synchronisation mechanism after a write to that register, but if >> I am wrong I can for sure >> add an isb if you prefer. > > Ah, I was reading the generic section about synchronization and didn't > realize there was a paragraph in the ZCR_EL2 section as well. > > Reading the new section, I agree with your understanding. The isb() is not > necessary. > > So please ignore this comment :). > >>>> /* XXX MPU */ >>>> /* Fault Status */ >>>> @@ -234,6 +231,7 @@ static void ctxt_switch_to(struct vcpu *n) >>>> p2m_restore_state(n); >>>> /* Control Registers */ >>>> + WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2); >>> >>> I would prefer if this called closer to vfp_restore_state(). So the >>> dependency between the two is easier to spot. >>> >>>> WRITE_SYSREG(n->arch.cpacr, CPACR_EL1); >>>> /* >>>> @@ -258,6 +256,9 @@ static void ctxt_switch_to(struct vcpu *n) >>>> #endif >>>> isb(); >>>> + /* VFP */ >>> >>> Please document in the code that vfp_restore_state() have to be called >>> after CPTR_EL2() + a synchronization event. >>> >>> Similar docoumentation on top of at least CPTR_EL2 and possibly isb(). This >>> would help if we need to re-order the code in the future. >> I will put comments on top of CPTR_EL2 and vfp_restore_state to explain the >> sequence and the synchronisation. > > Just to clarify, does this mean you will keep CPTR_EL2 where it currently is? > (See my comment just above in the previous e-mail) This is how I changed the code: /* Control Registers */ /* * CPTR_EL2 needs to be written before calling vfp_restore_state, a * synchronization instruction is expected after the write (isb) */ WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2); WRITE_SYSREG(n->arch.cpacr, CPACR_EL1); /* * This write to sysreg CONTEXTIDR_EL1 ensures we don't hit erratum * #852523 (Cortex-A57) or #853709 (Cortex-A72). * I.e DACR32_EL2 is not correctly synchronized. */ WRITE_SYSREG(n->arch.contextidr, CONTEXTIDR_EL1); WRITE_SYSREG(n->arch.tpidr_el0, TPIDR_EL0); WRITE_SYSREG(n->arch.tpidrro_el0, TPIDRRO_EL0); WRITE_SYSREG(n->arch.tpidr_el1, TPIDR_EL1); if ( is_32bit_domain(n->domain) && cpu_has_thumbee ) { WRITE_SYSREG(n->arch.teecr, TEECR32_EL1); WRITE_SYSREG(n->arch.teehbr, TEEHBR32_EL1); } #ifdef CONFIG_ARM_32 WRITE_CP32(n->arch.joscr, JOSCR); WRITE_CP32(n->arch.jmcr, JMCR); #endif isb(); /* VFP - call vfp_restore_state after writing on CPTR_EL2 + isb */ vfp_restore_state(n); Maybe I misunderstood your preference, do you want me to put the write to CPTR_EL2 right before the isb() that precedes vfp_restore_state? > > Cheers, > > -- > Julien Grall
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |