[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen


  • To: Julien Grall <julien@xxxxxxx>
  • From: Luca Fancellu <Luca.Fancellu@xxxxxxx>
  • Date: Fri, 19 May 2023 14:51:32 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=kitbsA1zlG6W+qc+6IX2A383irQKmXMjPHJcr6b9TrI=; b=YPcjxZws80q/Cr0WfWz4RLRd++MZOS9wjoPI0eTV+DjBk4qVijaOkRFw6hAHfuJ9wG7cAe+uR9NBqNG5e86vJXDiwDDhTfpCCadyVBeSVUzGx761tvcbeEj0+qGVROb4R2HolrSWNtoa5FjLGQAkx8DaPiCF8llCF6h/IKFQrGdOzuSqoGi28j6aujH0qIK5uM3OXBAE3+x+zaYEDary4tjo4Vy/P5TjTyvZ5mvH+fO1hD8C2m5CxKZqxcgyBQzsxdJMkAXfrTexQ5CkgVnpO3CCfaPSNZTJx/ZnRGGmAeosSbUOZjExUlRakAjNnKSLN6n7eZasOi6UivI5+GqLfw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=T+AMeJtkGSbBT4d4lDewMxsJQIK53hofJsDVy1+FTAhBtAvSVrjoE8pItf/n97DzRlYOhTgwo5egiMHub0UaX2vOS9zVG60ophsBpaXiGQzModXzi2jx8FSQ4rcZWYdehBMD8erzjZIgxZqfC0I1Xdi+HEQEvGgXa2SDLqqTqJtmRmQMoZHLsaua+vY0cfuoDfic/mp9M/5/MW+JSAXC+A9ZcE5HmveUfYZMrSbAhEDG5A+okJabqL63H++74asmLxm47cjKvncfEnwurlmC01H6Aof5Q12aFvG/50ANzdbPxAHvqFjiFvkfquGBcRCUzJPfAVVp4JfgE+Bk5zryWw==
  • Authentication-results-original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, Wei Chen <Wei.Chen@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Fri, 19 May 2023 14:52:06 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Original-authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Thread-index: AQHZdnKBJmFHv6Wq8kSjvXctD+zlI69f6zCAgAHjioCAAAXEgIAAAV4A
  • Thread-topic: [PATCH v6 01/12] xen/arm: enable SVE extension for Xen


> On 19 May 2023, at 15:46, Julien Grall <julien@xxxxxxx> wrote:
> 
> Hi Luca,
> 
> On 19/05/2023 15:26, Luca Fancellu wrote:
>>> On 18 May 2023, at 10:35, Julien Grall <julien@xxxxxxx> wrote:
>>>>   /*
>>>>    * Comment from Linux:
>>>>    * Userspace may perform DC ZVA instructions. Mismatched block sizes
>>>> diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
>>>> new file mode 100644
>>>> index 000000000000..4d1549344733
>>>> --- /dev/null
>>>> +++ b/xen/arch/arm/arm64/sve-asm.S
>>>> @@ -0,0 +1,48 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>>> +/*
>>>> + * Arm SVE assembly routines
>>>> + *
>>>> + * Copyright (C) 2022 ARM Ltd.
>>>> + *
>>>> + * Some macros and instruction encoding in this file are taken from linux 
>>>> 6.1.1,
>>>> + * file arch/arm64/include/asm/fpsimdmacros.h, some of them are a modified
>>>> + * version.
>>> AFAICT, the only modified version is _sve_rdvl, but it is not clear to me 
>>> why we would want to have a modified version?
>>> 
>>> I am asking this because without an explanation, it would be difficult to 
>>> know how to re-sync the code with Linux.
>> In this patch the macros are exactly equal to Linux, apart from the coding 
>> style that uses spaces instead of tabs,
>> I was not expecting to keep them in sync as they seems to be not prone to 
>> change soon, let me know if I need to
>> use also tabs and be 100% equal to Linux.
> 
> The file is small enough, so I think it would be OK if this is converted to 
> the Xen coding style.
> 
>> The following macros that are coming in patch 5 are equal apart from 
>> sve_save/sve_load, that are different because
>> of the construction differences between the storage buffers here and in 
>> Linux, if you want I can put a comment on them
>> to explain this difference in patch 5
> 
> That would be good. Also, can you update arch/arm/README.LinuxPrimitives? The 
> file is listing primitives imported from Linux and when.

Sure I will

> 
>>>> 
>>>> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
>>>> new file mode 100644
>>>> index 000000000000..6f3fb368c59b
>>>> --- /dev/null
>>>> +++ b/xen/arch/arm/arm64/sve.c
>>>> @@ -0,0 +1,50 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> 
>>> Above, you are using GPL-2.0-only, but here GPL-2.0. We favor the former 
>>> now. Happy to deal it on commit if there is nothing else to address.
>> No problem, I will fix in the next push
>>> 
>>>> +/*
>>>> + * Arm SVE feature code
>>>> + *
>>>> + * Copyright (C) 2022 ARM Ltd.
>>>> + */
>>>> +
>>>> +#include <xen/types.h>
>>>> +#include <asm/arm64/sve.h>
>>>> +#include <asm/arm64/sysregs.h>
>>>> +#include <asm/processor.h>
>>>> +#include <asm/system.h>
>>>> +
>>>> +extern unsigned int sve_get_hw_vl(void);
>>>> +
>>>> +register_t compute_max_zcr(void)
>>>> +{
>>>> +    register_t cptr_bits = get_default_cptr_flags();
>>>> +    register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
>>>> +    unsigned int hw_vl;
>>>> +
>>>> +    /* Remove trap for SVE resources */
>>>> +    WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
>>>> +    isb();
>>>> +
>>>> +    /*
>>>> +     * Set the maximum SVE vector length, doing that we will know the VL
>>>> +     * supported by the platform, calling sve_get_hw_vl()
>>>> +     */
>>>> +    WRITE_SYSREG(zcr, ZCR_EL2);
>>> 
>>> From my reading of the Arm (D19-6331, ARM DDI 0487J.a), a direct write to a 
>>> system register would need to be followed by an context synchronization 
>>> event (e.g. isb()) before the software can rely on the value.
>>> 
>>> In this situation, AFAICT, the instruciton in sve_get_hw_vl() will use the 
>>> content of ZCR_EL2. So don't we need an ISB() here?
>> From what I’ve read in the manual for ZCR_ELx:
>> An indirect read of ZCR_EL2.LEN appears to occur in program order relative 
>> to a direct write of
>> the same register, without the need for explicit synchronization
>> I’ve interpreted it as “there is no need to sync before write” and I’ve 
>> looked into Linux and it does not
>> Appear any synchronisation mechanism after a write to that register, but if 
>> I am wrong I can for sure
>> add an isb if you prefer.
> 
> Ah, I was reading the generic section about synchronization and didn't 
> realize there was a paragraph in the ZCR_EL2 section as well.
> 
> Reading the new section, I agree with your understanding. The isb() is not 
> necessary.
> 
> So please ignore this comment :).
> 
>>>>      /* XXX MPU */
>>>>        /* Fault Status */
>>>> @@ -234,6 +231,7 @@ static void ctxt_switch_to(struct vcpu *n)
>>>>      p2m_restore_state(n);
>>>>        /* Control Registers */
>>>> +    WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
>>> 
>>> I would prefer if this called closer to vfp_restore_state(). So the 
>>> dependency between the two is easier to spot.
>>> 
>>>>      WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);
>>>>        /*
>>>> @@ -258,6 +256,9 @@ static void ctxt_switch_to(struct vcpu *n)
>>>>  #endif
>>>>      isb();
>>>>  +    /* VFP */
>>> 
>>> Please document in the code that vfp_restore_state() have to be called 
>>> after CPTR_EL2() + a synchronization event.
>>> 
>>> Similar docoumentation on top of at least CPTR_EL2 and possibly isb(). This 
>>> would help if we need to re-order the code in the future.
>> I will put comments on top of CPTR_EL2 and vfp_restore_state to explain the 
>> sequence and the synchronisation.
> 
> Just to clarify, does this mean you will keep CPTR_EL2 where it currently is? 
> (See my comment just above in the previous e-mail)

This is how I changed the code:

/* Control Registers */
/*
* CPTR_EL2 needs to be written before calling vfp_restore_state, a
* synchronization instruction is expected after the write (isb)
*/
WRITE_SYSREG(n->arch.cptr_el2, CPTR_EL2);
WRITE_SYSREG(n->arch.cpacr, CPACR_EL1);

/*
* This write to sysreg CONTEXTIDR_EL1 ensures we don't hit erratum
* #852523 (Cortex-A57) or #853709 (Cortex-A72).
* I.e DACR32_EL2 is not correctly synchronized.
*/
WRITE_SYSREG(n->arch.contextidr, CONTEXTIDR_EL1);
WRITE_SYSREG(n->arch.tpidr_el0, TPIDR_EL0);
WRITE_SYSREG(n->arch.tpidrro_el0, TPIDRRO_EL0);
WRITE_SYSREG(n->arch.tpidr_el1, TPIDR_EL1);

if ( is_32bit_domain(n->domain) && cpu_has_thumbee )
{
WRITE_SYSREG(n->arch.teecr, TEECR32_EL1);
WRITE_SYSREG(n->arch.teehbr, TEEHBR32_EL1);
}

#ifdef CONFIG_ARM_32
WRITE_CP32(n->arch.joscr, JOSCR);
WRITE_CP32(n->arch.jmcr, JMCR);
#endif
isb();

/* VFP - call vfp_restore_state after writing on CPTR_EL2 + isb */
vfp_restore_state(n);

Maybe I misunderstood your preference, do you want me to put the write to 
CPTR_EL2
right before the isb() that precedes vfp_restore_state?


> 
> Cheers,
> 
> -- 
> Julien Grall


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.