
Re: [Xen-devel] [for-4.7 v2] xen/arm: Force broadcast of TLB and instruction cache maintenance instructions



On Wed, Apr 27, 2016 at 12:22:53PM +0100, Julien Grall wrote:
> A UP guest may use TLB instructions that flush only the local CPU.
> Therefore, the TLB flush will not be broadcast across all the CPUs within
> the same innershareable domain.
> 
> When the vCPU is migrated between different CPUs, it may be rescheduled
> on a previous CPU where the TLB has not been flushed. That TLB may
> contain stale entries, which will result in a VA being translated
> incorrectly to an IPA or even cause TLB conflicts.
> 
> To avoid such a situation, it is possible to set HCR_EL2.FB, which will
> force the broadcast of TLB and instruction cache maintenance instructions.
> 
> The performance impact of setting HCR_EL2.FB will depend on how often
> a guest makes use of local flush instructions.
> 
> The ARM64 Linux kernel is SMP-aware (it cannot be built for UP only).
> Most of the flush instructions are innershareable. The local flushes are
> limited to boot (1 per CPU) and to when a task is getting a new ASID.
> Therefore the impact of setting HCR.FB for those guests is very limited.
> 
> The ARM32 Linux kernel can be built either for SMP or for UP. The
> number of local flushes is very limited in the former kernel
> whilst the latter will only issue local flushes. Therefore setting
> HCR.FB will have an impact on guest kernels built only for UP.
> 
> Note that an SMP kernel can run in a domain using 1 vCPU and it
> will still make use of innershareable flush instructions.
> 
> Other OSes, such as FreeBSD, are very similar to the ARM32
> Linux kernel (i.e. offering two configurations: SMP and UP).
> 
> However, nothing prevents an SMP-aware kernel from making more frequent
> use of local flush instructions.
> 
> If HCR_EL2.FB were not set, Xen would need to:
>     * Flush all the TLBs for the VMID associated with this domain
>     * Invalidate all the entries from the branch predictor
>     * Invalidate all the entries from the instruction cache
> Those actions would only be needed when the vCPU is migrating between 2
> physical CPUs.
> 
> Whilst this solution would have a negative performance impact on kernels
> which do not heavily use local flush instructions, it may improve
> performance for kernels built only for UP systems.
> 
> For now, implement the easiest solution (i.e. setting HCR_EL2.FB). We can
> revisit it if the performance impact is too high for UP kernels.
> 
> Signed-off-by: Julien Grall <julien.grall@xxxxxxx>
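
For readers following the thread, the alternative described above (leave
HCR_EL2.FB clear and flush when a vCPU moves between physical CPUs) can be
sketched roughly as below. This is a minimal model and not Xen code:
struct vcpu_model, flush_guest_state() and schedule_on() are hypothetical
names invented for this example, and the three maintenance operations are
represented by a counter rather than by real TLBI/IC instructions.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PCPUS 4      /* assumption: small fixed number of physical CPUs */
#define NO_PCPU  (-1)   /* the vCPU has not run anywhere yet */

/* Hypothetical per-vCPU bookkeeping; Xen's real struct vcpu differs. */
struct vcpu_model {
    int last_pcpu;          /* physical CPU the vCPU last ran on, or NO_PCPU */
    unsigned int flushes;   /* how many times guest state was flushed */
};

static void vcpu_model_init(struct vcpu_model *v)
{
    v->last_pcpu = NO_PCPU;
    v->flushes = 0;
}

/*
 * Stand-in for the three steps listed in the commit message:
 *   - flush all TLB entries for the domain's VMID
 *   - invalidate the branch predictor
 *   - invalidate the instruction cache
 * Here the work is only counted.
 */
static void flush_guest_state(struct vcpu_model *v)
{
    v->flushes++;
}

/*
 * Called when the vCPU is about to run on physical CPU 'pcpu'.
 * Flushes only if the vCPU previously ran on a different physical CPU.
 * Returns true if a flush was performed.
 */
static bool schedule_on(struct vcpu_model *v, int pcpu)
{
    bool migrated;

    assert(pcpu >= 0 && pcpu < NR_PCPUS);

    migrated = (v->last_pcpu != NO_PCPU && v->last_pcpu != pcpu);
    if (migrated)
        flush_guest_state(v);

    v->last_pcpu = pcpu;
    return migrated;
}
```

In this model a vCPU that stays on one physical CPU never pays for a flush,
while every migration pays for all three invalidations, which is the
trade-off the commit message describes.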


Subject to an ack from Stefano:

Release-acked-by: Wei Liu <wei.liu2@xxxxxxxxxx>

> ---
> 
> This is a bug fix for Xen 4.7 and should be backported up to Xen 4.4
> (first official release for ARM). Without this patch, a UP guest will
> crash if it gets migrated to a physical CPU with stale TLBs for this
> guest.
> 
>     Changes in v2:
>         - Rework the commit message to include the possible performance
>         impact of setting HCR_EL2.FB.
> ---
>  xen/arch/arm/traps.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 5e865cf..9926a57 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -124,7 +124,8 @@ void init_traps(void)
>  
>      /* Setup hypervisor traps */
>      WRITE_SYSREG(HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
> -                 HCR_TWE|HCR_TWI|HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP, HCR_EL2);
> +                 HCR_TWE|HCR_TWI|HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB,
> +                 HCR_EL2);
>      isb();
>  }
>  
> -- 
> 1.9.1
> 
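
As a sanity check on the one-line change, the sketch below rebuilds the
HCR_EL2 value from the bit positions given in the ARMv8 architecture
reference manual and shows that the patch differs from the old value only
in bit 9 (FB). The HCR_* macros here are local stand-ins written for this
example; Xen's own header definitions should be treated as authoritative,
and hcr_before_patch()/hcr_after_patch() are hypothetical helper names.

```c
#include <stdint.h>

/* Local stand-ins for the HCR_EL2 bits used by init_traps(). */
#define HCR_VM        (UINT64_C(1) << 0)   /* stage 2 translation enable */
#define HCR_SWIO      (UINT64_C(1) << 1)   /* set/way invalidation override */
#define HCR_PTW       (UINT64_C(1) << 2)   /* protected table walk */
#define HCR_FMO       (UINT64_C(1) << 3)   /* route FIQ to EL2 */
#define HCR_IMO       (UINT64_C(1) << 4)   /* route IRQ to EL2 */
#define HCR_AMO       (UINT64_C(1) << 5)   /* route SError to EL2 */
#define HCR_FB        (UINT64_C(1) << 9)   /* force broadcast */
#define HCR_BSU_INNER (UINT64_C(1) << 10)  /* barrier upgrade: inner shareable */
#define HCR_TWI       (UINT64_C(1) << 13)  /* trap WFI */
#define HCR_TWE       (UINT64_C(1) << 14)  /* trap WFE */
#define HCR_TSC       (UINT64_C(1) << 19)  /* trap SMC */
#define HCR_TIDCP     (UINT64_C(1) << 20)  /* trap implementation defined regs */
#define HCR_TAC       (UINT64_C(1) << 21)  /* trap auxiliary control regs */

/* Value written to HCR_EL2 before the patch. */
static uint64_t hcr_before_patch(void)
{
    return HCR_PTW | HCR_BSU_INNER | HCR_AMO | HCR_IMO | HCR_FMO | HCR_VM |
           HCR_TWE | HCR_TWI | HCR_TSC | HCR_TAC | HCR_SWIO | HCR_TIDCP;
}

/* Value written to HCR_EL2 after the patch: same bits plus FB. */
static uint64_t hcr_after_patch(void)
{
    return hcr_before_patch() | HCR_FB;
}
```

XOR-ing the two values leaves only HCR_FB set, i.e. no other trap or
routing behaviour is changed by the patch.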
