[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC 15/16] xen/arm: Implement Set/Way operations



On Mon, 8 Oct 2018, Julien Grall wrote:
> Set/Way operations are used to perform maintenance on a given cache.
> At the moment, Set/Way operations are not trapped and therefore a guest
> OS will directly act on the local cache. However, a vCPU may migrate to
> another pCPU in the middle of the processor. This will result to have
> cache with stall data (Set/Way are not propagated) potentially causing
> crash. This may be the cause of heisenbug noticed in Osstest [1].
> 
> Furthermore, Set/Way operations are not available on system cache. This
> means that OS, such as Linux 32-bit, relying on those operations to
> fully clean the cache before disabling MMU may break because data may
> sits in system caches and not in RAM.
> 
> For more details about Set/Way, see the talk "The Art of Virtualizing
> Cache Maintenance" given at Xen Summit 2018 [2].
> 
> In the context of Xen, we need to trap Set/Way operations and emulate
> them. From the Arm Arm (B1.14.4 in DDI 046C.c), Set/Way operations are
> difficult to virtualized. So we can assume that a guest OS using them will
> suffer the consequence (i.e slowness) until developer removes all the usage
> of Set/Way.
> 
> As the software is not allowed to infer the Set/Way to Physical Address
> mapping, Xen will need to go through the guest P2M and clean &
> invalidate all the entries mapped.
> 
> Because Set/Way happen in batch (a loop on all Set/Way of a cache), Xen
> would need to go through the P2M for every instructions. This is quite
> expensive and would severely impact the guest OS. The implementation is
> re-using the KVM policy to limit the number of flush:
>     - If we trap a Set/Way operations, we enable VM trapping (i.e
                   ^ remove 'a'

>       HVC_EL2.TVM) to detect cache being turned on/off, and do a full
>     clean.

"do a full clean" straight away, right? May I suggest a rewording of
this item:

- as soon as we trap a set/way operation, we enable VM trapping (i.e.
  HVC_EL2.TVM, it ll allow us to detect cache being turned on/off),
  then we do a full clean


>     - We clean the caches when turning on and off

"We clean the caches when the guest turns caches on or off"


>     - Once the caches are enabled, we stop trapping VM instructions
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html
> [2] https://fr.slideshare.net/xen_com_mgr/virtualizing-cache
> 
> Signed-off-by: Julien Grall <julien.grall@xxxxxxx>
> ---
>  xen/arch/arm/arm64/vsysreg.c | 27 +++++++++++++++++-
>  xen/arch/arm/p2m.c           | 68 
> ++++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/traps.c         |  2 +-
>  xen/arch/arm/vcpreg.c        | 23 +++++++++++++++
>  xen/include/asm-arm/p2m.h    | 16 +++++++++++
>  5 files changed, 134 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
> index 1517879697..43c6c3e30d 100644
> --- a/xen/arch/arm/arm64/vsysreg.c
> +++ b/xen/arch/arm/arm64/vsysreg.c
> @@ -40,7 +40,20 @@ static bool vreg_emulate_##reg(struct cpu_user_regs *regs, 
>          \
>  }
>  
>  /* Defining helpers for emulating sysreg registers. */
> -TVM_REG(SCTLR_EL1)
> +static bool vreg_emulate_SCTLR_EL1(struct cpu_user_regs *regs, uint64_t *r,
> +                                   bool read)
> +{
> +    struct vcpu *v = current;
> +    bool cache_enabled = vcpu_has_cache_enabled(v);
> +
> +    GUEST_BUG_ON(read);
> +    WRITE_SYSREG(*r, SCTLR_EL1);
> +
> +    p2m_toggle_cache(v, cache_enabled);
> +
> +    return true;
> +}
> +
>  TVM_REG(TTBR0_EL1)
>  TVM_REG(TTBR1_EL1)
>  TVM_REG(TCR_EL1)
> @@ -84,6 +97,18 @@ void do_sysreg(struct cpu_user_regs *regs,
>          break;
>  
>      /*
> +     * HCR_EL2.TSW
> +     *
> +     * ARMv8 (DDI 0487B.b): Table D1-42
> +     */
> +    case HSR_SYSREG_DCISW:
> +    case HSR_SYSREG_DCCSW:
> +    case HSR_SYSREG_DCCISW:
> +        if ( hsr.sysreg.read )

Shouldn't it be !hsr.sysreg.read ?


> +            p2m_set_way_flush(current);
> +        break;
> +
> +    /*
>       * HCR_EL2.TVM
>       *
>       * ARMv8 (DDI 0487B.b): Table D1-37
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index df6b48d73b..a3d4c563b1 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1564,6 +1564,74 @@ int p2m_cache_flush_range(struct domain *d, gfn_t 
> start, gfn_t end)
>      return 0;
>  }
>  
> +static void p2m_flush_vm(struct vcpu *v)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
> +
> +    /* XXX: Handle preemption */

Yes, good to have this reminder. Maybe add "we'd want to break the
operation down when it takes too long".


> +    p2m_cache_flush_range(v->domain, p2m->lowest_mapped_gfn,
> +                          p2m->max_mapped_gfn);
> +}
> +
> +/*
> + * See note at ARMv7 ARM B1.14.4 (DDI 0406C.c) (TL;DR: S/W ops are not
> + * easily virtualized).
> + *
> + * Main problems:
> + *  - S/W ops are local to a CPU (not broadcast)
> + *  - We have line migration behind our back (speculation)
> + *  - System caches don't support S/W at all (damn!)
> + *
> + * In the face of the above, the best we can do is to try and convert
> + * S/W ops to VA ops. Because the guest is not allowed to infer the S/W
> + * to PA mapping, it can only use S/W to nuke the whole cache, which is
> + * rather a good thing for us.
> + *
> + * Also, it is only used when turning caches on/off ("The expected
> + * usage of the cache maintenance instructions that operate by set/way
> + * is associated with the powerdown and powerup of caches, if this is
> + * required by the implementation.").
> + *
> + * We use the following policy:
> + *  - If we trap a S/W operation, we enabled VM trapping to detect
> + *  caches being turned on/off, and do a full clean.
> + *
> + *  - We flush the caches on both caches being turned on and off.
> + *
> + *  - Once the caches are enabled, we stop trapping VM ops.
> + */
> +void p2m_set_way_flush(struct vcpu *v)
> +{
> +    /* This function can only work with the current vCPU. */
> +    ASSERT(v == current);

NIT: if it can only operate on current, it makes sense to remove the
struct vcpu* parameter


> +    if ( !(v->arch.hcr_el2 & HCR_TVM) )
> +    {
> +        p2m_flush_vm(v);
> +        vcpu_hcr_set_flags(v, HCR_TVM);
> +    }
> +}
> +
> +void p2m_toggle_cache(struct vcpu *v, bool was_enabled)
> +{
> +    bool now_enabled = vcpu_has_cache_enabled(v);
> +
> +    /* This function can only work with the current vCPU. */
> +    ASSERT(v == current);

NIT: same about struct vcpu* as parameter when only current can be used


> +    /*
> +     * If switching the MMU+caches on, need to invalidate the caches.
> +     * If switching it off, need to clean the caches.
> +     * Clean + invalidate does the trick always.
> +     */
> +    if ( was_enabled != now_enabled )
> +        p2m_flush_vm(v);
> +
> +    /* Caches are now on, stop trapping VM ops (until a S/W op) */
> +    if ( now_enabled )
> +        vcpu_hcr_clear_flags(v, HCR_TVM);
> +}
> +
>  mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn)
>  {
>      return p2m_lookup(d, gfn, NULL);
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 169b57cb6b..cdc10eee5a 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -98,7 +98,7 @@ register_t get_default_hcr_flags(void)
>  {
>      return  (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM|
>               (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) |
> -             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB);
> +             HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW);
>  }
>  
>  static enum {
> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> index 49529b97cd..dc46d9d0d7 100644
> --- a/xen/arch/arm/vcpreg.c
> +++ b/xen/arch/arm/vcpreg.c
> @@ -45,9 +45,14 @@
>  #define TVM_REG(sz, func, reg...)                                           \
>  static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read)    \
>  {                                                                           \
> +    struct vcpu *v = current;                                               \
> +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
> +                                                                            \
>      GUEST_BUG_ON(read);                                                     \
>      WRITE_SYSREG##sz(*r, reg);                                              \
>                                                                              \
> +    p2m_toggle_cache(v, cache_enabled);                                     \

This will affect all the registers trapped with TVM. Shouldn't we only
call p2m_toggle_cache when relevant? i.e. when changing SCTLR?
I think it would be better to only modify the SCTLR emulation handler.


>      return true;                                                            \
>  }
>  
> @@ -65,6 +70,8 @@ static bool func(struct cpu_user_regs *regs, uint##sz##_t 
> *r, bool read)    \
>  static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r,    \
>                                  bool read, bool hi)                         \
>  {                                                                           \
> +    struct vcpu *v = current;                                               \
> +    bool cache_enabled = vcpu_has_cache_enabled(v);                         \
>      register_t reg = READ_SYSREG(xreg);                                     \
>                                                                              \
>      GUEST_BUG_ON(read);                                                     \
> @@ -80,6 +87,8 @@ static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, 
> uint32_t *r,    \
>      }                                                                       \
>      WRITE_SYSREG(reg, xreg);                                                \
>                                                                              \
> +    p2m_toggle_cache(v, cache_enabled);                                     \
> +                                                                            \
>      return true;                                                            \
>  }                                                                           \
>                                                                              \
> @@ -98,6 +107,7 @@ static bool vreg_emulate_##hireg(struct cpu_user_regs 
> *regs, uint32_t *r,   \
>  
>  /* Defining helpers for emulating co-processor registers. */
>  TVM_REG32(SCTLR, SCTLR_EL1)
> +

Spurious change. Should be in a previous patch?

>  /*
>   * AArch32 provides two way to access TTBR* depending on the access
>   * size, whilst AArch64 provides one way.
> @@ -180,6 +190,19 @@ void do_cp15_32(struct cpu_user_regs *regs, const union 
> hsr hsr)
>          break;
>  
>      /*
> +     * HCR_EL2.TSW
> +     *
> +     * ARMv7 (DDI 0406C.b): B1.14.6
> +     * ARMv8 (DDI 0487B.b): Table D1-42
> +     */
> +    case HSR_CPREG32(DCISW):
> +    case HSR_CPREG32(DCCSW):
> +    case HSR_CPREG32(DCCISW):
> +        if ( !cp32.read )
> +            p2m_set_way_flush(current);
> +        break;
> +
> +    /*
>       * HCR_EL2.TVM
>       *
>       * ARMv8 (DDI 0487B.b): Table D1-37
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index 92213dd1ab..c470f062db 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -231,6 +231,10 @@ int p2m_set_entry(struct p2m_domain *p2m,
>   */
>  int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end);
>  
> +void p2m_set_way_flush(struct vcpu *v);
> +
> +void p2m_toggle_cache(struct vcpu *v, bool was_enabled);
> +
>  /*
>   * Map a region in the guest p2m with a specific p2m type.
>   * The memory attributes will be derived from the p2m type.
> @@ -358,6 +362,18 @@ static inline int set_foreign_p2m_entry(struct domain 
> *d, unsigned long gfn,
>      return -EOPNOTSUPP;
>  }
>  
> +/*
> + * A vCPU has cache enabled only when the MMU is enabled and data cache
> + * is enabled.
> + */
> +static inline bool vcpu_has_cache_enabled(struct vcpu *v)
> +{
> +    /* Only works with the current vCPU */
> +    ASSERT(current == v);

NIT: same about struct vcpu *v as parameter when only current makes
sense


> +    return (READ_SYSREG32(SCTLR_EL1) & (SCTLR_C|SCTLR_M)) == 
> (SCTLR_C|SCTLR_M);
> +}
> +
>  #endif /* _XEN_P2M_H */
>  
>  /*

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.