Xen project Mailing List

Re: [Xen-devel] [RFC 15/16] xen/arm: Implement Set/Way operations

From: Stefano Stabellini <sstabellini@xxxxxxxxxx>

Date: Mon, 5 Nov 2018 13:10:21 -0800 (PST)

Cc: sstabellini@xxxxxxxxxx, andre.przywara@xxxxxxxxxx, xen-devel@xxxxxxxxxxxxx

Delivery-date: Mon, 05 Nov 2018 21:10:27 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, 8 Oct 2018, Julien Grall wrote: > Set/Way operations are used to perform maintenance on a given cache. > At the moment, Set/Way operations are not trapped and therefore a guest > OS will directly act on the local cache. However, a vCPU may migrate to > another pCPU in the middle of the processor. This will result to have > cache with stall data (Set/Way are not propagated) potentially causing > crash. This may be the cause of heisenbug noticed in Osstest [1]. > > Furthermore, Set/Way operations are not available on system cache. This > means that OS, such as Linux 32-bit, relying on those operations to > fully clean the cache before disabling MMU may break because data may > sits in system caches and not in RAM. > > For more details about Set/Way, see the talk "The Art of Virtualizing > Cache Maintenance" given at Xen Summit 2018 [2]. > > In the context of Xen, we need to trap Set/Way operations and emulate > them. From the Arm Arm (B1.14.4 in DDI 046C.c), Set/Way operations are > difficult to virtualized. So we can assume that a guest OS using them will > suffer the consequence (i.e slowness) until developer removes all the usage > of Set/Way. > > As the software is not allowed to infer the Set/Way to Physical Address > mapping, Xen will need to go through the guest P2M and clean & > invalidate all the entries mapped. > > Because Set/Way happen in batch (a loop on all Set/Way of a cache), Xen > would need to go through the P2M for every instructions. This is quite > expensive and would severely impact the guest OS. The implementation is > re-using the KVM policy to limit the number of flush: > - If we trap a Set/Way operations, we enable VM trapping (i.e ^ remove 'a' > HVC_EL2.TVM) to detect cache being turned on/off, and do a full > clean. "do a full clean" straight away, right? May I suggest a rewording of this item: - as soon as we trap a set/way operation, we enable VM trapping (i.e. HVC_EL2.TVM, it ll allow us to detect cache being turned on/off), then we do a full clean > - We clean the caches when turning on and off "We clean the caches when the guest turns caches on or off" > - Once the caches are enabled, we stop trapping VM instructions > > [1] https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg03191.html > [2] https://fr.slideshare.net/xen_com_mgr/virtualizing-cache > > Signed-off-by: Julien Grall <julien.grall@xxxxxxx> > --- > xen/arch/arm/arm64/vsysreg.c | 27 +++++++++++++++++- > xen/arch/arm/p2m.c | 68 > ++++++++++++++++++++++++++++++++++++++++++++ > xen/arch/arm/traps.c | 2 +- > xen/arch/arm/vcpreg.c | 23 +++++++++++++++ > xen/include/asm-arm/p2m.h | 16 +++++++++++ > 5 files changed, 134 insertions(+), 2 deletions(-) > > diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c > index 1517879697..43c6c3e30d 100644 > --- a/xen/arch/arm/arm64/vsysreg.c > +++ b/xen/arch/arm/arm64/vsysreg.c > @@ -40,7 +40,20 @@ static bool vreg_emulate_##reg(struct cpu_user_regs *regs, > \ > } > > /* Defining helpers for emulating sysreg registers. */ > -TVM_REG(SCTLR_EL1) > +static bool vreg_emulate_SCTLR_EL1(struct cpu_user_regs *regs, uint64_t *r, > + bool read) > +{ > + struct vcpu *v = current; > + bool cache_enabled = vcpu_has_cache_enabled(v); > + > + GUEST_BUG_ON(read); > + WRITE_SYSREG(*r, SCTLR_EL1); > + > + p2m_toggle_cache(v, cache_enabled); > + > + return true; > +} > + > TVM_REG(TTBR0_EL1) > TVM_REG(TTBR1_EL1) > TVM_REG(TCR_EL1) > @@ -84,6 +97,18 @@ void do_sysreg(struct cpu_user_regs *regs, > break; > > /* > + * HCR_EL2.TSW > + * > + * ARMv8 (DDI 0487B.b): Table D1-42 > + */ > + case HSR_SYSREG_DCISW: > + case HSR_SYSREG_DCCSW: > + case HSR_SYSREG_DCCISW: > + if ( hsr.sysreg.read ) Shouldn't it be !hsr.sysreg.read ? > + p2m_set_way_flush(current); > + break; > + > + /* > * HCR_EL2.TVM > * > * ARMv8 (DDI 0487B.b): Table D1-37 > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c > index df6b48d73b..a3d4c563b1 100644 > --- a/xen/arch/arm/p2m.c > +++ b/xen/arch/arm/p2m.c > @@ -1564,6 +1564,74 @@ int p2m_cache_flush_range(struct domain *d, gfn_t > start, gfn_t end) > return 0; > } > > +static void p2m_flush_vm(struct vcpu *v) > +{ > + struct p2m_domain *p2m = p2m_get_hostp2m(v->domain); > + > + /* XXX: Handle preemption */ Yes, good to have this reminder. Maybe add "we'd want to break the operation down when it takes too long". > + p2m_cache_flush_range(v->domain, p2m->lowest_mapped_gfn, > + p2m->max_mapped_gfn); > +} > + > +/* > + * See note at ARMv7 ARM B1.14.4 (DDI 0406C.c) (TL;DR: S/W ops are not > + * easily virtualized). > + * > + * Main problems: > + * - S/W ops are local to a CPU (not broadcast) > + * - We have line migration behind our back (speculation) > + * - System caches don't support S/W at all (damn!) > + * > + * In the face of the above, the best we can do is to try and convert > + * S/W ops to VA ops. Because the guest is not allowed to infer the S/W > + * to PA mapping, it can only use S/W to nuke the whole cache, which is > + * rather a good thing for us. > + * > + * Also, it is only used when turning caches on/off ("The expected > + * usage of the cache maintenance instructions that operate by set/way > + * is associated with the powerdown and powerup of caches, if this is > + * required by the implementation."). > + * > + * We use the following policy: > + * - If we trap a S/W operation, we enabled VM trapping to detect > + * caches being turned on/off, and do a full clean. > + * > + * - We flush the caches on both caches being turned on and off. > + * > + * - Once the caches are enabled, we stop trapping VM ops. > + */ > +void p2m_set_way_flush(struct vcpu *v) > +{ > + /* This function can only work with the current vCPU. */ > + ASSERT(v == current); NIT: if it can only operate on current, it makes sense to remove the struct vcpu* parameter > + if ( !(v->arch.hcr_el2 & HCR_TVM) ) > + { > + p2m_flush_vm(v); > + vcpu_hcr_set_flags(v, HCR_TVM); > + } > +} > + > +void p2m_toggle_cache(struct vcpu *v, bool was_enabled) > +{ > + bool now_enabled = vcpu_has_cache_enabled(v); > + > + /* This function can only work with the current vCPU. */ > + ASSERT(v == current); NIT: same about struct vcpu* as parameter when only current can be used > + /* > + * If switching the MMU+caches on, need to invalidate the caches. > + * If switching it off, need to clean the caches. > + * Clean + invalidate does the trick always. > + */ > + if ( was_enabled != now_enabled ) > + p2m_flush_vm(v); > + > + /* Caches are now on, stop trapping VM ops (until a S/W op) */ > + if ( now_enabled ) > + vcpu_hcr_clear_flags(v, HCR_TVM); > +} > + > mfn_t gfn_to_mfn(struct domain *d, gfn_t gfn) > { > return p2m_lookup(d, gfn, NULL); > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c > index 169b57cb6b..cdc10eee5a 100644 > --- a/xen/arch/arm/traps.c > +++ b/xen/arch/arm/traps.c > @@ -98,7 +98,7 @@ register_t get_default_hcr_flags(void) > { > return (HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM| > (vwfi != NATIVE ? (HCR_TWI|HCR_TWE) : 0) | > - HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB); > + HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP|HCR_FB|HCR_TSW); > } > > static enum { > diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c > index 49529b97cd..dc46d9d0d7 100644 > --- a/xen/arch/arm/vcpreg.c > +++ b/xen/arch/arm/vcpreg.c > @@ -45,9 +45,14 @@ > #define TVM_REG(sz, func, reg...) \ > static bool func(struct cpu_user_regs *regs, uint##sz##_t *r, bool read) \ > { \ > + struct vcpu *v = current; \ > + bool cache_enabled = vcpu_has_cache_enabled(v); \ > + \ > GUEST_BUG_ON(read); \ > WRITE_SYSREG##sz(*r, reg); \ > \ > + p2m_toggle_cache(v, cache_enabled); \ This will affect all the registers trapped with TVM. Shouldn't we only call p2m_toggle_cache when relevant? i.e. when changing SCTLR? I think it would be better to only modify the SCTLR emulation handler. > return true; \ > } > > @@ -65,6 +70,8 @@ static bool func(struct cpu_user_regs *regs, uint##sz##_t > *r, bool read) \ > static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, uint32_t *r, \ > bool read, bool hi) \ > { \ > + struct vcpu *v = current; \ > + bool cache_enabled = vcpu_has_cache_enabled(v); \ > register_t reg = READ_SYSREG(xreg); \ > \ > GUEST_BUG_ON(read); \ > @@ -80,6 +87,8 @@ static bool vreg_emulate_##xreg(struct cpu_user_regs *regs, > uint32_t *r, \ > } \ > WRITE_SYSREG(reg, xreg); \ > \ > + p2m_toggle_cache(v, cache_enabled); \ > + \ > return true; \ > } \ > \ > @@ -98,6 +107,7 @@ static bool vreg_emulate_##hireg(struct cpu_user_regs > *regs, uint32_t *r, \ > > /* Defining helpers for emulating co-processor registers. */ > TVM_REG32(SCTLR, SCTLR_EL1) > + Spurious change. Should be in a previous patch? > /* > * AArch32 provides two way to access TTBR* depending on the access > * size, whilst AArch64 provides one way. > @@ -180,6 +190,19 @@ void do_cp15_32(struct cpu_user_regs *regs, const union > hsr hsr) > break; > > /* > + * HCR_EL2.TSW > + * > + * ARMv7 (DDI 0406C.b): B1.14.6 > + * ARMv8 (DDI 0487B.b): Table D1-42 > + */ > + case HSR_CPREG32(DCISW): > + case HSR_CPREG32(DCCSW): > + case HSR_CPREG32(DCCISW): > + if ( !cp32.read ) > + p2m_set_way_flush(current); > + break; > + > + /* > * HCR_EL2.TVM > * > * ARMv8 (DDI 0487B.b): Table D1-37 > diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h > index 92213dd1ab..c470f062db 100644 > --- a/xen/include/asm-arm/p2m.h > +++ b/xen/include/asm-arm/p2m.h > @@ -231,6 +231,10 @@ int p2m_set_entry(struct p2m_domain *p2m, > */ > int p2m_cache_flush_range(struct domain *d, gfn_t start, gfn_t end); > > +void p2m_set_way_flush(struct vcpu *v); > + > +void p2m_toggle_cache(struct vcpu *v, bool was_enabled); > + > /* > * Map a region in the guest p2m with a specific p2m type. > * The memory attributes will be derived from the p2m type. > @@ -358,6 +362,18 @@ static inline int set_foreign_p2m_entry(struct domain > *d, unsigned long gfn, > return -EOPNOTSUPP; > } > > +/* > + * A vCPU has cache enabled only when the MMU is enabled and data cache > + * is enabled. > + */ > +static inline bool vcpu_has_cache_enabled(struct vcpu *v) > +{ > + /* Only works with the current vCPU */ > + ASSERT(current == v); NIT: same about struct vcpu *v as parameter when only current makes sense > + return (READ_SYSREG32(SCTLR_EL1) & (SCTLR_C|SCTLR_M)) == > (SCTLR_C|SCTLR_M); > +} > + > #endif /* _XEN_P2M_H */ > > /* _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.