[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 2/4] xen/x86: Drop unnecessary barriers
On 16/08/17 17:47, Andrew Cooper wrote: > On 16/08/17 16:23, Jan Beulich wrote: >>>>> On 16.08.17 at 13:22, <andrew.cooper3@xxxxxxxxxx> wrote: >>> x86's current implementation of wmb() is a compiler barrier. As a result, >>> the >>> only change in this patch is to remove an mfence instruction from >>> cpuidle_disable_deep_cstate(). >>> >>> None of these barriers serve any purpose. Most aren't aren't synchronising >>> with any remote cpus, where as the mcetelem barriers are redundant with >>> spin_unlock(), which already has full read/write barrier semantics. >>> >>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> >> For the relevant parts >> Acked-by: Jan Beulich <jbeulich@xxxxxxxx> >> For the parts the ack doesn't extend to, however: >> >>> --- a/xen/arch/x86/mm/shadow/multi.c >>> +++ b/xen/arch/x86/mm/shadow/multi.c >>> @@ -3112,7 +3112,6 @@ static int sh_page_fault(struct vcpu *v, >>> * will make sure no inconsistent mapping being translated into >>> * shadow page table. */ >>> version = atomic_read(&d->arch.paging.shadow.gtable_dirty_version); >>> - rmb(); >>> walk_ok = sh_walk_guest_tables(v, va, &gw, error_code); >> Isn't this supposed to make sure version is being read first? I.e. >> doesn't this at least need to be barrier()? > atomic_read() is not free to be reordered by the compiler. It is an asm > volatile with a volatile memory reference. > >>> index a459e99..d5b6049 100644 >>> --- a/xen/drivers/passthrough/amd/iommu_init.c >>> +++ b/xen/drivers/passthrough/amd/iommu_init.c >>> @@ -558,7 +558,6 @@ static void parse_event_log_entry(struct amd_iommu >>> *iommu, u32 entry[]) >>> return; >>> } >>> udelay(1); >>> - rmb(); >>> code = get_field_from_reg_u32(entry[1], IOMMU_EVENT_CODE_MASK, >>> IOMMU_EVENT_CODE_SHIFT); >>> } >>> @@ -663,7 +662,6 @@ void parse_ppr_log_entry(struct amd_iommu *iommu, u32 >>> entry[]) >>> return; >>> } >>> udelay(1); >>> - rmb(); >>> code = get_field_from_reg_u32(entry[1], IOMMU_PPR_LOG_CODE_MASK, >>> IOMMU_PPR_LOG_CODE_SHIFT); >>> } >> With these fully removed, what keeps the compiler from moving >> the entry[1] reads out of the loop? Implementation details of >> udelay() don't count... > It is a write to the control variable which is derived from a non-local > non-constant object. It can't be hoisted at all. > > Consider this simplified version: > > while ( count == 0 ) > count = entry[1]; > > If entry were const, the compiler would be free to expect that the value > doesn't change on repeated reads, but that is not the case here. (And continuing my run of luck today), it turns out that GCC does compile my example here to an infinite loop. ffff82d08026025f: 84 c0 test %al,%al ffff82d080260261: 75 0a jne ffff82d08026026d <parse_ppr_log_entry+0x29> ffff82d080260263: 8b 46 04 mov 0x4(%rsi),%eax ffff82d080260266: c1 e8 1c shr $0x1c,%eax ffff82d080260269: 84 c0 test %al,%al ffff82d08026026b: 74 fc je ffff82d080260269 <parse_ppr_log_entry+0x25> I will move this to being a barrer() with a hoisting comment (to avoid it looking like an SMP issue), and I'm going to have to re-evaluate how sane I think the C standard to be. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |