
Re: [Xen-devel] xen/arm and swiotlb-xen: possible data corruption



On Thu, 2 Mar 2017, Julien Grall wrote:
> On 02/03/17 08:53, Edgar E. Iglesias wrote:
> > On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
> > > On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> > > > Hi all,
> > > > 
> > > > Edgar reported a data corruption on network packets in dom0 when the
> > > > swiotlb-xen is in use. He also reported that the following patch "fixes"
> > > > the problem for him:
> > > > 
> > > >  static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
> > > >                 size_t size, enum dma_data_direction dir)
> > > >  {
> > > > -       dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, DMA_MAP);
> > > > +       printk("%s: addr=%lx size=%zd\n", __func__, handle, size);
> > > > +       dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size + 64, dir, DMA_MAP);
> > > > 
> > > > I am thinking that the problem has something to do with cacheline
> > > > alignment on the Xen side
> > > > (xen/common/grant_table.c:__gnttab_cache_flush).
> > > > 
> > > > If op == GNTTAB_CACHE_INVAL, we call invalidate_dcache_va_range; if op
> > > > == GNTTAB_CACHE_CLEAN, we call clean_dcache_va_range instead. The
> > > > parameter, v, could be non-cacheline aligned.
> > > > 
> > > > invalidate_dcache_va_range is capable of handling an unaligned
> > > > address, while clean_dcache_va_range is not.
> > > > 
> > > > Edgar, does the appended patch fix the problem for you?
> > > 
> > > 
> > > Thanks Stefano,
> > > 
> > > This does indeed fix the issue for me.

Thanks for reporting and testing!


> > Hi again,
> > 
> > Looking at the code, the problem here is that we may flush one cache line
> > less than expected.
> > 
> > This smaller patch fixes it for me too:
> > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> > index c492d6d..fa1b4dd 100644
> > --- a/xen/include/asm-arm/page.h
> > +++ b/xen/include/asm-arm/page.h
> > @@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
> >  {
> >      const void *end;
> >      dsb(sy);           /* So the CPU issues all writes to the range */
> > -    for ( end = p + size; p < end; p += cacheline_bytes )
> > +
> > +    end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
> > +    for ( ; p < end; p += cacheline_bytes )
> >          asm volatile (__clean_dcache_one(0) : : "r" (p));
> >      dsb(sy);           /* So we know the flushes happen before continuing */
> >      /* ARM callers assume that dcache_* functions cannot fail. */
> > 
> > Anyway, I'm OK with either fix.
> 
> I would prefer your version compared to Stefano's.

Julien, from looking at the two diffs, this is simpler and nicer, but if
you look at xen/include/asm-arm/page.h, my patch made
clean_dcache_va_range consistent with invalidate_dcache_va_range. For
consistency, I would prefer to deal with the two functions the same way.
Although it is not a spec requirement, I also think that it is a good
idea to issue cache flushes from cacheline-aligned addresses, like
invalidate_dcache_va_range does and Linux does, to make it more obvious
what is going on.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

