
Re: [Xen-devel] [PATCH 6/7] xen/arm: flush D-cache and I-cache when appropriate



At 16:53 +0100 on 26 Oct (1351270394), Stefano Stabellini wrote:
> On Fri, 26 Oct 2012, Tim Deegan wrote:
> > At 18:35 +0100 on 24 Oct (1351103740), Stefano Stabellini wrote:
> > > > I don't think this is necessary - why not just pass va directly to the
> > > > inline asm?  We don't care what register it's in (and if we did I'm not
> > > > convinced this would guarantee it was r0).
> > > > 
> > > > > +    asm volatile (
> > > > > +        "dsb;"
> > > > > +        STORE_CP32(0, DCCMVAC)
> > > > > +        "isb;"
> > > > > +        : : "r" (r0) : "memory");
> > > > 
> > > > Does this need a 'memory' clobber?  Can we get away with just saying it
> > > > consumes *va as an input?  All we need to be sure of is that the
> > > > particular thing we're flushing has been written out; no need to stop
> > > > any other optimizations.
> > > 
> > > you are right on both points
> > > 
> > > > I guess it might need to be re-cast as a macro so the compiler knows how
> > > > big *va is?
> > > 
> > > I don't think it is necessary; after all, the size of a register has to
> > > be the same as that of a virtual address
> > 
> > But it's the size of the thing in memory that's being flushed that
> > matters, not the size of the pointer to it!
> >
> > E.g. after a PTE write we need a 64-bit memory input operand to stop the
> > compiler from delaying any part of the PTE write past the cache flush.
> > (Well, OK, we explicitly use a 64-bit atomic write for PTE writes, but
> > YKWIM.)
> 
> The implementation of write_pte is entirely in assembly so I doubt that
> the compiler is going to reorder it.

Augh!  Yes, like I said, PTE writes are fine.

> However I see your point in case of flush_xen_dcache_va.
> Wouldn't a barrier() at the beginning of the function be enough?

More than enough.  That would be exactly equivalent to the "memory"
clobber above.  What I'm arguing for is a _less_ restrictive constraint,
that only restricts delaying writes, and only affects the thing actually
being flushed (whatever size that is).
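
To make the difference concrete, here is a rough sketch of the two
constraint styles.  It assumes GCC-style extended asm, and the mcr
encoding is my assumption of what STORE_CP32(0, DCCMVAC) expands to.
The names CLEAN_LINE_INSN, clean_va_clobber, clean_obj and
write_and_clean are made up for illustration and are not in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed expansion of the patch's "dsb; STORE_CP32(0, DCCMVAC) isb;".
 * Off ARM the template is empty, so only the constraints are exercised. */
#if defined(__arm__)
#define CLEAN_LINE_INSN "dsb; mcr p15, 0, %0, c7, c10, 1; isb;"
#else
#define CLEAN_LINE_INSN ""
#endif

/* Variant A: "memory" clobber.  Same effect as barrier() before the flush:
 * the compiler must assume all of memory is read and written here. */
static inline void clean_va_clobber(const void *va)
{
    __asm__ volatile (CLEAN_LINE_INSN : : "r" (va) : "memory");
}

/* Variant B: the object is a memory input operand.  The compiler only has
 * to complete pending writes to *(p), all sizeof(*(p)) bytes of it, before
 * the asm; unrelated loads and stores are unconstrained. */
#define clean_obj(p) \
    __asm__ volatile (CLEAN_LINE_INSN : : "r" (p), "m" (*(p)))

/* Usage sketch: store a 64-bit entry, then clean the line holding it. */
static inline uint64_t write_and_clean(uint64_t *entry, uint64_t val)
{
    *entry = val;            /* may not be delayed past the flush */
    clean_obj(entry);
    assert(*entry == val);   /* cleaning never changes the stored value */
    return *entry;
}
```

Variant B has to be a macro (or be typed per object) because the size
of the "m" operand comes from the type of *(p); a void * argument
would lose that information.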

For larger regions we should have a function with a single barrier at
the top and then a loop of DCCMVAC writes.  For single objects smaller
than a cacheline we need to pass the object itself as a memory input
operand.  Probably we should also have a compile-time check that the
object is smaller than the smallest supported cacheline (i.e. one
DCCMVAC is enough).
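
Something along these lines, as a sketch only.  It assumes GCC-style
extended asm and C11; MIN_CACHELINE_BYTES (32 here) is a guessed
placeholder, and clean_dcache_range and clean_dcache_obj are
hypothetical names, not functions from the patch.  The size check
does not catch an object that straddles a line boundary, so the macro
also assumes the object is naturally aligned.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Assumption: smallest cacheline we support; the real value is per-CPU. */
#define MIN_CACHELINE_BYTES 32

#if defined(__arm__)
#define DSB_INSN   "dsb;"
#define CLEAN_INSN "mcr p15, 0, %0, c7, c10, 1;"   /* DCCMVAC */
#define ISB_INSN   "isb;"
#else  /* empty templates: only the constraints and the loop are exercised */
#define DSB_INSN   ""
#define CLEAN_INSN ""
#define ISB_INSN   ""
#endif

/* Larger regions: one barrier at the top, then a loop of DCCMVAC writes.
 * Returns the number of line cleans issued. */
static inline size_t clean_dcache_range(const void *start, size_t len)
{
    /* Round down to a line boundary; stepping by the minimum line size
     * is safe even if the real line is larger (lines get cleaned twice). */
    uintptr_t p = (uintptr_t)start & ~(uintptr_t)(MIN_CACHELINE_BYTES - 1);
    uintptr_t end = (uintptr_t)start + len;
    size_t lines = 0;

    __asm__ volatile (DSB_INSN : : : "memory");   /* the single barrier */
    for ( ; p < end; p += MIN_CACHELINE_BYTES, lines++ )
        __asm__ volatile (CLEAN_INSN : : "r" (p));
    __asm__ volatile (ISB_INSN);
    return lines;
}

/* Single small object: pass the object itself as a memory input operand
 * and refuse at compile time anything one DCCMVAC cannot cover. */
#define clean_dcache_obj(p) do {                                          \
    static_assert(sizeof(*(p)) <= MIN_CACHELINE_BYTES,                    \
                  "object too big for a single DCCMVAC");                 \
    __asm__ volatile (DSB_INSN CLEAN_INSN ISB_INSN                        \
                      : : "r" (p), "m" (*(p)));                           \
} while (0)
```

A caller would use clean_dcache_obj(&pte) for one entry and
clean_dcache_range(table, sizeof(table)) for a whole table.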

Tim.
