On 01/15/2016 01:57 AM, Will Deacon wrote:

I think you figured this out while I was sleeping, but just to confirm:

  1. The MIPS64 ISA doc [1] talks about SYNC in a way that applies only
     to memory accesses appearing in *program-order* before the SYNC

  2. We need WRC+sync+addr to work, which means that the SYNC in P1 must
     also capture the store in P0 as being "before" the barrier. Leonid
     reckons it works, but his explanation [2] focussed on the address
     dependency in P2 as to why this works. If that is the case (i.e.
     address dependency provides global transitivity), then WRC+addr+addr
     should also work (even though it's not required).

No, that is not correct. There is one old design in which a thread's
loads can be satisfied from the core's write buffers (shared by thread0
and thread1) before the buffered store is visible to other cores. This
means that WRC+sync+addr passes, because of the SYNC in the writing
thread and the register dependency in the other thread, but WRC+addr+addr
may fail, because another core may read stale data.
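
To make the argument concrete, here is a small sketch that enumerates the
WRC outcomes in a toy model. Everything in it is an assumption made for
illustration and not a description of any real MIPS core: P0 and P1 are
two threads sharing one core and one write buffer, P2 runs on another
core and sees memory only, buffered stores drain in any order, SYNC
completes only once the buffer is empty, and program order in P2 stands
in for the address dependency. The function name wrc_outcomes is
hypothetical.

```python
def wrc_outcomes(sync_in_p1):
    """Return every reachable (r1, r2, r3) for WRC in the toy model."""
    # P0: x = 1 | P1: r1 = x; [SYNC]; y = 1 | P2: r2 = y; r3 = x
    p1 = ["load_x"] + (["sync"] if sync_in_p1 else []) + ["store_y"]
    programs = (["store_x"], p1, ["load_y", "load_x"])
    outcomes, seen = set(), set()

    def explore(pcs, buf, mem, regs):
        state = (pcs, buf, mem, regs)
        if state in seen:
            return
        seen.add(state)
        if all(pc == len(prog) for pc, prog in zip(pcs, programs)):
            outcomes.add(regs)
            return
        # Assumption: buffered stores drain to memory in any order.
        for var in buf:
            explore(pcs, buf - {var}, mem | {var}, regs)
        for tid, prog in enumerate(programs):
            pc = pcs[tid]
            if pc == len(prog):
                continue
            op = prog[pc]
            nxt = pcs[:tid] + (pc + 1,) + pcs[tid + 1:]
            if op.startswith("store_"):
                # All stores write 1 and land in the shared write buffer.
                explore(nxt, buf | {op[-1]}, mem, regs)
            elif op == "sync":
                # Assumption: SYNC completes only once the buffer is empty.
                if not buf:
                    explore(nxt, buf, mem, regs)
            elif tid == 1:
                # Same-core load: may be satisfied from the write buffer.
                r1 = int("x" in buf or "x" in mem)
                explore(nxt, buf, mem, (r1, regs[1], regs[2]))
            else:
                # Other-core loads see memory only.
                val = int(op[-1] in mem)
                if op == "load_y":
                    explore(nxt, buf, mem, (regs[0], val, regs[2]))
                else:
                    explore(nxt, buf, mem, (regs[0], regs[1], val))

    explore((0, 0, 0), frozenset(), frozenset(), (None, None, None))
    return outcomes
```

In this toy model the outcome (r1, r2, r3) = (1, 1, 0) is unreachable
with wrc_outcomes(True), the SYNC case, and reachable with
wrc_outcomes(False), where P1 only orders its store after its load.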

  3. It seems that WRC+addr+addr doesn't work, so I'm still suspicious
     about WRC+sync+addr, because neither the architecture document nor
     Leonid's explanation tell me that it should be forbidden.


[1] https://imgtec.com/?do-download=4302
[2] http://lkml.kernel.org/r/569565DA.2010903@xxxxxxxxxx (scroll to the end)

