
Re: [Xen-devel] [PATCH] ARM: cache coherence problem in guestcopy.c



On Tue, 2013-06-18 at 11:22 +0000, Jaeyong Yoo wrote:
> > 
> > So I think we probably actually need the dcache flush in domain_map_page
> > at the "/* Commandeer this 2MB slot */" point. In that context I don't
> > think we can avoid flushing anything other than the complete 2MB
> > mapping. Does this work for you too?
> 
> I am not sure that this would work. If we map_domain_page and
> unmap_domain_page with the same mfn over and over again while the ref
> count never drops to zero (say it stays at 5), then the flush is never
> called. And I think we do need the flush, for the reason below:
> 
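
Just to check we mean the same path: the reuse logic is roughly the
following (a toy model with made-up names, not the literal mm.c code),
and the flush I was proposing would only ever run on the "commandeer"
branch:

    #include <stdint.h>

    /* Toy model of the per-slot state, made-up names. */
    struct slot {
        uint64_t mfn;         /* 2MB-aligned mfn currently mapped (0 = free) */
        unsigned int refcnt;  /* outstanding map_domain_page() users         */
    };

    static void flush_whole_slot(struct slot *s)
    {
        /* stand-in for a dcache clean+invalidate of the whole 2MB mapping */
        (void)s;
    }

    static void toy_map(struct slot *s, uint64_t mfn_2mb_aligned)
    {
        if ( s->mfn == mfn_2mb_aligned && s->refcnt > 0 )
        {
            /* Your case: the same mfn is mapped again and again while the
             * refcount never drops to zero, so the branch below (and any
             * flush in it) is never reached. */
            s->refcnt++;
        }
        else
        {
            /* "Commandeer this 2MB slot" -- the only place the flush I
             * suggested would run. */
            flush_whole_slot(s);
            s->mfn = mfn_2mb_aligned;
            s->refcnt = 1;
        }
    }

So yes, if the refcount never drops to zero a flush only at that point
never fires on the reuse path.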
> > 
> > The laziness of the remappings makes me wonder though. Do you know if
> > the slot is reused between step #2 and #3?  Otherwise I'd expect us to
> > reuse the existing mapping with the cache intact. The caches are PIPT so
> > I wouldn't expect the address aliasing to be an issue. Unless the
> > mapping is reused for something else I'm not too sure where the cache
> > pollution is coming from.
> 
> Let me try to explain in more detail.
> We can consider the DomU as a producer (writing values to the mfn) and
> the hypervisor as a consumer (reading values from the mfn). While DomU
> is invoking multiple hypercalls, it reuses the same mfn and the same
> mapping in the xen page table.
> 
>        (consumer)             (producer)
>            xen                   DomU
>              \                    /   (writing path)
>             (cache)              /
>                  \              /
> (reading path)    \           /
> _______________________________________
>                     |   mfn   |        (physical memory)
> ---------------------------------------
> 
> As the figure shows, xen "may" keep reading the cached value while
> DomU is writing different values to the mfn.

But all of the caches on this platform are PIPT (right?) so isn't it
actually:

       (consumer)             (producer)
            xen                   DomU
              \                 /   (writing path)
               \               /
                \             /
 (reading path)  \           /
                  \         /
                    (cache)
                       ||
                       ||
                       \/
 _______________________________________
                     |   mfn   |        (physical memory)
 ---------------------------------------
 

Or are you saying that the writing path is uncached?
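
(For concreteness, the "reading path" in both pictures is essentially
the memcpy in raw_copy_from_guest(); a stripped-down sketch, with
hypothetical *_sketch helper names rather than the literal guestcopy.c
code, looks something like this:

    #include <stddef.h>
    #include <string.h>

    #define PAGE_SHIFT 12
    #define PAGE_SIZE  (1UL << PAGE_SHIFT)

    typedef unsigned long long paddr_t;

    /* Stand-ins for the real p2m walk and domain-page mapping helpers. */
    paddr_t gvirt_to_maddr_sketch(const void *gvaddr);
    void *map_domain_page_sketch(unsigned long mfn);
    void unmap_domain_page_sketch(void *va);

    static unsigned long copy_from_guest_sketch(void *to, const void *from,
                                                unsigned long len)
    {
        while ( len )
        {
            paddr_t maddr = gvirt_to_maddr_sketch(from);
            unsigned long offs = maddr & (PAGE_SIZE - 1);
            unsigned long size = len < PAGE_SIZE - offs ? len
                                                        : PAGE_SIZE - offs;
            char *p = map_domain_page_sketch(maddr >> PAGE_SHIFT);

            /* This memcpy is the read we are worried about: if the line
             * holding the guest's hypercall args is stale here, we copy
             * garbage. */
            memcpy(to, p + offs, size);

            unmap_domain_page_sketch(p);
            to = (char *)to + size;
            from = (const char *)from + size;
            len -= size;
        }
        return 0;
    }

so the question is what state that cache line is in by the time the
memcpy runs.)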

I was chatting with Tim and he suggested that the issue might be the
ReOrder Buffer, which is virtually tagged. In that case a DMB ought to
be sufficient and not a full cache flush, we think.

We were also speculating that we probably want some DMBs in
context_switch_{from,to} as well as at return_to_guest.
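
Something like the below is the sort of thing we mean -- sketch only,
with dmb_sy() just a stand-in for whatever barrier helper we end up
using, and the exact barrier (sy vs ish) and placement still to be
worked out:

    #include <string.h>

    /* The hypothesis: a barrier instead of a full dcache flush. */
    static inline void dmb_sy(void)
    {
        asm volatile("dmb sy" ::: "memory");
    }

    static void read_guest_args(void *dst, const void *guest_mapping,
                                size_t len)
    {
        dmb_sy();                        /* barrier before the read      */
        memcpy(dst, guest_mapping, len); /* the read that currently sees
                                          * stale data                   */
    }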

>  Here is my observation of where the
> cache pollution happens:
> 
> The pollution actually happens in the second cache line.
> The DomU-side hypercall param's local address is 0xc785fe38 (cache line
> size = 0x40 bytes), and the size of the hypercall param is 24 bytes, so
> the param spans two cache lines. When the hypervisor reads the hypercall
> param, it reads the first 8 bytes correctly (meaning the first cache
> line is flushed) and the other 16 bytes are polluted (meaning the second
> cache line is not flushed).
> Honestly, I'm not sure why the first cache line is flushed and the
> second is not.
> I think we could also cache_line_align the hypercall param struct, but
> that only helps when every hypercall param is smaller than the cache
> line size.
> 
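
FWIW the 8/16 split matches the cache line arithmetic exactly:

    #include <stdio.h>

    int main(void)
    {
        unsigned long va   = 0xc785fe38UL; /* address of the hypercall param */
        unsigned long line = 0x40;         /* 64-byte cache lines            */
        unsigned long size = 24;           /* size of the hypercall param    */

        unsigned long offs      = va & (line - 1); /* 0x38 = 56 bytes in     */
        unsigned long in_first  = line - offs;     /* 8 bytes                */
        unsigned long in_second = size - in_first; /* 16 bytes               */

        printf("%lu bytes in the first line, %lu in the second\n",
               in_first, in_second);
        return 0;
    }

i.e. the param starts 0x38 bytes into a line, so 8 bytes land in the
first line and 16 in the second -- exactly the correct/polluted split
you describe.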
> I hope the alignment of the figure is not broken :)
> 
> Best,
> Jaeyong


