
Re: [PATCH 9/9] xen/x86: track dirty pCPU caches for a given vCPU



On Tue, May 06, 2025 at 12:16:00PM +0100, Andrew Cooper wrote:
> On 06/05/2025 9:31 am, Roger Pau Monne wrote:
> > When a guest is allowed access to cache control operations, such tracking
> > avoids having to issue a system-wide cache flush, and instead just flushes
> > the pCPUs where the vCPU has been scheduled since the last flush.
> >
> > Note that domain-wide flushes accumulate the dirty caches from all the
> > vCPUs, but clearing the vCPU masks would require pausing all vCPUs, which
> > seems overkill.  Instead leave the vCPU dirty masks as-is; worst case it
> > will result in redundant flushes in later calls.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> I'm afraid this doesn't work.
> 
> Unlike TLBs, dirty cacheline can move sideways, e.g. by foreign or grant
> mapping, but also naturally because of how cache coherency works.

Does such sideways movement also imply that a local WB{NO,}INVD on
native could be equally bogus?

According to the SDM, cache lines can indeed move between processor
caches, but the memory controller must always snoop such moves and
flush the data to memory:

"Here, the processor with the valid data may pass the data to the
other processors without actually writing it to system memory;
however, it is the responsibility of the memory controller to snoop
this operation and update memory."

So a cache line moving sideways will always be propagated to memory as
part of the move, and hence the data in the previous pCPU's cache will
always hit memory.

grants/foreign maps are indeed complex, but sharing non-coherent memory
across domains seems like a recipe for disaster.  It could maybe be
mitigated by doing host-wide cache flushes, but how does the mapping
domain know whether the source domain has possibly dirty caches that
need flushing?  IMO it's the source domain that would need to flush any
cached contents first, and then share the memory.
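
As a purely illustrative sketch of that ordering (none of this is
existing Xen or Linux code; clflush_line(), flush_page_before_grant()
and their parameters are made up for the example), the granting side
would push its potentially dirty lines to memory before the share
becomes visible to the peer:

#include <stddef.h>

#define CACHE_LINE_SIZE 64

static inline void clflush_line(const volatile void *p)
{
    /* Write the cache line containing p back to memory and invalidate it. */
    asm volatile ( "clflush %0" :: "m" (*(const volatile char *)p) );
}

/* Flush every cache line of a buffer before granting/sharing it. */
static void flush_page_before_grant(const void *page, size_t size)
{
    for ( size_t off = 0; off < size; off += CACHE_LINE_SIZE )
        clflush_line((const char *)page + off);

    /* Order the flushes before the grant becomes visible to the peer. */
    asm volatile ( "mfence" ::: "memory" );
}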

FWIW (and not saying this is correct), KVM uses the same model of
tracking dirty caches; see the wbinvd_dirty_mask field in struct
kvm_vcpu_arch.
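
To make the comparison concrete, here is a minimal sketch of that
model, assuming Linux-style cpumask helpers (the structure and function
names are illustrative, not the actual KVM code):

#include <linux/cpumask.h>
#include <linux/smp.h>
#include <asm/special_insns.h>

struct vcpu_cache_track {
    cpumask_var_t dirty_mask;     /* pCPUs this vCPU may have dirtied */
};

/* Record the current pCPU whenever the vCPU is scheduled onto it. */
static void mark_cache_dirty(struct vcpu_cache_track *t)
{
    cpumask_set_cpu(smp_processor_id(), t->dirty_mask);
}

static void wbinvd_ipi(void *unused)
{
    wbinvd();
}

/* On an intercepted guest WBINVD, flush only the recorded pCPUs. */
static void emulate_wbinvd(struct vcpu_cache_track *t)
{
    int cpu = get_cpu();

    cpumask_set_cpu(cpu, t->dirty_mask);
    on_each_cpu_mask(t->dirty_mask, wbinvd_ipi, NULL, 1);
    put_cpu();
    cpumask_clear(t->dirty_mask);
}

Clearing the mask only on the flush path matches the trade-off in the
patch description: the mask may over-approximate, at worst causing a
redundant flush later.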

> We need to use the guarantees given to us by the architecture to simply
> nop out cache flushes when safe to do so.

We already do this when possible AFAICT.

> Everything else is either a shootdown (clflush/opt/clwb, and doesn't
> even trap to Xen), or needs to be a global WB{NO,}INVD.  Partial WBINVDs
> are of no value.

What about Intel hardware without the capability to trap WBINVD?  Xen
currently flushes the cache of the previous pCPU in case the vCPU has
moved around, see vmx_do_resume().
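
For reference, a rough sketch of that existing behaviour (this only
approximates what vmx_do_resume() does; the field and helper names are
simplified and the passthrough/IOMMU-snoop checks are omitted):

/* If the vCPU resumes on a different pCPU and WBINVD cannot be
 * intercepted, flush the cache of the pCPU it last ran on. */
static void resume_flush_prev_cpu(const struct vcpu *v)
{
    int prev = v->arch.hvm.vmx.active_cpu;  /* last pCPU this vCPU ran on */

    if ( prev != -1 && prev != (int)smp_processor_id() &&
         !cpu_has_wbinvd_exiting )
        flush_mask(cpumask_of(prev), FLUSH_CACHE);
}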

Thanks, Roger.
