Re: [PATCH 2/4] VT-d / x86: re-arrange cache syncing
On 01.12.2021 15:39, Andrew Cooper wrote:
> On 01/12/2021 09:40, Jan Beulich wrote:
>> The actual function should always have lived in core x86 code; move it
>> there, replacing get_cache_line_size() by readily available (except very
>> early during boot; see the code comment) data.
>>
>> Drop the respective IOMMU hook, (re)introducing a respective boolean
>> instead. Replace a true and an almost open-coding instance of
>> iommu_sync_cache().
>
> Coherency (or not) is a per-IOMMU property, not a global platform
> property. iGPU IOMMUs are very different to those in the uncore, and
> there's no reason to presume that IOMMUs in the PCH would have the same
> properties as those in the uncore.
>
> Given how expensive sync_cache() is, we cannot afford to be using it for
> coherent IOMMUs in a mixed system.
>
> This wants to be a boolean in arch_iommu.

That's a valid consideration, but it may not be as easy as it seems on the
surface. Certainly not something I could promise to find time for soon.
And it is definitely separate from the specific change here.

>> ---
>> Placing the function next to flush_area_local() exposes a curious
>> asymmetry between the SFENCE placements: sync_cache() has it after the
>> flush, while flush_area_local() has it before it. I think the latter one
>> is misplaced.
>
> Wow, this is a mess.
>
> First and foremost, AMD state that on pre-CLFLUSHOPT parts, CLFLUSH is
> unordered with ~anything (including SFENCE), and needs bounding with
> MFENCE on both sides. We definitely aren't doing this correctly right
> now.
>
> AMD explicitly states that SFENCE drains the store and WC buffers (i.e.
> makes stuff instantaneously globally visible). Intel does not, and
> merely guarantees ordering.
>
> A leading SFENCE would only make sense if there were WC concerns, but
> both manuals say that the memory type doesn't matter, so I can't see a
> justification for it.
>
> What does matter, from the IOMMU's point of view, is that the line has
> been written back (or evicted on pre-CLWB parts) before the IOTLB
> invalidation occurs. The invalidation will be a write to a different
> address, which is why the trailing SFENCE is necessary, as CLFLUSHOPT
> isn't ordered with respect to unaliasing writes.

IOW for the purposes of this change all is correct, and everything else
will require taking care of separately.

> On a more minor note, both Intel and AMD say that CLFLUSH* are permitted
> to target an execute-only code segment, so we need a fix in
> hvmemul_cache_op()'s use of hvmemul_virtual_to_linear(...,
> hvm_access_read, ...), which will currently #GP in this case.

Hmm, this specific case would seem to require simply using hvm_access_none
(like hvmemul_tlb_op() already does), except for CLWB. But then
hvm_vcpu_virtual_to_linear() also doesn't look to handle
hvm_access_insn_fetch correctly for data segments. Changing that would in
turn require first auditing all use sites, to make sure we don't break
anything. I'll see about doing both.

>> --- a/xen/arch/x86/flushtlb.c
>> +++ b/xen/arch/x86/flushtlb.c
>> @@ -11,6 +11,7 @@
>>  #include <xen/sched.h>
>>  #include <xen/smp.h>
>>  #include <xen/softirq.h>
>> +#include <asm/cache.h>
>>  #include <asm/flushtlb.h>
>>  #include <asm/invpcid.h>
>>  #include <asm/nops.h>
>> @@ -265,6 +266,57 @@ unsigned int flush_area_local(const void
>>      return flags;
>>  }
>>
>> +void sync_cache(const void *addr, unsigned int size)
>
> Can we name this cache_writeback()? sync is very generic, and it really
> doesn't want confusing with cache_flush().

Oh, sure, can do. As long as you don't insist on also changing
iommu_sync_cache(): While purely mechanical, that would bloat the patch
by quite a bit.

Bottom line: This last item is the only change to the actual patch;
everything else will require further work yielding separate patches.

Jan
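[Editorial addendum: for readers following the SFENCE discussion above, here is a minimal user-space sketch of the cache-writeback pattern being debated. It is not Xen's implementation: it uses the compiler intrinsics `_mm_clflush`/`_mm_sfence` instead of CLFLUSHOPT/CLWB with a runtime-detected line size, and the hard-coded line size is an assumption.]

```c
#include <stdint.h>
#include <emmintrin.h> /* _mm_clflush (SSE2), pulls in _mm_sfence (SSE) */

/*
 * Hypothetical sketch of the helper discussed above (the real Xen code
 * uses CLFLUSHOPT/CLWB where available and reads the line size from CPU
 * data). 64 bytes is assumed here as the typical x86 cache line size.
 */
#define CACHE_LINE_SIZE 64

static void cache_writeback(const void *addr, unsigned int size)
{
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t end = (uintptr_t)addr + size;

    /* Flush every cache line overlapping [addr, addr + size). */
    for ( ; p < end; p += CACHE_LINE_SIZE )
        _mm_clflush((const void *)p);

    /*
     * Trailing SFENCE: required with CLFLUSHOPT, which is not ordered
     * against later stores to other addresses (such as the IOTLB
     * invalidation write). It is redundant for plain CLFLUSH, which is
     * self-ordering on Intel; shown here only to mirror the pattern
     * under discussion.
     */
    _mm_sfence();
}
```

The key point from the thread is visible in the comment: the fence goes after the flush loop, because what must be ordered is the writeback completing before a subsequent store (the IOTLB invalidation) to an unaliasing address.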