[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/arm: fix gnttab_need_iommu_mapping



Hi Stefano,

On 06/02/2021 00:38, Stefano Stabellini wrote:
Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.

Doh :/.

The offending chunk is: >
  #define gnttab_need_iommu_mapping(d)                    \
-    (is_domain_direct_mapped(d) && need_iommu(d))
+    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))

On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
directly mapped, like the old check did,

This is not entirely correct, we only need gnttab_need_iommu_mapping() to return true when the domain is direct mapped **and** the IOMMU is enabled for the domain.

but the new check is always
false. >
In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
need_sync is set as:

     if ( !is_hardware_domain(d) || iommu_hwdom_strict )
         hd->need_sync = !iommu_use_hap_pt(d); >
iommu_hwdom_strict is actually supposed to be ignored on ARM, see the
definition in docs/misc/xen-command-line.pandoc:

     This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
     other domains in the system don't live in a compatible address space), and
     is ignored for ARM.

But aside from that, the issue is that iommu_use_hap_pt(d) is true,
hence, hd->need_sync is false, and gnttab_need_iommu_mapping(d) is false
too.

need_sync means that you have a separate IOMMU page-table and they need to be updated for every change.

hap_pt means the page-table used by the IOMMU is the P2M.

For Arm, we always shared the P2M with the IOMMU.


As a consequence, when using PV network from a domU on a system where
IOMMU is on from Dom0, I get:

(XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, 
iova=0x8424cb148, fsynr=0xb0001, cb=0
[   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK

The fix is to go back to the old implementation of
gnttab_need_iommu_mapping.  However, we don't even need to specify &&
need_iommu(d) since we don't actually need to check for the IOMMU to be
enabled (iommu_map does it for us at the beginning of the function.)

gnttab_need_iommu_mapping() doesn't only gate the iommu_legacy_{,un}map() call but also decides whether we need to held both the local and remote grant-table write lock for the duration of the operation (see double_gt_lock()).

I'd like to avoid the requirement to held the double_gt_lock() if there is the domain is going to use the IOMMU.


This fix is preferrable to changing the implementation of need_sync or
iommu_use_hap_pt because "need_sync" is not really the reason why we
want gnttab_need_iommu_mapping to return true.

In 4.13, we introduced is_iommu_enabled() (see commit c45f59292367 "domain: introduce XEN_DOMCTL_CDF_iommu flag") that should do the job for this patch.

For 4.12, we could use iommu_enabled as in general dom0 will use an IOMMU if Xen enable it globally. Note that 4.12 is only security supported since last October (see [1]). So this would be up to patch there tree.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
Backport: 4.12+

I would suggest to use Fixes: tag if you know the exact commit. This would make easier for downstream users if they backported the offending patch.


---

It is incredible that it was missed for this long, but it takes a full
PV drivers test using DMA from a non-coherent device to trigger it, e.g.
wget from a domU to an HTTP server on a different machine, ping or
connections to dom0 won't trigger the bug.

Great finding!


It is interesting that given that IOMMU is on for dom0, Linux could
have just avoided using swiotlb-xen and everything would have just
worked. It is worth considering introducing a feature flag (e.g.
XENFEAT_ARM_dom0_iommu) to let dom0 know that the IOMMU is on and
swiotlb-xen is not necessary.
Then Linux can avoid initializing
swiotlb-xen and just rely on the IOMMU for translation.

The presence of an IOMMU on the system doesn't necessarily indicate that all the devices will be protected by an IOMMU. We can only turn off the swiotlb-xen when we know that **all** the devices are protected.

Therefore a simple feature flag is not going to do the job. Instead, we need to tell Linux which devices has been protected by an IOMMU. This is something I attempted to do a few years ago (see [2]).

In addition to that, we also need to know whether Linux is capable to disable swiotlb-xen. This would allow us to turn off all the mitigation we introduced in Xen for direct mapped domain. One possibility would be to introduce ELF note like for Arm (see [3]).


diff --git a/xen/include/asm-arm/grant_table.h 
b/xen/include/asm-arm/grant_table.h
index 6f585b1538..2a154d1851 100644
--- a/xen/include/asm-arm/grant_table.h
+++ b/xen/include/asm-arm/grant_table.h
@@ -88,8 +88,7 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t 
mfn,
  #define gnttab_status_gfn(d, t, i)                                       \
      (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
-#define gnttab_need_iommu_mapping(d) \
-    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
+#define gnttab_need_iommu_mapping(d)  (is_domain_direct_mapped(d))
#endif /* __ASM_GRANT_TABLE_H__ */
  /*


Cheers,

[1] https://xenbits.xen.org/docs/unstable/support-matrix.html
[2] https://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/234523.html [3] https://patchwork.kernel.org/project/linux-arm-kernel/patch/5342AF59.3030405@xxxxxxxxxx/


--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.