[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1] xen/arm: smmuv3: Add support for removing devices





On 4/7/26 12:58, Mykyta Poturai wrote:

Hello Mykyta

Allow for removing devices from SMMUv3. arm_smmu_deassign_dev handles
most of the work by disabling ATS and zeroing STEs. Additionally, unset
the dt_device_is_protected flag and free no longer needed smmu_master.

Tested on QEMU with SRIOV series[1] by repeatedly enabling/disabling
VFs.

[1]: https://patchew.org/Xen/cover.1772806036.git.mykyta._5Fpoturai@xxxxxxxx/

Signed-off-by: Mykyta Poturai <mykyta_poturai@xxxxxxxx>
---
  xen/drivers/passthrough/arm/smmu-v3.c | 59 +++++++++++++++++++++++++++
  xen/include/xen/device_tree.h         |  5 +++
  2 files changed, 64 insertions(+)

diff --git a/xen/drivers/passthrough/arm/smmu-v3.c 
b/xen/drivers/passthrough/arm/smmu-v3.c
index bf153227db..b5b834a7b7 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -1493,6 +1493,64 @@ static int arm_smmu_assign_dev(struct domain *d, u8 
devfn, struct device *dev,
  static int arm_smmu_deassign_dev(struct domain *d, uint8_t devfn,
                                 struct device *dev);
+static int arm_smmu_remove_device(u8 devfn, struct device *dev)
+{
+       int ret = 0;
+       struct arm_smmu_master *master;
+       struct iommu_fwspec *fwspec;
+
+       fwspec = dev_iommu_fwspec_get(dev);
+       if ( !fwspec )
+               return -ENODEV;
+
+       master = dev_iommu_priv_get(dev);
+       if ( !master )
+               return -ENODEV;
+
+#ifdef CONFIG_HAS_PCI
+       if ( dev_is_pci(dev) )
+       {
+               struct pci_dev *pdev = dev_to_pci(dev);

As Luca has already noticed in a separate email regarding phantom functions:

If I understand the code correctly, the iommu_remove_device() loops over PCI phantom functions, calling ops->remove_device() multiple times for the exact same struct device *dev. Because you omitted the phantom function check (which exists in arm_smmu_add_device), the code will process the first phantom function, fall through to out_free, and destroy the master struct. When the iommu_remove_device() subsequently calls remove_device() for the main devfn, dev_iommu_priv_get(dev) will return NULL, and the removal will abort with -ENODEV.

Should not we ignore phantom functions at the top of the PCI block, just like in the add_device()? if ( devfn != pdev->devfn ) return 0;


+
+               if ( pdev->domain )
+               {
+                       ret = arm_smmu_deassign_dev(pdev->domain, devfn, dev);
+                       if ( ret )
+                               printk(XENLOG_WARNING "Failed to deassign device %pp 
from SMMU\n",
+                                       &pdev->sbdf);

What worries me is a possible state corruption on deassign failure. If arm_smmu_deassign_dev() fails, the code prints a warning but continues execution here ...


+               }
+       }
+#endif
+
+       if ( !dev_is_pci(dev) )


NIT: I think that you could simplify the PCI/platform device split to avoid evaluating dev_is_pci(dev) twice by using #else block.

+       {
+               if ( !dt_device_is_protected(dev_to_dt(dev)) )
+               {
+                       dev_err(dev, "Not added to SMMUv3\n");
+                       ret = -ENODEV;
+                       goto out_free;
+               }
+
+               if ( master->domain && master->domain->d )
+               {
+                       ret = arm_smmu_deassign_dev(master->domain->d, 0, dev);
+                       if ( ret )
+                               dev_warn(dev, "Failed to deassign device from 
SMMU\n");
+               }

  ... and here.

It falls through to the bottom of the function where it frees the master struct. Because you return an error code, the IOMMU framework (specifically in the DT path) will abort the removal. At least, iommu_remove_dt_device() sees the error code and skips freeing the fwspec. But because you freed the master struct, the SMMUv3 driver has lost track of the device, while the common code might think it is still assigned and functional...

Should not we bail out immediately and return the error without freeing master or altering the protected state if arm_smmu_deassign_dev() fails?

Or am I missing something?



+               dt_device_unset_protected(dev_to_dt(dev));
+       }
+
+       arm_smmu_disable_pasid(master);
+
+       dev_info(dev, "Removed master device (SMMUv3 %s StreamIds %u)\n",
+                dev_name(fwspec->iommu_dev), fwspec->num_ids);
+
+out_free:
+       xfree(master);
+       dev_iommu_priv_set(dev, NULL);
+       return ret;
+}
+
  static int arm_smmu_add_device(u8 devfn, struct device *dev)
  {
        int i, ret;
@@ -2867,6 +2925,7 @@ static const struct iommu_ops arm_smmu_iommu_ops = {
        .unmap_page             = arm_iommu_unmap_page,
        .dt_xlate               = arm_smmu_dt_xlate,
        .add_device             = arm_smmu_add_device,
+       .remove_device          = arm_smmu_remove_device,
  };
static __init int arm_smmu_dt_init(struct dt_device_node *dev,
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 06d7643622..1f9608cdcd 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -305,6 +305,11 @@ static inline void dt_device_set_protected(struct 
dt_device_node *device)
      device->is_protected = true;
  }
+static inline void dt_device_unset_protected(struct dt_device_node *device)
+{
+    device->is_protected = false;
+}

NIT: Rather than introducing new helper, it might be possible to update the existing helper to take a boolean: e.g., dt_device_set_protected(struct dt_device_node *device, bool protect). But doing this would require updating existing callers, I am not sure if that broader refactoring is necessary, so your current unset approach is acceptable.


+
  static inline bool dt_device_is_protected(const struct dt_device_node *device)
  {
      return device->is_protected;




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.