[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [patch-4.16] arm/smmuv1,v2: Protect smmu master list with a lock





On 27/10/2021 11:41, Michal Orzel wrote:
Hi Julien,

Hi Michal,

On 26.10.2021 18:56, Julien Grall wrote:
Hi,

On 26/10/2021 17:28, Julien Grall wrote:
On 26/10/2021 13:29, Michal Orzel wrote:
If a device is added to SMMUv1/v2 from DT and PCI
at the same time, there is a concurrent access
to a smmu master list. This could lead to a
scenario where one is looking into a list that
is being modified at the same time. Add a lock
to prevent this issue.

Did you intend to say "Hold" rather than "Add"?

Yes, this is what I meant. I will change it.


Reuse the existing spinlock arm_smmu_devices_lock
as it is already protecting find_smmu_master.

ipmmu-smmu and smmuv3 are not impacted by this
issue as there is no access or modification of
a global resource during add_device.

Signed-off-by: Michal Orzel <michal.orzel@xxxxxxx>
---
This patch aims for 4.16 release.
Benefits:
Remove a bug that could lead to a corruption of the
smmu master list, which would be very hard to debug.

  From my understanding, this corruption would only happen with CONFIG_HAS_PCI. 
At the moment, this is a experimental feature as it is not fully complete.

Actually, digging through the code, I noticed that iommu_add_device() assume 
that it can only be called with platform/DT. Thankfully, AFAICT, the function 
would return -ENXIO because the fwspec would not be allocated for PCI device.

So was this patch tested with additional patch on top?

The purpose of this patch is to fix the issue that is present and which you 
thankfully noticed.
There is still a lot of patches to be added that aim to support PCI passthrough.
The complete PCI series is tested with SMMU and it works.
But there is no chance to test this patch standalone as iommu_add_device is not 
in the correct form for PCI as of now.

Ok. I would suggest to say this is a latent bug so it make clear that the patch is more for hardening at the moment.

   xen/drivers/passthrough/arm/smmu.c | 25 +++++++++++++++++++------
   1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index c9dfc4caa0..be62a66a28 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -820,21 +820,25 @@ static int arm_smmu_dt_add_device_legacy(struct 
arm_smmu_device *smmu,
                        struct device *dev,
                        struct iommu_fwspec *fwspec)
   {
-    int i;
+    int i, ret;
       struct arm_smmu_master *master;
       struct device_node *dev_node = dev_get_dev_node(dev);
+    spin_lock(&arm_smmu_devices_lock);
       master = find_smmu_master(smmu, dev_node);
       if (master) {
           dev_err(dev,
               "rejecting multiple registrations for master device %s\n",
               dev_node->name);
-        return -EBUSY;
+        ret = -EBUSY;
+        goto out_unlock;
       }
       master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL);
-    if (!master)
-        return -ENOMEM;
+    if (!master) {
+        ret = -ENOMEM;
+        goto out_unlock;
+    }
       master->of_node = dev_node;
       /* Xen: Let Xen know that the device is protected by an SMMU */
@@ -846,11 +850,17 @@ static int arm_smmu_dt_add_device_legacy(struct 
arm_smmu_device *smmu,
               dev_err(dev,
                   "stream ID for master device %s greater than maximum allowed 
(%d)\n",
                   dev_node->name, smmu->num_mapping_groups);
-            return -ERANGE;
+            ret = -ERANGE;
+            goto out_unlock;
           }
           master->cfg.smendx[i] = INVALID_SMENDX;
       }
-    return insert_smmu_master(smmu, master);
+
+    ret = insert_smmu_master(smmu, master);
+
+out_unlock:
+    spin_unlock(&arm_smmu_devices_lock);

Please add a newline here.

Ok.

+    return ret;
   }
   static int register_smmu_master(struct arm_smmu_device *smmu,
@@ -2056,7 +2066,10 @@ static int arm_smmu_add_device(struct device *dev)
       } else {
           struct arm_smmu_master *master;
+        spin_lock(&arm_smmu_devices_lock);
           master = find_smmu_master(smmu, dev->of_node);
+        spin_unlock(&arm_smmu_devices_lock);

At the moment, unlocking here is fine because we don't remove the device. 
However, there are a series to supporting removing a device (see [1]). So I 
think it would be preferable to unlock after the last use of 'cfg'.

Ok. I will move unlocking to the end of this else {} block. I was not aware of 
the patch you are referring to.

I think the end of the else is still too early. This needs to at least be past iommu_group_set_iommudata() because we store cfg.

Potentially, the lock wants to also englobe arm_smmu_master_alloc_smes(). So I am wondering whether it would be simpler to hold the lock for the whole duration of arm_smmu_add_device() (I can see value when we will want to interlock with the remove code).


Looking at the code, I also noticed that the error path is not correct for at 
least the PCI device and we would leak memory. We also assume that Stream ID == 
Requester ID. Are both of them in your radar for PCI passthrough?

I agree with you. I also noticed it. However this is going to be fixed with the 
next patch series when Rahul gets back.

Good. Do you have a todo list for PCI passthrough that can be shared?

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.