[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [patch-4.16] arm/smmuv1,v2: Protect smmu master list with a lock



Hi Julien,

On 26.10.2021 18:56, Julien Grall wrote:
> Hi,
> 
> On 26/10/2021 17:28, Julien Grall wrote:
>> On 26/10/2021 13:29, Michal Orzel wrote:
>>> If a device is added to SMMUv1/v2 from DT and PCI
>>> at the same time, there is a concurrent access
>>> to a smmu master list. This could lead to a
>>> scenario where one is looking into a list that
>>> is being modified at the same time. Add a lock
>>> to prevent this issue.
>>
>> Did you intend to say "Hold" rather than "Add"?
>>
Yes, this is what I meant. I will change it.

>>>
>>> Reuse the existing spinlock arm_smmu_devices_lock
>>> as it is already protecting find_smmu_master.
>>>
>>> ipmmu-smmu and smmuv3 are not impacted by this
>>> issue as there is no access or modification of
>>> a global resource during add_device.
>>>
>>> Signed-off-by: Michal Orzel <michal.orzel@xxxxxxx>
>>> ---
>>> This patch aims for 4.16 release.
>>> Benefits:
>>> Remove a bug that could lead to a corruption of the
>>> smmu master list, which would be very hard to debug.
>>
>>  From my understanding, this corruption would only happen with 
>> CONFIG_HAS_PCI. At the moment, this is a experimental feature as it is not 
>> fully complete.
> 
> Actually, digging through the code, I noticed that iommu_add_device() assume 
> that it can only be called with platform/DT. Thankfully, AFAICT, the function 
> would return -ENXIO because the fwspec would not be allocated for PCI device.
> 
> So was this patch tested with additional patch on top?
> 
The purpose of this patch is to fix the issue that is present and which you 
thankfully noticed.
There is still a lot of patches to be added that aim to support PCI passthrough.
The complete PCI series is tested with SMMU and it works.
But there is no chance to test this patch standalone as iommu_add_device is not 
in the correct form for PCI as of now.

>>
>>> Risks:
>>> Overall the risk is low as we are touching init code
>>> rather than a runtime one. In case of any issue, the
>>> problem would be catched during system boot or guest
>>> start.
>>
>> With the PCI feature enabled, this can be called at runtime as this plumbed 
>> through a PHYSDEVOP. That said, it doesn't matter here as for supported code 
>> (platform/DT device), this will only happen during boot.
>>
>> The patch looks straighforward, so I would not mind to have it in Xen 4.16. 
>> Ian, what do you think?
>>
>>> ---
>>>   xen/drivers/passthrough/arm/smmu.c | 25 +++++++++++++++++++------
>>>   1 file changed, 19 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/xen/drivers/passthrough/arm/smmu.c 
>>> b/xen/drivers/passthrough/arm/smmu.c
>>> index c9dfc4caa0..be62a66a28 100644
>>> --- a/xen/drivers/passthrough/arm/smmu.c
>>> +++ b/xen/drivers/passthrough/arm/smmu.c
>>> @@ -820,21 +820,25 @@ static int arm_smmu_dt_add_device_legacy(struct 
>>> arm_smmu_device *smmu,
>>>                        struct device *dev,
>>>                        struct iommu_fwspec *fwspec)
>>>   {
>>> -    int i;
>>> +    int i, ret;
>>>       struct arm_smmu_master *master;
>>>       struct device_node *dev_node = dev_get_dev_node(dev);
>>> +    spin_lock(&arm_smmu_devices_lock);
>>>       master = find_smmu_master(smmu, dev_node);
>>>       if (master) {
>>>           dev_err(dev,
>>>               "rejecting multiple registrations for master device %s\n",
>>>               dev_node->name);
>>> -        return -EBUSY;
>>> +        ret = -EBUSY;
>>> +        goto out_unlock;
>>>       }
>>>       master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL);
>>> -    if (!master)
>>> -        return -ENOMEM;
>>> +    if (!master) {
>>> +        ret = -ENOMEM;
>>> +        goto out_unlock;
>>> +    }
>>>       master->of_node = dev_node;
>>>       /* Xen: Let Xen know that the device is protected by an SMMU */
>>> @@ -846,11 +850,17 @@ static int arm_smmu_dt_add_device_legacy(struct 
>>> arm_smmu_device *smmu,
>>>               dev_err(dev,
>>>                   "stream ID for master device %s greater than maximum 
>>> allowed (%d)\n",
>>>                   dev_node->name, smmu->num_mapping_groups);
>>> -            return -ERANGE;
>>> +            ret = -ERANGE;
>>> +            goto out_unlock;
>>>           }
>>>           master->cfg.smendx[i] = INVALID_SMENDX;
>>>       }
>>> -    return insert_smmu_master(smmu, master);
>>> +
>>> +    ret = insert_smmu_master(smmu, master);
>>> +
>>> +out_unlock:
>>> +    spin_unlock(&arm_smmu_devices_lock);
>>
>> Please add a newline here.
>>
Ok.

>>> +    return ret;
>>>   }
>>>   static int register_smmu_master(struct arm_smmu_device *smmu,
>>> @@ -2056,7 +2066,10 @@ static int arm_smmu_add_device(struct device *dev)
>>>       } else {
>>>           struct arm_smmu_master *master;
>>> +        spin_lock(&arm_smmu_devices_lock);
>>>           master = find_smmu_master(smmu, dev->of_node);
>>> +        spin_unlock(&arm_smmu_devices_lock);
>>
>> At the moment, unlocking here is fine because we don't remove the device. 
>> However, there are a series to supporting removing a device (see [1]). So I 
>> think it would be preferable to unlock after the last use of 'cfg'.
>>
Ok. I will move unlocking to the end of this else {} block. I was not aware of 
the patch you are referring to.

>> Looking at the code, I also noticed that the error path is not correct for 
>> at least the PCI device and we would leak memory. We also assume that Stream 
>> ID == Requester ID. Are both of them in your radar for PCI passthrough?
>>
I agree with you. I also noticed it. However this is going to be fixed with the 
next patch series when Rahul gets back.

>> Cheers,
>>
>> [1] 
>> https://lore.kernel.org/xen-devel/1630562763-390068-9-git-send-email-fnu.vikram@xxxxxxxxxx/
>>
> 
> Cheers,
>
Cheers,
Michal



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.