[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v7 01/12] vpci: introduce per-domain lock to protect vpci structure


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 5 Jul 2023 11:13:03 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=FBxPdtL2k+f4qliEAI+mXmeIC4RVSonltR3JhEIQmbE=; b=IvALZep3JIw/jAkR0Qz3BuWo5RLYyfB2j1he4Lpk/CLE2Efs9Ce+aFhJ5dZ5mdhf+LHcoiL3OcB2x7mOU1fYdWTHAiPMuVeflYHrS05c6f2ok6gHCfRVOMtYotfR0opugJoJqE/gyfzUrnH+PG97WDaiV9phHVPcez+qMPhEs/6ojNn8YUPNoQipOAPxkcbVkCWPEWzfNMQk1knm4+3bYROnIZYPZdY9kY3M2NXVkN9zq2MkVY0FWoi8ViH9mzuiHE+cV1JpKvd6u/A3C69xAJxs+O0KNze5y29+IZwHneDFWCDXmyvrz3gxGjCRtptcSH3xK722e73y6nWMymTcrA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=BrHmlSK/zm0oG6NGBSpgVwSAVumzhhDefHD0n6SHm669mhalEMDI7ehNSVMb3XwB9dVM0nOtKozzLdhHqFsqKdz2ZQJJLfz5xLru6eTCdUsxIjqVNFQbwJm5SGTtiAyXOkU6Pr1eilPkbRYk1cHfBJOa9bkpaMsiwxpWBxyLXzBnUODXutLlZK5drf0f1GeIjjEV65LKNCALnoJWm6LZxP3Cp7oE+/6NAcDGIf5zJga6BttHfOjop3S/XyGHPNsDSPTQz8pChO29AucLVMdMeO12dJ91dSqIiRChhY/cIzOiZDTXchPTOdz/X8+RsB4Vsq2+aMTFD9rmDHxTlYanxQ==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Wed, 05 Jul 2023 09:13:53 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 05.07.2023 10:59, Roger Pau Monné wrote:
> On Wed, Jul 05, 2023 at 09:11:10AM +0200, Jan Beulich wrote:
>> On 04.07.2023 23:03, Volodymyr Babchuk wrote:
>>> I am currently implementing your proposal (along with Jan's
>>> suggestions), but I am facing ABBA deadlock with IOMMU's
>>> reassign_device() call, which has this piece of code:
>>>
>>>         list_move(&pdev->domain_list, &target->pdev_list);
>>>
>>> My immediate change was:
>>>
>>>         write_lock(&pdev->domain->pci_lock);
>>>         list_del(&pdev->domain_list);
>>>         write_unlock(&pdev->domain->pci_lock);
>>>
>>>         write_lock(&target->pci_lock);
>>>         list_add(&pdev->domain_list, &target->pdev_list);
>>>         write_unlock(&target->pci_lock);
>>>
>>> But this will not work because reassign_device is called from
>>> pci_release_devices() which iterates over d->pdev_list, so we need to
>>> take a d->pci_lock early.
>>>
>>> Any suggestions on how to fix this? My idea is to remove a device from a
>>> list one at time:
>>>
>>> int pci_release_devices(struct domain *d)
>>> {
>>>     struct pci_dev *pdev;
>>>     u8 bus, devfn;
>>>     int ret;
>>>
>>>     pcidevs_lock();
>>>     write_lock(&d->pci_lock);
>>>     ret = arch_pci_clean_pirqs(d);
>>>     if ( ret )
>>>     {
>>>         pcidevs_unlock();
>>>         write_unlock(&d->pci_lock);
>>>         return ret;
>>>     }
>>>
>>>     while ( !list_empty(&d->pdev_list) )
>>>     {
>>>         pdev = list_entry(&d->pdev_list, struct pci_dev, domain_list);
>>>         bus = pdev->bus;
>>>         devfn = pdev->devfn;
>>>         list_del(&pdev->domain_list);
>>>         write_unlock(&d->pci_lock);
>>>         ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret;
>>>         write_lock(&d->pci_lock);
>>
>> I think it needs doing almost like this, but with two more tweaks and
>> no list_del() right here (first and foremost to avoid needing to
>> figure whether removing early isn't going to subtly break anything;
>> see below for an error case that would end up with changed behavior):
>>
>>     while ( !list_empty(&d->pdev_list) )
>>     {
>>         const struct pci_dev *pdev = list_first_entry(&d->pdev_list, struct 
>> pci_dev, domain_list);
>>         uint16_t seg = pdev->seg;
>>         uint8_t bus = pdev->bus;
>>         uint8_t devfn = pdev->devfn;
>>
>>         write_unlock(&d->pci_lock);
> 
> I think you need to remove the device from the pdev_list before
> dropping the lock, or else release could race with other operations.
> 
> That's unlikely, but still if the lock is dropped and the routine
> needs to operate on the device it is better remove such device from
> the domain so other operations cannot get a reference to it.
> 
> Otherwise you could modify reassign_device() implementations so they
> require the caller to hold the source->pci_lock when calling the
> routine, but that's ugly because the lock would need to be dropped in
> order to reassign the device from source to target domains.
> 
> Another option would be to move the whole d->pdev_list to a local
> variable (so that d->pdev_list would be empty) and then iterate over
> it without the d->pci_lock.  On failure you would take the lock and
> add the failing device back into d->pdev_list.

Conceptually I like this last variant, but like the individual list_del()
it requires auditing code for no dependency on the device still being on
that list. In fact deassign_device()'s use of pci_get_pdev() does. The
function would then need changing to have struct pci_dev * passed in.
Yet who knows where else there are uses of pci_get_pdev() lurking.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.