[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v7 01/12] vpci: introduce per-domain lock to protect vpci structure
On Wed, Jul 05, 2023 at 09:11:10AM +0200, Jan Beulich wrote: > On 04.07.2023 23:03, Volodymyr Babchuk wrote: > > I am currently implementing your proposal (along with Jan's > > suggestions), but I am facing ABBA deadlock with IOMMU's > > reassign_device() call, which has this piece of code: > > > > list_move(&pdev->domain_list, &target->pdev_list); > > > > My immediate change was: > > > > write_lock(&pdev->domain->pci_lock); > > list_del(&pdev->domain_list); > > write_unlock(&pdev->domain->pci_lock); > > > > write_lock(&target->pci_lock); > > list_add(&pdev->domain_list, &target->pdev_list); > > write_unlock(&target->pci_lock); > > > > But this will not work because reassign_device is called from > > pci_release_devices() which iterates over d->pdev_list, so we need to > > take a d->pci_lock early. > > > > Any suggestions on how to fix this? My idea is to remove a device from a > > list one at time: > > > > int pci_release_devices(struct domain *d) > > { > > struct pci_dev *pdev; > > u8 bus, devfn; > > int ret; > > > > pcidevs_lock(); > > write_lock(&d->pci_lock); > > ret = arch_pci_clean_pirqs(d); > > if ( ret ) > > { > > pcidevs_unlock(); > > write_unlock(&d->pci_lock); > > return ret; > > } > > > > while ( !list_empty(&d->pdev_list) ) > > { > > pdev = list_entry(&d->pdev_list, struct pci_dev, domain_list); > > bus = pdev->bus; > > devfn = pdev->devfn; > > list_del(&pdev->domain_list); > > write_unlock(&d->pci_lock); > > ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret; > > write_lock(&d->pci_lock); > > I think it needs doing almost like this, but with two more tweaks and > no list_del() right here (first and foremost to avoid needing to > figure whether removing early isn't going to subtly break anything; > see below for an error case that would end up with changed behavior): > > while ( !list_empty(&d->pdev_list) ) > { > const struct pci_dev *pdev = list_first_entry(&d->pdev_list, struct > pci_dev, domain_list); > uint16_t seg = pdev->seg; > uint8_t bus = pdev->bus; > uint8_t devfn = pdev->devfn; > > write_unlock(&d->pci_lock); I think you need to remove the device from the pdev_list before dropping the lock, or else release could race with other operations. That's unlikely, but still if the lock is dropped and the routine needs to operate on the device it is better remove such device from the domain so other operations cannot get a reference to it. Otherwise you could modify reassign_device() implementations so they require the caller to hold the source->pci_lock when calling the routine, but that's ugly because the lock would need to be dropped in order to reassign the device from source to target domains. Another option would be to move the whole d->pdev_list to a local variable (so that d->pdev_list would be empty) and then iterate over it without the d->pci_lock. On failure you would take the lock and add the failing device back into d->pdev_list. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |