[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v9 01/16] pci: introduce per-domain PCI rwlock
On Tue, Aug 29, 2023 at 11:19:42PM +0000, Volodymyr Babchuk wrote: > Add per-domain d->pci_lock that protects access to > d->pdev_list. Purpose of this lock is to give guarantees to VPCI code > that underlying pdev will not disappear under feet. This is a rw-lock, > but this patch adds only write_lock()s. There will be read_lock() > users in the next patches. > > This lock should be taken in write mode every time d->pdev_list is > altered. This covers both accesses to d->pdev_list and accesses to > pdev->domain_list fields. Why do you mention pdev->domain_list here? I don't think the lock covers accesses to pdev->domain_list, unless that domain_list field happens to be part of the linked list in d->pdev_list. I find it kind of odd to mention here. > All write accesses also should be protected > by pcidevs_lock() as well. Idea is that any user that wants read > access to the list or to the devices stored in the list should use > either this new d->pci_lock or old pcidevs_lock(). Usage of any of > this two locks will ensure only that pdev of interest will not > disappear from under feet and that the pdev still will be assigned to > the same domain. Of course, any new users should use pcidevs_lock() > when it is appropriate (e.g. when accessing any other state that is > protected by the said lock). In case both the newly introduced > per-domain rwlock and the pcidevs lock is taken, the later must be > acquired first. > > Any write access to pdev->domain_list should be protected by both > pcidevs_lock() and d->pci_lock in the write mode. > > Suggested-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > Suggested-by: Jan Beulich <jbeulich@xxxxxxxx> > Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx> > > --- > > Changes in v9: > - returned back "pdev->domain = target;" in AMD IOMMU code > - used "source" instead of pdev->domain in IOMMU functions > - added comment about lock ordering in the commit message > - reduced locked regions > - minor changes non-functional changes in various places > > Changes in v8: > - New patch > > Changes in v8 vs RFC: > - Removed all read_locks after discussion with Roger in #xendevel > - pci_release_devices() now returns the first error code > - extended commit message > - added missing lock in pci_remove_device() > - extended locked region in pci_add_device() to protect list_del() calls > --- > xen/common/domain.c | 1 + > xen/drivers/passthrough/amd/pci_amd_iommu.c | 9 ++- > xen/drivers/passthrough/pci.c | 71 +++++++++++++++++---- > xen/drivers/passthrough/vtd/iommu.c | 9 ++- > xen/include/xen/sched.h | 1 + > 5 files changed, 78 insertions(+), 13 deletions(-) > > diff --git a/xen/common/domain.c b/xen/common/domain.c > index 304aa04fa6..9b04a20160 100644 > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -651,6 +651,7 @@ struct domain *domain_create(domid_t domid, > > #ifdef CONFIG_HAS_PCI > INIT_LIST_HEAD(&d->pdev_list); > + rwlock_init(&d->pci_lock); > #endif > > /* All error paths can depend on the above setup. */ > diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c > b/xen/drivers/passthrough/amd/pci_amd_iommu.c > index bea70db4b7..d219bd9453 100644 > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c > @@ -476,7 +476,14 @@ static int cf_check reassign_device( > > if ( devfn == pdev->devfn && pdev->domain != target ) > { > - list_move(&pdev->domain_list, &target->pdev_list); > + write_lock(&source->pci_lock); > + list_del(&pdev->domain_list); > + write_unlock(&source->pci_lock); > + > + write_lock(&target->pci_lock); > + list_add(&pdev->domain_list, &target->pdev_list); > + write_unlock(&target->pci_lock); > + > pdev->domain = target; While I don't think this is strictly an issue right now, it would be better to set pdev->domain before the device is added to domain_list. A pattern like: read_lock(d->pci_lock); for_each_pdev(d, pdev) foo(pdev->domain); read_unlock(d->pci_lock); Wouldn't work currently if the pdev is added to domain_list before the pdev->domain field is updated to reflect the new owner. > } > > diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c > index 33452791a8..79ca928672 100644 > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -454,7 +454,9 @@ static void __init _pci_hide_device(struct pci_dev *pdev) > if ( pdev->domain ) > return; > pdev->domain = dom_xen; > + write_lock(&dom_xen->pci_lock); > list_add(&pdev->domain_list, &dom_xen->pdev_list); > + write_unlock(&dom_xen->pci_lock); > } > > int __init pci_hide_device(unsigned int seg, unsigned int bus, > @@ -748,7 +750,9 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, > if ( !pdev->domain ) > { > pdev->domain = hardware_domain; > + write_lock(&hardware_domain->pci_lock); > list_add(&pdev->domain_list, &hardware_domain->pdev_list); > + write_unlock(&hardware_domain->pci_lock); > > /* > * For devices not discovered by Xen during boot, add vPCI handlers > @@ -758,7 +762,9 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, > if ( ret ) > { > printk(XENLOG_ERR "Setup of vPCI failed: %d\n", ret); > + write_lock(&hardware_domain->pci_lock); > list_del(&pdev->domain_list); > + write_unlock(&hardware_domain->pci_lock); > pdev->domain = NULL; > goto out; > } > @@ -766,7 +772,9 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, > if ( ret ) > { > vpci_remove_device(pdev); > + write_lock(&hardware_domain->pci_lock); > list_del(&pdev->domain_list); > + write_unlock(&hardware_domain->pci_lock); > pdev->domain = NULL; > goto out; > } > @@ -816,7 +824,11 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn) > pci_cleanup_msi(pdev); > ret = iommu_remove_device(pdev); > if ( pdev->domain ) > + { > + write_lock(&pdev->domain->pci_lock); > list_del(&pdev->domain_list); > + write_unlock(&pdev->domain->pci_lock); > + } > printk(XENLOG_DEBUG "PCI remove device %pp\n", &pdev->sbdf); > free_pdev(pseg, pdev); > break; > @@ -887,26 +899,61 @@ static int deassign_device(struct domain *d, uint16_t > seg, uint8_t bus, > > int pci_release_devices(struct domain *d) > { > - struct pci_dev *pdev, *tmp; > - u8 bus, devfn; > - int ret; > + int combined_ret; > + LIST_HEAD(failed_pdevs); > > pcidevs_lock(); > - ret = arch_pci_clean_pirqs(d); > - if ( ret ) > + > + combined_ret = arch_pci_clean_pirqs(d); > + if ( combined_ret ) > { > pcidevs_unlock(); > - return ret; > + return combined_ret; > } > - list_for_each_entry_safe ( pdev, tmp, &d->pdev_list, domain_list ) > + > + write_lock(&d->pci_lock); > + > + while ( !list_empty(&d->pdev_list) ) > { > - bus = pdev->bus; > - devfn = pdev->devfn; > - ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret; > + struct pci_dev *pdev = list_first_entry(&d->pdev_list, > + struct pci_dev, > + domain_list); > + uint16_t seg = pdev->seg; > + uint8_t bus = pdev->bus; > + uint8_t devfn = pdev->devfn; > + int ret; > + > + write_unlock(&d->pci_lock); > + ret = deassign_device(d, seg, bus, devfn); > + write_lock(&d->pci_lock); > + if ( ret ) > + { > + const struct pci_dev *tmp; > + > + /* > + * We need to check if deassign_device() left our pdev in > + * domain's list. As we dropped the lock, we can't be sure > + * that list wasn't permutated in some random way, so we > + * need to traverse the whole list. > + */ > + for_each_pdev ( d, tmp ) > + { > + if ( tmp == pdev ) > + { > + list_move_tail(&pdev->domain_list, &failed_pdevs); > + break; > + } > + } > + > + combined_ret = combined_ret ?: ret; > + } > } > + > + list_splice(&failed_pdevs, &d->pdev_list); > + write_unlock(&d->pci_lock); > pcidevs_unlock(); > > - return ret; > + return combined_ret; > } > > #define PCI_CLASS_BRIDGE_HOST 0x0600 > @@ -1125,7 +1172,9 @@ static int __hwdom_init cf_check > _setup_hwdom_pci_devices( > if ( !pdev->domain ) > { > pdev->domain = ctxt->d; > + write_lock(&ctxt->d->pci_lock); > list_add(&pdev->domain_list, &ctxt->d->pdev_list); > + write_unlock(&ctxt->d->pci_lock); > setup_one_hwdom_device(ctxt, pdev); > } > else if ( pdev->domain == dom_xen ) > diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index 0e3062c820..3228900c97 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -2806,7 +2806,14 @@ static int cf_check reassign_device_ownership( > > if ( devfn == pdev->devfn && pdev->domain != target ) > { > - list_move(&pdev->domain_list, &target->pdev_list); > + write_lock(&source->pci_lock); > + list_del(&pdev->domain_list); > + write_unlock(&source->pci_lock); > + > + write_lock(&target->pci_lock); > + list_add(&pdev->domain_list, &target->pdev_list); > + write_unlock(&target->pci_lock); > + > pdev->domain = target; Same comment as in reassign_device() above. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |