[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v8 01/13] pci: introduce per-domain PCI rwlock
Hi Roger, Roger Pau Monné <roger.pau@xxxxxxxxxx> writes: > On Thu, Jul 20, 2023 at 12:32:31AM +0000, Volodymyr Babchuk wrote: >> Add per-domain d->pci_lock that protects access to >> d->pdev_list. Purpose of this lock is to give guarantees to VPCI code >> that underlying pdev will not disappear under feet. This is a rw-lock, >> but this patch adds only write_lock()s. There will be read_lock() >> users in the next patches. >> >> This lock should be taken in write mode every time d->pdev_list is >> altered. This covers both accesses to d->pdev_list and accesses to >> pdev->domain_list fields. All write accesses also should be protected >> by pcidevs_lock() as well. Idea is that any user that wants read >> access to the list or to the devices stored in the list should use >> either this new d->pci_lock or old pcidevs_lock(). Usage of any of >> this two locks will ensure only that pdev of interest will not >> disappear from under feet and that the pdev still will be assigned to >> the same domain. Of course, any new users should use pcidevs_lock() >> when it is appropriate (e.g. when accessing any other state that is >> protected by the said lock). > > I think this needs a note about the ordering: > > "In case both the newly introduced per-domain rwlock and the pcidevs > lock is taken, the later must be acquired first." Thanks. Added. >> >> Any write access to pdev->domain_list should be protected by both >> pcidevs_lock() and d->pci_lock in the write mode. > > You also protect calls to vpci_remove_device() with the per-domain > pci_lock it seems, and that will need some explanation as it's not > obvious. Well, strictly speaking, it is not required in this patch. But it is needed in the next one. I can lock only "list_del(&pdev->domain_list);" end extend then locked area in the next patch. On other hand, this patch already protects vpci_add_handlers() call in the pci_add_device() due to the code layout, so it may be natural to protect vpci_remove_device() as well. What is your opinion? >> >> Suggested-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> >> Suggested-by: Jan Beulich <jbeulich@xxxxxxxx> >> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx> >> >> --- >> >> Changes in v8: >> - New patch >> >> Changes in v8 vs RFC: >> - Removed all read_locks after discussion with Roger in #xendevel >> - pci_release_devices() now returns the first error code >> - extended commit message >> - added missing lock in pci_remove_device() >> - extended locked region in pci_add_device() to protect list_del() calls >> --- >> xen/common/domain.c | 1 + >> xen/drivers/passthrough/amd/pci_amd_iommu.c | 9 ++- >> xen/drivers/passthrough/pci.c | 68 +++++++++++++++++---- >> xen/drivers/passthrough/vtd/iommu.c | 9 ++- >> xen/include/xen/sched.h | 1 + >> 5 files changed, 74 insertions(+), 14 deletions(-) >> >> diff --git a/xen/common/domain.c b/xen/common/domain.c >> index caaa402637..5d8a8836da 100644 >> --- a/xen/common/domain.c >> +++ b/xen/common/domain.c >> @@ -645,6 +645,7 @@ struct domain *domain_create(domid_t domid, >> >> #ifdef CONFIG_HAS_PCI >> INIT_LIST_HEAD(&d->pdev_list); >> + rwlock_init(&d->pci_lock); >> #endif >> >> /* All error paths can depend on the above setup. */ >> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c >> b/xen/drivers/passthrough/amd/pci_amd_iommu.c >> index 94e3775506..e2f2e2e950 100644 >> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c >> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c >> @@ -476,8 +476,13 @@ static int cf_check reassign_device( >> >> if ( devfn == pdev->devfn && pdev->domain != target ) >> { >> - list_move(&pdev->domain_list, &target->pdev_list); >> - pdev->domain = target; > > You seem to have inadvertently dropped the above line? (and so devices > would keep the previous pdev->domain value) > Oops, yes. Thank you. I was testing those patches on Intel machine, so AMD part left not verified. >> + write_lock(&pdev->domain->pci_lock); >> + list_del(&pdev->domain_list); >> + write_unlock(&pdev->domain->pci_lock); >> + >> + write_lock(&target->pci_lock); >> + list_add(&pdev->domain_list, &target->pdev_list); >> + write_unlock(&target->pci_lock); >> } >> >> /* >> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c >> index 95846e84f2..5b4632ead2 100644 >> --- a/xen/drivers/passthrough/pci.c >> +++ b/xen/drivers/passthrough/pci.c >> @@ -454,7 +454,9 @@ static void __init _pci_hide_device(struct pci_dev *pdev) >> if ( pdev->domain ) >> return; >> pdev->domain = dom_xen; >> + write_lock(&dom_xen->pci_lock); >> list_add(&pdev->domain_list, &dom_xen->pdev_list); >> + write_unlock(&dom_xen->pci_lock); >> } >> >> int __init pci_hide_device(unsigned int seg, unsigned int bus, >> @@ -747,6 +749,7 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, >> ret = 0; >> if ( !pdev->domain ) >> { >> + write_lock(&hardware_domain->pci_lock); >> pdev->domain = hardware_domain; >> list_add(&pdev->domain_list, &hardware_domain->pdev_list); >> >> @@ -760,6 +763,7 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, >> printk(XENLOG_ERR "Setup of vPCI failed: %d\n", ret); >> list_del(&pdev->domain_list); >> pdev->domain = NULL; >> + write_unlock(&hardware_domain->pci_lock); > > Strictly speaking, this could move one line earlier, as accesses to > pdev->domain are not protected by the d->pci_lock? Same in other > instances (above and below), as you seem to introduce a pattern to > perform accesses to pdev->domain with the rwlock taken. > Yes, you are right. I'll move the unlock() call. >> goto out; >> } >> ret = iommu_add_device(pdev); >> @@ -768,8 +772,10 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, >> vpci_remove_device(pdev); >> list_del(&pdev->domain_list); >> pdev->domain = NULL; >> + write_unlock(&hardware_domain->pci_lock); >> goto out; >> } >> + write_unlock(&hardware_domain->pci_lock); >> } >> else >> iommu_enable_device(pdev); >> @@ -812,11 +818,13 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn) >> list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list ) >> if ( pdev->bus == bus && pdev->devfn == devfn ) >> { >> + write_lock(&pdev->domain->pci_lock); >> vpci_remove_device(pdev); >> pci_cleanup_msi(pdev); >> ret = iommu_remove_device(pdev); >> if ( pdev->domain ) >> list_del(&pdev->domain_list); >> + write_unlock(&pdev->domain->pci_lock); > > Here you seem to protect more than strictly required, I would think > only the list_del() would need to be done holding the rwlock? > Yes, I believe this is a spill from a next patch. At first all those changes were introduced in "vpci: use per-domain PCI lock to protect vpci structure", but then I decided to split changes into two patches. >> printk(XENLOG_DEBUG "PCI remove device %pp\n", &pdev->sbdf); >> free_pdev(pseg, pdev); >> break; >> @@ -887,26 +895,62 @@ static int deassign_device(struct domain *d, uint16_t >> seg, uint8_t bus, >> >> int pci_release_devices(struct domain *d) >> { >> - struct pci_dev *pdev, *tmp; >> - u8 bus, devfn; >> - int ret; >> + int combined_ret; >> + LIST_HEAD(failed_pdevs); >> >> pcidevs_lock(); >> - ret = arch_pci_clean_pirqs(d); >> - if ( ret ) >> + write_lock(&d->pci_lock); >> + combined_ret = arch_pci_clean_pirqs(d); > > Why do you need the per-domain rwlock for arch_pci_clean_pirqs()? > That function doesn't modify the per-domain pdev list. You are right, I will correct this in the next version. > >> + if ( combined_ret ) >> { >> pcidevs_unlock(); >> - return ret; >> + write_unlock(&d->pci_lock); >> + return combined_ret; > > Ideally we would like to keep the same order on unlock, so the rwlock > should be released before the pcidevs lock (unless there's a reason > not to). I'll move write_lock() further below, so this will be fixed automatically. > >> } >> - list_for_each_entry_safe ( pdev, tmp, &d->pdev_list, domain_list ) >> + >> + while ( !list_empty(&d->pdev_list) ) >> { >> - bus = pdev->bus; >> - devfn = pdev->devfn; >> - ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret; >> + struct pci_dev *pdev = list_first_entry(&d->pdev_list, >> + struct pci_dev, >> + domain_list); >> + uint16_t seg = pdev->seg; >> + uint8_t bus = pdev->bus; >> + uint8_t devfn = pdev->devfn; >> + int ret; >> + >> + write_unlock(&d->pci_lock); >> + ret = deassign_device(d, seg, bus, devfn); >> + write_lock(&d->pci_lock); >> + if ( ret ) >> + { >> + bool still_present = false; >> + const struct pci_dev *tmp; >> + >> + /* >> + * We need to check if deassign_device() left our pdev in >> + * domain's list. As we dropped the lock, we can't be sure >> + * that list wasn't permutated in some random way, so we >> + * need to traverse the whole list. >> + */ >> + for_each_pdev ( d, tmp ) >> + { >> + if ( tmp == pdev ) >> + { >> + still_present = true; >> + break; >> + } >> + } >> + if ( still_present ) >> + list_move(&pdev->domain_list, &failed_pdevs); > > You can get rid of the still_present variable, and just do: > > for_each_pdev ( d, tmp ) > if ( tmp == pdev ) > { > list_move(&pdev->domain_list, &failed_pdevs); > break; > } > > Yep, thanks. -- WBR, Volodymyr
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |