[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6 2/3] xen/pci: introduce PF<->VF links
On 10/28/24 14:41, Roger Pau Monné wrote: > On Fri, Oct 18, 2024 at 04:39:09PM -0400, Stewart Hildebrand wrote: >> Add links between a VF's struct pci_dev and its associated PF struct >> pci_dev. Move the calls to pci_get_pdev()/pci_add_device() down to avoid >> dropping and re-acquiring the pcidevs_lock(). >> >> During PF removal, unlink VF from PF and mark the VF broken. As before, >> VFs may exist without a corresponding PF, although now only with >> pdev->broken = true. >> >> The hardware domain is expected to remove the associated VFs before >> removing the PF. Print a warning in case a PF is removed with associated >> VFs still present. >> >> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@xxxxxxx> >> --- >> Candidate for backport to 4.19 (the next patch depends on this one) >> >> v5->v6: >> * move printk() before ASSERT_UNREACHABLE() >> * warn about PF removal with VFs still present >> * clarify commit message >> >> v4->v5: >> * new patch, split from ("x86/msi: fix locking for SR-IOV devices") >> * move INIT_LIST_HEAD(&pdev->vf_list); earlier >> * collapse struct list_head instances >> * retain error code from pci_add_device() >> * unlink (and mark broken) VFs instead of removing them >> * const-ify VF->PF link >> --- >> xen/drivers/passthrough/pci.c | 76 ++++++++++++++++++++++++++++------- >> xen/include/xen/pci.h | 10 +++++ >> 2 files changed, 72 insertions(+), 14 deletions(-) >> >> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c >> index 74d3895e1ef6..fe31255b1207 100644 >> --- a/xen/drivers/passthrough/pci.c >> +++ b/xen/drivers/passthrough/pci.c >> @@ -333,6 +333,8 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, >> u8 bus, u8 devfn) >> *((u8*) &pdev->devfn) = devfn; >> pdev->domain = NULL; >> >> + INIT_LIST_HEAD(&pdev->vf_list); >> + >> arch_pci_init_pdev(pdev); >> >> rc = pdev_msi_init(pdev); >> @@ -449,6 +451,10 @@ static void free_pdev(struct pci_seg *pseg, struct >> pci_dev *pdev) >> >> list_del(&pdev->alldevs_list); >> pdev_msi_deinit(pdev); >> + >> + if ( pdev->info.is_virtfn && pdev->virtfn.pf_pdev ) > > Shouldn't having pdev->info.is_virtfn set already ensure that > pdev->virtfn.pf_pdev != NULL? In the current rev, the possibility exists, however unlikely, that a *buggy* dom0 may remove the PF before removing the VFs. In this case a VF would exist without a corresponding PF, and thus pdev->virtfn.pf_pdev is NULL. For the next rev, you're right, it'll be back to the situation where a VF can only exist with an associated PF. >> + list_del(&pdev->vf_list); >> + >> xfree(pdev); >> } >> >> @@ -656,24 +662,11 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, >> unsigned int slot = PCI_SLOT(devfn), func = PCI_FUNC(devfn); >> const char *type; >> int ret; >> - bool pf_is_extfn = false; >> >> if ( !info ) >> type = "device"; >> else if ( info->is_virtfn ) >> - { >> - pcidevs_lock(); >> - pdev = pci_get_pdev(NULL, >> - PCI_SBDF(seg, info->physfn.bus, >> - info->physfn.devfn)); >> - if ( pdev ) >> - pf_is_extfn = pdev->info.is_extfn; >> - pcidevs_unlock(); >> - if ( !pdev ) >> - pci_add_device(seg, info->physfn.bus, info->physfn.devfn, >> - NULL, node); >> type = "virtual function"; >> - } >> else if ( info->is_extfn ) >> type = "extended function"; >> else >> @@ -703,7 +696,44 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, >> * extended function. >> */ >> if ( pdev->info.is_virtfn ) >> - pdev->info.is_extfn = pf_is_extfn; >> + { >> + struct pci_dev *pf_pdev; > > This could be const? No, as we are doing this below: list_add(&pdev->vf_list, &pf_pdev->vf_list); >> + >> + pf_pdev = pci_get_pdev(NULL, >> + PCI_SBDF(seg, info->physfn.bus, >> + info->physfn.devfn)); > > You can probably initialize at declaration? OK >> + >> + if ( !pf_pdev ) > > Is this even feasible during correct operation? No, I don't think so. > IOW: shouldn't the PF > always be added first, so that SR-IOV can be enabled and the VFs added > afterwards? Yes, I think you're right. > I see previous code also catered for VFs being added without the PF > being present, so I assume there was some need for this. This is exactly the source of the confusion on my part. In the removal path, the consensus seems to be that removing a PF with VFs still present indicates an error. Then shouldn't the opposite also be true? Adding a VF without a PF should also indicate an error. I see the PF-adding logic was added in 942a6f1376d8 ("x86/PCI-MSI: properly determine VF BAR values"). Searching the mailing list archives didn't reveal much about it [0]. [0] https://lore.kernel.org/xen-devel/4E3FC6E102000078000501CA@xxxxxxxxxxxxxxxxxxxx/ The only time I've observed this path being taken is by manually calling PHYSDEVOP_pci_device_add. I've resorted to calling PHYSDEVOP_pci_device_{remove,add} from userspace in order to test this, because the Linux kernel doesn't behave this way. I can't think of a good rationale for catering to VFs being added without a PF, so I'll turn it into an error for the next rev. >> + { >> + ret = pci_add_device(seg, info->physfn.bus, >> info->physfn.devfn, >> + NULL, node); >> + if ( ret ) >> + { >> + printk(XENLOG_WARNING "Failed to add SR-IOV device PF >> %pp for VF %pp\n", > > Could you split this to make the line a bit shorter? > > printk(XENLOG_WARNING > "Failed to add SR-IOV device PF %pp for VF %pp\n", > > Same below. > >> + &PCI_SBDF(seg, info->physfn.bus, >> info->physfn.devfn), >> + &pdev->sbdf); >> + free_pdev(pseg, pdev); >> + goto out; >> + } >> + pf_pdev = pci_get_pdev(NULL, >> + PCI_SBDF(seg, info->physfn.bus, >> + info->physfn.devfn)); >> + if ( !pf_pdev ) >> + { >> + printk(XENLOG_ERR "Failed to find SR-IOV device PF %pp >> for VF %pp\n", >> + &PCI_SBDF(seg, info->physfn.bus, >> info->physfn.devfn), >> + &pdev->sbdf); > > If you want to add an error message here, I think it should mention > the fact this state is not expected: > > "Inconsistent PCI state: failed to find newly added PF %pp for VF %pp\n" > >> + ASSERT_UNREACHABLE(); >> + free_pdev(pseg, pdev); >> + ret = -EILSEQ; >> + goto out; >> + } >> + } >> + >> + pdev->info.is_extfn = pf_pdev->info.is_extfn; >> + pdev->virtfn.pf_pdev = pf_pdev; >> + list_add(&pdev->vf_list, &pf_pdev->vf_list); >> + } >> } >> >> if ( !pdev->info.is_virtfn && !pdev->vf_rlen[0] ) >> @@ -821,6 +851,24 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn) >> list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list ) >> if ( pdev->bus == bus && pdev->devfn == devfn ) >> { >> + if ( !pdev->info.is_virtfn ) > > Given we have no field to mark a device as a PF, we could check that > pdev->vf_list is not empty, and by doing so the warn_stale_vfs > variable could be dropped? > > if ( !pdev->info.is_virtfn && !list_empty(&pdev->vf_list) ) > { > struct pci_dev *vf_pdev; > > while ( (vf_pdev = list_first_entry_or_null(&pdev->vf_list, > struct pci_dev, > vf_list)) != NULL ) > { > list_del(&vf_pdev->vf_list); > vf_pdev->virtfn.pf_pdev = NULL; > vf_pdev->broken = true; > } > > printk(XENLOG_WARNING "PCI SR-IOV PF %pp removed with VFs still > present\n", > &pdev->sbdf); > } Yeah. Given that the consensus is leaning toward keeping the PF and returning an error, here's my suggestion: if ( !pdev->info.is_virtfn && !list_empty(&pdev->vf_list) ) { struct pci_dev *vf_pdev; list_for_each_entry(vf_pdev, &pdev->vf_list, vf_list) vf_pdev->broken = true; pdev->broken = true; printk(XENLOG_WARNING "Attempted to remove PCI SR-IOV PF %pp with VFs still present\n", &pdev->sbdf); ret = -EBUSY; break; } >> + { >> + struct pci_dev *vf_pdev, *tmp; >> + bool warn_stale_vfs = false; >> + >> + list_for_each_entry_safe(vf_pdev, tmp, &pdev->vf_list, >> vf_list) >> + { >> + list_del(&vf_pdev->vf_list); >> + vf_pdev->virtfn.pf_pdev = NULL; >> + vf_pdev->broken = true; >> + warn_stale_vfs = true; >> + } >> + >> + if ( warn_stale_vfs ) >> + printk(XENLOG_WARNING "PCI SR-IOV PF %pp removed with >> VFs still present\n", >> + &pdev->sbdf); >> + } >> + >> if ( pdev->domain ) >> { >> write_lock(&pdev->domain->pci_lock); >> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h >> index ef56e80651d6..2ea168d5f914 100644 >> --- a/xen/include/xen/pci.h >> +++ b/xen/include/xen/pci.h >> @@ -153,7 +153,17 @@ struct pci_dev { >> unsigned int count; >> #define PT_FAULT_THRESHOLD 10 >> } fault; >> + >> + /* >> + * List head if info.is_virtfn == false >> + * List entry if info.is_virtfn == true >> + */ >> + struct list_head vf_list; >> u64 vf_rlen[6]; >> + struct { >> + /* Only populated for VFs (info.is_virtfn == true) */ > > All comments here (specially the first ones) would better use PF and > VF consistently, rather than referring to other fields in the struct. > Specially because the fields can change names and the comments would > then become stale. OK >> + const struct pci_dev *pf_pdev; /* Link from VF to PF */ >> + } virtfn; > > I'm unsure you need an outer virtfn struct, as it's only one field in > this patch? Maybe more fields gets added by further patches? Right. There are no more fields to be added, so there's no need. It was leftover from a previous rev when vf_list was split. > > Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |