[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3] x86/msi: fix locking for SR-IOV devices
On 15.08.2024 03:28, Stewart Hildebrand wrote: > On 8/13/24 10:01, Jan Beulich wrote: >> On 12.08.2024 22:39, Stewart Hildebrand wrote: >>> In commit 4f78438b45e2 ("vpci: use per-domain PCI lock to protect vpci >>> structure") a lock moved from allocate_and_map_msi_pirq() to the caller >>> and changed from pcidevs_lock() to read_lock(&d->pci_lock). However, one >>> call path wasn't updated to reflect the change, leading to a failed >>> assertion observed under the following conditions: >>> >>> * PV dom0 >>> * Debug build (debug=y) of Xen >>> * There is an SR-IOV device in the system with one or more VFs enabled >>> * Dom0 has loaded the driver for the VF and enabled MSI-X >>> >>> (XEN) Assertion 'd || pcidevs_locked()' failed at >>> drivers/passthrough/pci.c:535 >>> (XEN) ----[ Xen-4.20-unstable x86_64 debug=y Not tainted ]---- >>> ... >>> (XEN) Xen call trace: >>> (XEN) [<ffff82d040284da8>] R pci_get_pdev+0x4c/0xab >>> (XEN) [<ffff82d040344f5c>] F arch/x86/msi.c#read_pci_mem_bar+0x58/0x272 >>> (XEN) [<ffff82d04034530e>] F >>> arch/x86/msi.c#msix_capability_init+0x198/0x755 >>> (XEN) [<ffff82d040345dad>] F arch/x86/msi.c#__pci_enable_msix+0x82/0xe8 >>> (XEN) [<ffff82d0403463e5>] F pci_enable_msi+0x3f/0x78 >>> (XEN) [<ffff82d04034be2b>] F map_domain_pirq+0x2a4/0x6dc >>> (XEN) [<ffff82d04034d4d5>] F allocate_and_map_msi_pirq+0x103/0x262 >>> (XEN) [<ffff82d04035da5d>] F physdev_map_pirq+0x210/0x259 >>> (XEN) [<ffff82d04035e798>] F do_physdev_op+0x9c3/0x1454 >>> (XEN) [<ffff82d040329475>] F pv_hypercall+0x5ac/0x6af >>> (XEN) [<ffff82d0402012d3>] F lstar_enter+0x143/0x150 >>> >>> In read_pci_mem_bar(), the VF obtains the struct pci_dev pointer for its >>> associated PF to access the vf_rlen array. This array is initialized in >>> pci_add_device() and is only populated in the associated PF's struct >>> pci_dev. >>> >>> Add a link from the VF's struct pci_dev to the associated PF struct >>> pci_dev, ensuring the PF's struct doesn't get deallocated until all its >>> VFs have gone away. Access the vf_rlen array via the new link to the PF, >>> and remove the troublesome call to pci_get_pdev(). >>> >>> Add a call to pci_get_pdev() inside the pcidevs_lock()-locked section of >>> pci_add_device() to set up the link from VF to PF. In case the new >>> pci_add_device() invocation fails to find the associated PF (returning >>> NULL), we are no worse off than before: read_pci_mem_bar() will still >>> return 0 in that case. >>> >>> Note that currently the only way for Xen to know if a device is a VF is >>> if the toolstack tells Xen about it. Using PHYSDEVOP_manage_pci_add for >>> a VF is not a case that Xen handles. >> >> How does the toolstack come into play here? It's still the Dom0 kernel to >> tell Xen, via PHYSDEVOP_pci_device_add (preferred) or >> PHYSDEVOP_manage_pci_add_ext (kind of deprecated; PHYSDEVOP_manage_pci_add >> is even more kind of deprecated). > > I guess I meant to say Dom0 kernel, not toolstack. I'm actually > questioning how much value this last portion of the commit description > is really adding. Maybe it would be better to just remove this bit. +1 >>> @@ -446,7 +448,27 @@ static void free_pdev(struct pci_seg *pseg, struct >>> pci_dev *pdev) >>> >>> list_del(&pdev->alldevs_list); >>> pdev_msi_deinit(pdev); >>> - xfree(pdev); >>> + >>> + if ( pdev->info.is_virtfn ) >>> + { >>> + struct pci_dev *pf_pdev = pdev->virtfn.pf_pdev; >>> + >>> + if ( pf_pdev ) >>> + { >>> + list_del(&pdev->virtfn.next); >>> + if ( pf_pdev->physfn.dealloc_pending && >>> + list_empty(&pf_pdev->physfn.vf_list) ) >>> + xfree(pf_pdev); >>> + } >>> + xfree(pdev); >>> + } >>> + else >>> + { >>> + if ( list_empty(&pdev->physfn.vf_list) ) >>> + xfree(pdev); >>> + else >>> + pdev->physfn.dealloc_pending = true; >>> + } >> >> Could I talk you into un-indenting the last if/else by a level, by making >> the earlier else and "else if()"? >> >> One aspect I didn't properly consider when making the suggestion: What if, >> without all VFs having gone away, the PF is re-added? In that case we >> would better recycle the existing structure. That's getting a little >> complicated, so maybe a better approach is to refuse the request (in >> pci_remove_device()) when the list isn't empty? > > Hm. Right. The growing complexity of maintaining these PF<->VF links > (introduced on a hunch that they may be useful in the future) is making > me favor the previous approach from v2 of simply copying the vf_len > array. So unless there are any objections I'll go back to that approach > for v4. I continue to consider the earlier approach at least undesirable. The vf_len[] array shouldn't be copied around, risking for it to be / go stale. There should be one central place for every bit of information, unless there are very good reasons to duplicate something. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |