[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 2/3] xen/pci: introduce PF<->VF links



On Mon, Nov 11, 2024 at 03:07:28PM -0500, Stewart Hildebrand wrote:
> On 10/28/24 14:41, Roger Pau Monné wrote:
> > On Fri, Oct 18, 2024 at 04:39:09PM -0400, Stewart Hildebrand wrote:
> > IOW: shouldn't the PF
> > always be added first, so that SR-IOV can be enabled and the VFs added
> > afterwards?
> 
> Yes, I think you're right.
> 
> > I see previous code also catered for VFs being added without the PF
> > being present, so I assume there was some need for this.
> 
> This is exactly the source of the confusion on my part. In the removal
> path, the consensus seems to be that removing a PF with VFs still
> present indicates an error. Then shouldn't the opposite also be true?
> Adding a VF without a PF should also indicate an error.
> 
> I see the PF-adding logic was added in 942a6f1376d8 ("x86/PCI-MSI:
> properly determine VF BAR values"). Searching the mailing list archives
> didn't reveal much about it [0].
> 
> [0] 
> https://lore.kernel.org/xen-devel/4E3FC6E102000078000501CA@xxxxxxxxxxxxxxxxxxxx/
> 
> The only time I've observed this path being taken is by manually
> calling PHYSDEVOP_pci_device_add. I've resorted to calling
> PHYSDEVOP_pci_device_{remove,add} from userspace in order to test this,
> because the Linux kernel doesn't behave this way.
> 
> I can't think of a good rationale for catering to VFs being added
> without a PF, so I'll turn it into an error for the next rev.

Maybe there's a case for a device to be discovered with SR-IOV already
enabled (ie: when booting in kexec like environments), but then I
would still expect the OS to first add the PF and afterwards the VFs.
Otherwise I'm not even sure whether the OS would be capable of
identifying the VFs as such.

> >> +                    ASSERT_UNREACHABLE();
> >> +                    free_pdev(pseg, pdev);
> >> +                    ret = -EILSEQ;
> >> +                    goto out;
> >> +                }
> >> +            }
> >> +
> >> +            pdev->info.is_extfn = pf_pdev->info.is_extfn;
> >> +            pdev->virtfn.pf_pdev = pf_pdev;
> >> +            list_add(&pdev->vf_list, &pf_pdev->vf_list);
> >> +        }
> >>      }
> >>  
> >>      if ( !pdev->info.is_virtfn && !pdev->vf_rlen[0] )
> >> @@ -821,6 +851,24 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
> >>      list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
> >>          if ( pdev->bus == bus && pdev->devfn == devfn )
> >>          {
> >> +            if ( !pdev->info.is_virtfn )
> > 
> > Given we have no field to mark a device as a PF, we could check that
> > pdev->vf_list is not empty, and by doing so the warn_stale_vfs
> > variable could be dropped?
> > 
> > if ( !pdev->info.is_virtfn && !list_empty(&pdev->vf_list) )
> > {
> >     struct pci_dev *vf_pdev;
> > 
> >     while ( (vf_pdev = list_first_entry_or_null(&pdev->vf_list,
> >                                                 struct pci_dev,
> >                                             vf_list)) != NULL )
> >     {
> >         list_del(&vf_pdev->vf_list);
> >         vf_pdev->virtfn.pf_pdev = NULL;
> >         vf_pdev->broken = true;
> >     }
> > 
> >     printk(XENLOG_WARNING "PCI SR-IOV PF %pp removed with VFs still 
> > present\n",
> >            &pdev->sbdf);
> > }
> 
> Yeah. Given that the consensus is leaning toward keeping the PF and
> returning an error, here's my suggestion:
> 
>     if ( !pdev->info.is_virtfn && !list_empty(&pdev->vf_list) )
>     {
>         struct pci_dev *vf_pdev;
> 
>         list_for_each_entry(vf_pdev, &pdev->vf_list, vf_list)
>             vf_pdev->broken = true;
> 
>         pdev->broken = true;

Do you need to mark the devices as broken?  My expectation would be
that returning -EBUSY here should prevent the device from being
removed, and hence there would be no breakage, just failure to fulfill
the (possible) hot-unplug request.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.