[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3 2/6] xen: pci: introduce reference counting for pdev
On 24.04.2023 16:15, Volodymyr Babchuk wrote: > > Hi Jan, > > Jan Beulich <jbeulich@xxxxxxxx> writes: > >> On 21.04.2023 16:13, Volodymyr Babchuk wrote: >>> >>> Hi Roger, >>> >>> Roger Pau Monné <roger.pau@xxxxxxxxxx> writes: >>> >>>> On Fri, Apr 21, 2023 at 11:00:23AM +0000, Volodymyr Babchuk wrote: >>>>> >>>>> Hello Roger, >>>>> >>>>> Roger Pau Monné <roger.pau@xxxxxxxxxx> writes: >>>>> >>>>>> On Mon, Apr 17, 2023 at 12:34:31PM +0200, Jan Beulich wrote: >>>>>>> On 17.04.2023 12:17, Roger Pau Monné wrote: >>>>>>>> On Fri, Apr 14, 2023 at 01:30:39AM +0000, Volodymyr Babchuk wrote: >>>>>>>>> Above I have proposed another view on this. I hope, it will work for >>>>>>>>> you. Just to reiterate, idea is to allow "harmless" refcounts to be >>>>>>>>> left >>>>>>>>> after returning from pci_remove_device(). By "harmless" I mean that >>>>>>>>> owners of those refcounts will not try to access the physical PCI >>>>>>>>> device if pci_remove_device() is already finished. >>>>>>>> >>>>>>>> I'm not strictly a maintainer of this piece code, albeit I have an >>>>>>>> opinion. I will like to also hear Jans opinion, since he is the >>>>>>>> maintainer. >>>>>>> >>>>>>> I'm afraid I can't really appreciate the term "harmless refcounts". >>>>>>> Whoever >>>>>>> holds a ref is entitled to access the device. As stated before, I see >>>>>>> only >>>>>>> two ways of getting things consistent: Either pci_remove_device() is >>>>>>> invoked upon dropping of the last ref, >>>>>> >>>>>> With this approach, what would be the implementation of >>>>>> PHYSDEVOP_manage_pci_remove? Would it just check whether the pdev >>>>>> exist and either return 0 or -EBUSY? >>>>>> >>>>> >>>>> Okay, I am preparing patches with the behavior you proposed. To test it, >>>>> I artificially set refcount to 2 and indeed PHYSDEVOP_manage_pci_remove >>>>> returned -EBUSY, which propagated to the linux driver. Problem is that >>>>> Linux driver can't do anything with this. It just displayed an error >>>>> message and removed device anyways. This is because Linux sends >>>>> PHYSDEVOP_manage_pci_remove in device_remove() call path and there is no >>>>> way to prevent the device removal. So, admin is not capable to try this >>>>> again. >>>> >>>> Ideally Linux won't remove the device, and then the admin would get to >>>> retry. Maybe the way the Linux hook is placed is not the best one? >>>> The hypervisor should be authoritative on whether a device can be >>>> removed or not, and hence PHYSDEVOP_manage_pci_remove returning an >>>> error (EBUSY or otherwise) shouldn't allow the device unplug in Linux >>>> to continue. >>> >>> Yes, it would be ideally, but Linux driver/device model is written in a >>> such way, that PCI subsystem tracks all the PCI device usage, so it can >>> be certain that it can remove the device. Thus, functions in the device >>> removal path either return void or 0. Of course, kernel does not know that >>> hypervisor has additional uses for the device, so there is no mechanisms >>> to prevent removal. >> >> Could pciback obtain a reference on behalf of the hypervisor, dropping it >> when device removal is requested (i.e. much closer to the start of that >> operation), and only if it finds that no guests use the device anymore? > > Yes, it can, it this indeed will hold a reference to a pci device for a > time, but there are some consideration that made this approach not > feasible. > > Basically, when an user writes to /sys/bus/pci/SBDF/remove, the > following happens: > > 1. /sys/bus/pci/SBFD/remove entry is removed - we can't retry the > operation anymore Looking at the comment ahead of pci_stop_and_remove_bus_device(), isn't this too late already. The text there says "has been removed", not e.g. "is about to be removed". (Of course chances are that it is the comment which is wrong; I know too little about Linux'es hot- unplug machinery.) Jan > [unimportant things] > > N. pci_stop_dev() function is called. This function unloads a device > driver. Any good behaving driver should drop all additional references > to a device at this point. > > [more unimportant things] > > M. PCI subsystem drops own reference to a generic device object > > So, as you can see, admin can't restart a "failed" attempt to remove a > PCI device. > > On other hand, remove() function can sleep. This allows us to pause > removal process a bit and check if hypervisor had finished removing a > PCI device on its side. But, as you pointed out, this can take weeks... >
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |