[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 2/6] xen: pci: introduce reference counting for pdev


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Date: Fri, 21 Apr 2023 13:02:33 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=qUyX2dYvxqrGMER1SUiKatz+f2i6jGdC1T0t6ZOvL+8=; b=B3emrQGUJj8WIWBcCt8pPkdXjdhifbPUombIDyCR5hBW4e+D2z+RJRG1qOyn38q3wmhHsfnZ7w9W99+5M8NKM1ZHoKFF/esoCr53P622dz+8H/+5Zm0va33JqW/ef1AdDj5Dv4C33fBrGucX1X8E4zTtA5vSfcMvP4OkpdHoffkZ9VBxwkxWpNTzNuXl8e5oRY+Cw/datVHQOxXNc9XVu3Rbtqvy1gDNsBpSHXpBVllBKjFkmvmqF+L1vx08U8+t5JIim1jQHF4mJ7sv41qMJSIbxnaTTCFuQSeVl2t2xA9TWUZNoWiG1SCDL2Kx6hxDe7BhPIRAmEMwhWOZdNTUVw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=dcc1i3mh3JNbsDdbYgryjrYaY4a9tf0et6CcHqQGaXpegtg6sSVfLd4V2vlYfmskhDbEG7caZwFdmbVRfKkINCz+ez6Z0WrKvJaqoArjghhScNng5cnWUndyV48rm8rGoLx2BcLD58AcKO1q5uxGEmqMHEfL2pyg50qXSc2P6gfTDEiiNTcrgoziVjXxwFGBtgcOUjwtEbTPYXfREVQ2G7EWIF0vgNEAhO6RbUtqDnQa4Be0hgaUzWMW6rPxOcrkOv1z/SQ+ngYXEoHCN98tFZ4aGaAqkXivCpD0YHPjaiMkYBzXqGNHfeMU3S8G0lKjnXskUnRLiYmD4EMbxdqZ8w==
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 21 Apr 2023 13:03:04 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHZVrdzsdckomMx4kauxHkZQ597Iq79mAMAgClI2ACAAK/igIAAVGGAgAGe14CAAJvOAIAFXoKAgAAEr4CAAAS6AIAGQ32AgAAf0YCAAAVEAA==
  • Thread-topic: [PATCH v3 2/6] xen: pci: introduce reference counting for pdev

Hi Jan,

Jan Beulich <jbeulich@xxxxxxxx> writes:

> On 21.04.2023 13:00, Volodymyr Babchuk wrote:
>> 
>> Hello Roger,
>> 
>> Roger Pau Monné <roger.pau@xxxxxxxxxx> writes:
>> 
>>> On Mon, Apr 17, 2023 at 12:34:31PM +0200, Jan Beulich wrote:
>>>> On 17.04.2023 12:17, Roger Pau Monné wrote:
>>>>> On Fri, Apr 14, 2023 at 01:30:39AM +0000, Volodymyr Babchuk wrote:
>>>>>> Above I have proposed another view on this. I hope, it will work for
>>>>>> you. Just to reiterate, idea is to allow "harmless" refcounts to be left
>>>>>> after returning from pci_remove_device(). By "harmless" I mean that
>>>>>> owners of those refcounts will not try to access the physical PCI
>>>>>> device if pci_remove_device() is already finished.
>>>>>
>>>>> I'm not strictly a maintainer of this piece code, albeit I have an
>>>>> opinion.  I will like to also hear Jans opinion, since he is the
>>>>> maintainer.
>>>>
>>>> I'm afraid I can't really appreciate the term "harmless refcounts". Whoever
>>>> holds a ref is entitled to access the device. As stated before, I see only
>>>> two ways of getting things consistent: Either pci_remove_device() is
>>>> invoked upon dropping of the last ref,
>>>
>>> With this approach, what would be the implementation of
>>> PHYSDEVOP_manage_pci_remove?  Would it just check whether the pdev
>>> exist and either return 0 or -EBUSY?
>>>
>> 
>> Okay, I am preparing patches with the behavior you proposed. To test it,
>> I artificially set refcount to 2 and indeed PHYSDEVOP_manage_pci_remove
>> returned -EBUSY, which propagated to the linux driver. Problem is that
>> Linux driver can't do anything with this. It just displayed an error
>> message and removed device anyways. This is because Linux sends
>> PHYSDEVOP_manage_pci_remove in device_remove() call path and there is no
>> way to prevent the device removal. So, admin is not capable to try this
>> again.
>
> So maybe Linux'es issuing of the call needs moving elsewhere? Or we need
> a new sub-op, such that PHYSDEVOP_manage_pci_remove can remain purely a
> last-moment notification?

From Linux point of view, it already cleaned up all the device resources
and it is ready to hot-unplug the device. Xen PCI driver in Linux just
gets a notification that device is being removed.

BTW, xen_pciback (AKA pci_stub) driver in Linux tracks that device is
assigned to another domain, but all it can do is to loudly complain in
kernel log if device is being removed without being deassigned from
another domain.

>
>> As I workaround, I can create hypercall continuation in case if
>> pci_remove_device() returns -EBUSY. What is your opinion?
>
> How would that help? You'd then spin perhaps for hours or days ...

Are you implying the case when we increase refcounter when we assign a
PCI device to a domain? In this case yes, it is quite possible that we
will spin there for any arbitrary amount of time...

-- 
WBR, Volodymyr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.