Xen project Mailing List

Re: [RFC PATCH 08/10] xen: pci: remove pcidev_[un]lock[ed] calls

From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>

Date: Thu, 9 Mar 2023 01:22:18 +0000

Accept-language: en-US

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Oqy12t0RQd/RNwTswnsTspjMP6itjiRNUtrLGauzMis=; b=U5HfksPQX+IP3BBCc0k6OCLQRu/uTnLCPri+u88uLEOa99DnqEqs4heg4pA71OC5Hj+K6hN6y48owd2w9zsMtS/pTcybgs2LUA0NlUvVtPbUkq7rT2gk4H0g5TkojhfiRRwR6u9BdsHVBK2NRoQ1veniMQ7QtPYM66K1hoTg8HsJ9XqmJN1rnEIFhW3HqEXbdynXAKBuJqy/3QdjRVIyLKS3ezpseFatbXqWh6fGTm4CJZm0x70IRldNpSpCDGo5QgGqvPQ+WYEQXIIVMFkzY6C5R+UnGCF4m+j7SQJJH1BN5uS7XPNPn+j+R5eY8pWqGxp2Hl1lLQPpeEBsMPWPBQ==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=EiA1fO0Rb7mg7qu37XFKckMLAjH+Mqp7AVBevBgQfsNgB9/B36mF/dqyqxVVj3oGuwiGmqmdUGFpYyKo9G3vrR/nqtHFlpoWzE70EuAurA+qGbUURevKdMCKTmQ5SbKj+LSqOeMWzlLdAzZf87cVmRcaTP+0MCExxTaoDOZ7NMQ5xxqnITw5PWzHPzwZ4LZ/fTA3Y4HN1or0MwQqCxLD7MtCdHQnhG7cde+nXFhnH+He5939sVEAr3r/7XSPBPYAB5UksMmq9TpplVp7JsUB+fACusTn8mdr0sqli7VjS07B2UeHnBWNP3bBJ9umn68Xv4g0YbzO10G+FJh/2p4hDw==

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Paul Durrant <paul@xxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>

Delivery-date: Thu, 09 Mar 2023 01:23:03 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-index: AQHYvUN/BRbyNewJG0OqhFjBUWUt7a6z9h8AgCWO7ACAALRSAIAYk4mA

Thread-topic: [RFC PATCH 08/10] xen: pci: remove pcidev_[un]lock[ed] calls

Hello Jan, Jan Beulich <jbeulich@xxxxxxxx> writes: > On 21.02.2023 00:13, Volodymyr Babchuk wrote: >> Stefano Stabellini <sstabellini@xxxxxxxxxx> writes: >>> On Wed, 31 Aug 2022, Volodymyr Babchuk wrote: >>>> As pci devices are refcounted now and all list that store them are >>>> protected by separate locks, we can safely drop global pcidevs_lock. >>>> >>>> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx> >>> >>> Up until this patch this patch series introduces: >>> - d->pdevs_lock to protect d->pdev_list >>> - pci_seg->alldevs_lock to protect pci_seg->alldevs_list >>> - iommu->ats_list_lock to protect iommu->ats_devices >>> - pdev refcounting to detect a pdev in-use and when to free it >>> - pdev->lock to protect pdev->msi_list >>> >>> They cover a lot of ground. Are they collectively covering everything >>> pcidevs_lock() was protecting? >> >> Well, that is the question. Those patch are in RFC stage because I can't >> fully answer your question. I tried my best to introduce proper locking, >> but apparently missed couple of places, like >> >>> deassign_device is not protected by pcidevs_lock anymore. >>> deassign_device accesses a number of pdev fields, including quarantine, >>> phantom_stride and fault.count. >>> >>> deassign_device could run at the same time as assign_device who sets >>> quarantine and other fields. >>> >> >> I hope this is all, but problem is that PCI subsystem is old, large and >> complex. Fo example, as I wrote earlier, there are places that are >> protected with pcidevs_lock(), but do nothing with PCI. I just don't >> know what to do with such places. I have a hope that x86 maintainers >> would review my changes and give feedback on missed spots. > > At the risk of it sounding unfair, at least initially: While review may > spot issues, you will want to keep in mind that none of the people who > originally wrote that code are around anymore. And even if they were, > it would be uncertain - just like for the x86 maintainers - that they > would recall (if they were aware at some time in the first place) all > the corner cases. Therefore I'm afraid that proving correctness and > safety of the proposed transformations can only be done by properly > auditing all involved code paths. Yet that's something that imo wants > to already have been done by the time patches are submitted for review. > Reviewers would then "merely" (hard enough perhaps) check the results > of that audit. > > I might guess that this locking situation is one of the reasons why > Andrew in particular thinks (afaik) that the IOMMU code we have would > better be re-written almost from scratch. I assume it's clear to him > (it certainly is to me) that this is something that could only be > expected to happen in an ideal work: I see no-one taking on such an > exercise. We already have too little bandwidth. The more I dig into IOMMU code, the more I agree with Andrew. I can't see how current PCI locking can be untangled in the IOMMU code. There are just too many moving parts. I tried to play with static code analysis tools, but I haven't find anything that can reliably analyze locking in Xen. I even tried to write own tool tailored specifically for PCI locking analysis. While it works on some synthetic tests, there is too much work to support actual Xen code. I am not able to rework x86 IOMMU code. So, I am inclined to drop this patch series at all. My current plan is to take minimal refcounting from this series to satisfy your comments for "vpci: use pcidevs locking to protect MMIO handlers". -- WBR, Volodymyr

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.