
Re: [PATCH v2] vpci: introduce per-domain lock to protect vpci structure


  • To: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 15 Feb 2022 12:39:31 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "jbeulich@xxxxxxxx" <jbeulich@xxxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>
  • Delivery-date: Tue, 15 Feb 2022 11:39:51 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Feb 15, 2022 at 11:12:23AM +0000, Oleksandr Andrushchenko wrote:
> 
> 
> On 15.02.22 12:48, Roger Pau Monné wrote:
> > On Tue, Feb 15, 2022 at 10:11:35AM +0200, Oleksandr Andrushchenko wrote:
> >> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> >>
> >> Introduce a per-domain read/write lock to check whether vpci is present,
> >> so we can be sure there are no accesses to the contents of the vpci
> >> struct when it is not. This lock can be used (and in a few cases is used
> >> right away) so that vpci removal can be performed while holding the lock
> >> in write mode. Previously such removal could race with vpci_read, for
> >> example.
> >>
> >> 1. The per-domain vpci_rwlock is used to protect the pdev->vpci
> >> structure from being removed.
> >>
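For reference, the per-domain lock above would presumably live in struct
domain, roughly like this (a sketch only, using the vpci_rwlock name from
the description above):

    struct domain {
        ...
    #ifdef CONFIG_HAS_VPCI
        /* Protects pdev->vpci from being removed under readers. */
        rwlock_t vpci_rwlock;
    #endif
        ...
    };
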
> >> 2. Writing the command register and ROM BAR register may trigger
> >> modify_bars to run, which in turn may access multiple pdevs while
> >> checking for the existing BAR's overlap. The overlapping check, if done
> >> under the read lock, requires vpci->lock to be acquired on both devices
> >> being compared, which may produce a deadlock. It is not possible to
> >> upgrade read lock to write lock in such a case. So, in order to prevent
> >> the deadlock, check which registers are going to be written and acquire
> >> the lock in the appropriate mode from the beginning.
> >>
> >> All other code, which doesn't lead to pdev->vpci destruction and does not
> >> access multiple pdevs at the same time, can still use a combination of the
> >> read lock and pdev->vpci->lock.
> >>
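To illustrate the lock mode selection described in 2. above, the write
path could look roughly like this (a sketch only; overlaps() is a
hypothetical helper, rom_reg is the cached offset from 3. below, and the
sketch glosses over when rom_reg itself can be read safely):

    static bool overlaps(unsigned int r, unsigned int sz,
                         unsigned int reg, unsigned int reg_sz)
    {
        return r < reg + reg_sz && reg < r + sz;
    }

    /* Writes hitting the command register or the ROM BAR may trigger
     * modify_bars(), which inspects other pdevs, so take the per-domain
     * lock in write mode from the beginning. */
    if ( overlaps(reg, size, PCI_COMMAND, 2) ||
         overlaps(reg, size, pdev->vpci->header.rom_reg, 4) )
        write_lock(&d->vpci_rwlock);
    else
        read_lock(&d->vpci_rwlock);
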
> >> 3. Optimize the detection of whether a ROM BAR write requires the write
> >> lock by caching the offset of the ROM BAR register in
> >> vpci->header->rom_reg, which depends on the header's type.
> >>
> >> 4. Reduce the locked region in vpci_remove_device, as it is now possible
> >> to set pdev->vpci to NULL early, right after the write lock is acquired.
> >>
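A sketch of what 4. amounts to (assumed shape, not the exact patch):

    write_lock(&pdev->domain->vpci_rwlock);
    vpci = pdev->vpci;
    pdev->vpci = NULL;   /* new readers now see no vpci */
    write_unlock(&pdev->domain->vpci_rwlock);

    /* Teardown of the detached vpci proceeds without the lock held. */
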
> >> 5. Reduce the locked region in vpci_add_handlers, as it is possible to
> >> initialize many more fields of the struct vpci before assigning it to
> >> pdev->vpci.
> >>
> >> 6. vpci_{add|remove}_register are required to be called with the write
> >> lock held, but it is not feasible to add an assert there, as that would
> >> require passing struct domain to them. So, add a comment about this
> >> requirement to these and other functions with equivalent constraints.
> >>
> >> 7. Drop the const qualifier where the new rwlock is used and this is
> >> appropriate.
> >>
> >> 8. Do not call process_pending_softirqs with any locks held. For that,
> >> unlock prior to the call and re-acquire the locks afterwards. After
> >> re-acquiring the lock there is no need to check whether pdev->vpci
> >> exists:
> >>   - in apply_map, because of the context it is called in (no race
> >>     condition possible)
> >>   - for the MSI/MSI-X debug code, because it runs at the end of the
> >>     pdev->vpci access and no further access to pdev->vpci is made
> >>
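The pattern around the preemption point in 8. would then be roughly (a
sketch under the locking scheme of this patch):

    spin_unlock(&pdev->vpci->lock);
    read_unlock(&d->vpci_rwlock);
    process_pending_softirqs();
    read_lock(&d->vpci_rwlock);
    spin_lock(&pdev->vpci->lock);
    /* Per the description, no pdev->vpci re-check is needed at the two
     * call sites (apply_map and the MSI/MSI-X dump code). */
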
> >> 9. Check for !pdev->vpci in vpci_{read|write} after acquiring the lock
> >> and, if so, allow reading or writing the hardware register directly.
> >> This is acceptable as we only deal with Dom0 as of now. Once DomU
> >> support is added, writes will need to be ignored and reads will need to
> >> return all 0's for guests, while Dom0 can still access the registers
> >> directly.
> >>
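I.e. for 9. something along these lines in vpci_read (a sketch;
vpci_read_hw is the existing helper that forwards the access to the
hardware):

    read_lock(&d->vpci_rwlock);
    if ( !pdev->vpci )
    {
        read_unlock(&d->vpci_rwlock);
        /* Dom0 only for now: pass the access through to the device. */
        return vpci_read_hw(sbdf, reg, size);
    }
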
> >> 10. Introduce pcidevs_trylock, so it is possible to try acquiring the
> >> pcidevs lock.
> >>
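Presumably something like the following next to the existing helpers (a
sketch; the pcidevs lock is currently a recursive spinlock):

    #define pcidevs_trylock() spin_trylock_recursive(&_pcidevs_lock)
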
> >> 11. Use the pcidevs lock around for_each_pdev and pci_get_pdev_by_domain
> >> while accessing pdevs in vpci code.
> > So if you use the pcidevs_lock then it's impossible for the pdev or
> > pdev->vpci to be removed or recreated, as the pcidevs lock protects
> > any device operations (add, remove, assign, deassign).
> >
> > It's however not OK to use the pcidevs lock in vpci_{read,write}
> > as-is, as the introduced contention is IMO not acceptable.
> >
> > The only viable option I see here is to:
> >
> >   1. Make the pcidevs lock a rwlock: switch current callers to take the
> >      lock in write mode, detect and fixup any issues that could arise
> >      from the lock not being recursive anymore.
> >   2. Take the lock in read mode around vpci_{read,write} sections that
> >      rely on pdev (including the handlers).
> >
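As a rough sketch of 1. and 2. (names hypothetical; note the conversion
loses the recursive property of the current lock, hence the fixup work
in 1.):

    static DEFINE_RWLOCK(_pcidevs_rwlock);

    #define pcidevs_write_lock()   write_lock(&_pcidevs_rwlock)
    #define pcidevs_write_unlock() write_unlock(&_pcidevs_rwlock)
    #define pcidevs_read_lock()    read_lock(&_pcidevs_rwlock)
    #define pcidevs_read_unlock()  read_unlock(&_pcidevs_rwlock)

    /* vpci_{read,write} would then only need the lock in read mode: */
    pcidevs_read_lock();
    pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn);
    if ( pdev && pdev->vpci )
        /* dispatch to the handlers, under pdev->vpci->lock */;
    pcidevs_read_unlock();
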
> > These items should be at least two separate patches. Let's not mix the
> > conversion of pcidevs locks with the addition of vPCI support.
> >
> > I think with that we could get away without requiring a per-domain
> > rwlock? Just doing lock ordering in modify_bars regarding
> > tmp->vpci->lock vs pdev->vpci->lock. Neither pdev or vpci can go away
> > while holding the pcidevs lock.
> >
> > Sorting the situation in modify_bars should also be done as a separate
> > patch on top of 1. and 2.
> So, to make it crystal clear: we can do away with the locking scheme in
> this patch and instead need to convert the pcidevs lock into a rwlock.
> Meaning that I need to drop this patch.
> 
> Then, 3 patches to follow:
> 1. pcidevs as rwlock
> 2. vpci_{read|write} and the rest using new pcidevs rwlock
> 3. lock ordering in modify_bars
> 
> Is it what we want?

Likely? The current approach of the per-domain rwlock still leaves us
with a window between pci_get_pdev_by_domain and taking that lock
where the device could be removed.
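I.e. the problematic window is:

    pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn);
    /* <- nothing is held here, pci_remove_device() can free pdev */
    read_lock(&d->vpci_rwlock);   /* too late, pdev may already be stale */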

We also need a safe way to use pci_get_pdev_by_domain without the
devices being removed while using them, so it would seem we need the
pcidevs lock anyway, in which case it seems possible to avoid having
to introduce a per-domain rwlock.

I'm happy with any approach that solves the issues we have at hand,
but this proposal has a fundamental flaw of leaving a window after
pci_get_pdev_by_domain where the device could be removed. I'm OK to
have this fixed in a different way if there's one, but if the pcidevs
lock is used in vpci_{read,write} it needs to be converted into a
rwlock.

Thanks, Roger.
