
Re: [PATCH v6 03/13] vpci: move lock outside of struct vpci


  • To: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 7 Feb 2022 13:46:57 +0100
  • Cc: Jan Beulich <jbeulich@xxxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>, "george.dunlap@xxxxxxxxxx" <george.dunlap@xxxxxxxxxx>, "paul@xxxxxxx" <paul@xxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 07 Feb 2022 12:47:21 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Feb 07, 2022 at 11:08:39AM +0000, Oleksandr Andrushchenko wrote:
> Hello,
> 
> On 04.02.22 16:57, Roger Pau Monné wrote:
> > On Fri, Feb 04, 2022 at 02:43:07PM +0000, Oleksandr Andrushchenko wrote:
> >>
> >> On 04.02.22 15:06, Roger Pau Monné wrote:
> >>> On Fri, Feb 04, 2022 at 12:53:20PM +0000, Oleksandr Andrushchenko wrote:
> >>>> On 04.02.22 14:47, Jan Beulich wrote:
> >>>>> On 04.02.2022 13:37, Oleksandr Andrushchenko wrote:
> >>>>>> On 04.02.22 13:37, Jan Beulich wrote:
> >>>>>>> On 04.02.2022 12:13, Roger Pau Monné wrote:
> >>>>>>>> On Fri, Feb 04, 2022 at 11:49:18AM +0100, Jan Beulich wrote:
> >>>>>>>>> On 04.02.2022 11:12, Oleksandr Andrushchenko wrote:
> >>>>>>>>>> On 04.02.22 11:15, Jan Beulich wrote:
> >>>>>>>>>>> On 04.02.2022 09:58, Oleksandr Andrushchenko wrote:
> >>>>>>>>>>>> On 04.02.22 09:52, Jan Beulich wrote:
> >>>>>>>>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
> >>>>>>>>>>>>>> @@ -285,6 +286,12 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> >>>>>>>>>>>>>>                  continue;
> >>>>>>>>>>>>>>          }
> >>>>>>>>>>>>>>  
> >>>>>>>>>>>>>> +        spin_lock(&tmp->vpci_lock);
> >>>>>>>>>>>>>> +        if ( !tmp->vpci )
> >>>>>>>>>>>>>> +        {
> >>>>>>>>>>>>>> +            spin_unlock(&tmp->vpci_lock);
> >>>>>>>>>>>>>> +            continue;
> >>>>>>>>>>>>>> +        }
> >>>>>>>>>>>>>>          for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
> >>>>>>>>>>>>>>          {
> >>>>>>>>>>>>>>              const struct vpci_bar *bar = &tmp->vpci->header.bars[i];
> >>>>>>>>>>>>>> @@ -303,12 +310,14 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> >>>>>>>>>>>>>>              rc = rangeset_remove_range(mem, start, end);
> >>>>>>>>>>>>>>              if ( rc )
> >>>>>>>>>>>>>>              {
> >>>>>>>>>>>>>> +                spin_unlock(&tmp->vpci_lock);
> >>>>>>>>>>>>>>                  printk(XENLOG_G_WARNING "Failed to remove [%lx, %lx]: %d\n",
> >>>>>>>>>>>>>>                         start, end, rc);
> >>>>>>>>>>>>>>                  rangeset_destroy(mem);
> >>>>>>>>>>>>>>                  return rc;
> >>>>>>>>>>>>>>              }
> >>>>>>>>>>>>>>          }
> >>>>>>>>>>>>>> +        spin_unlock(&tmp->vpci_lock);
> >>>>>>>>>>>>>>      }
> >>>>>>>>>>>>> At the first glance this simply looks like another
> >>>>>>>>>>>>> unjustified (in the description) change, as you're not
> >>>>>>>>>>>>> converting anything here but you actually add locking (and I
> >>>>>>>>>>>>> realize this was there before, so I'm sorry for not pointing
> >>>>>>>>>>>>> this out earlier).
> >>>>>>>>>>>> Well, I thought that the description already has "...the lock
> >>>>>>>>>>>> can be used (and in a few cases is used right away) to check
> >>>>>>>>>>>> whether vpci is present" and this is enough for such uses as
> >>>>>>>>>>>> here.
> >>>>>>>>>>>>> But then I wonder whether you actually tested this, since I
> >>>>>>>>>>>>> can't help getting the impression that you're introducing a
> >>>>>>>>>>>>> live-lock: The function is called from cmd_write() and
> >>>>>>>>>>>>> rom_write(), which in turn are called out of vpci_write().
> >>>>>>>>>>>>> Yet that function already holds the lock, and the lock is not
> >>>>>>>>>>>>> (currently) recursive. (For the 3rd caller of the function -
> >>>>>>>>>>>>> init_bars() - otoh the locking looks to be entirely
> >>>>>>>>>>>>> unnecessary.)
> >>>>>>>>>>>> Well, you are correct: if tmp != pdev then it is correct to
> >>>>>>>>>>>> acquire the lock. But if tmp == pdev and rom_only == true
> >>>>>>>>>>>> then we'll deadlock.
> >>>>>>>>>>>>
> >>>>>>>>>>>> It seems we need to have the locking conditional, e.g. only
> >>>>>>>>>>>> lock if tmp != pdev
> >>>>>>>>>>> Which will address the live-lock, but introduce ABBA deadlock
> >>>>>>>>>>> potential between the two locks.
> >>>>>>>>>> I am not sure I can suggest a better solution here
> >>>>>>>>>> @Roger, @Jan, could you please help here?
> >>>>>>>>> Well, first of all I'd like to mention that while it may have
> >>>>>>>>> been okay to not hold pcidevs_lock here for Dom0, it surely needs
> >>>>>>>>> acquiring when dealing with DomU-s' lists of PCI devices. The
> >>>>>>>>> requirement really applies to the other use of for_each_pdev() as
> >>>>>>>>> well (in vpci_dump_msi()), except that there it probably wants to
> >>>>>>>>> be a try-lock.
> >>>>>>>>>
> >>>>>>>>> Next I'd like to point out that here we have the still pending
> >>>>>>>>> issue of how to deal with hidden devices, which Dom0 can access.
> >>>>>>>>> See my RFC patch "vPCI: account for hidden devices in
> >>>>>>>>> modify_bars()". Whatever the solution here, I think it wants to at
> >>>>>>>>> least account for the extra need there.
> >>>>>>>> Yes, sorry, I should take care of that.
> >>>>>>>>
> >>>>>>>>> Now it is quite clear that pcidevs_lock isn't going to help with
> >>>>>>>>> avoiding the deadlock, as it's imo not an option at all to acquire
> >>>>>>>>> that lock everywhere else you access ->vpci (or else the vpci lock
> >>>>>>>>> itself would be pointless). But a per-domain auxiliary r/w lock
> >>>>>>>>> may help: Other paths would acquire it in read mode, and here
> >>>>>>>>> you'd acquire it in write mode (in the former case around the vpci
> >>>>>>>>> lock, while in the latter case there may then not be any need to
> >>>>>>>>> acquire the individual vpci locks at all). FTAOD: I haven't fully
> >>>>>>>>> thought through all implications (and hence whether this is viable
> >>>>>>>>> in the first place); I expect you will, documenting what you've
> >>>>>>>>> found in the resulting patch description. Of course the double
> >>>>>>>>> lock acquire/release would then likely want hiding in helper
> >>>>>>>>> functions.
> >>>>>>>> I've been also thinking about this, and whether it's really worth
> >>>>>>>> to have a per-device lock rather than a per-domain one that
> >>>>>>>> protects all vpci regions of the devices assigned to the domain.
> >>>>>>>>
> >>>>>>>> The OS is likely to serialize accesses to the PCI config space
> >>>>>>>> anyway, and the only place I could see a benefit of having
> >>>>>>>> per-device locks is in the handling of MSI-X tables, as the
> >>>>>>>> handling of the mask bit is likely very performance sensitive, so
> >>>>>>>> adding a per-domain lock there could be a bottleneck.
> >>>>>>> Hmm, with method 1 accesses serializing globally is basically
> >>>>>>> unavoidable, but with MMCFG I see no reason why OSes may not (move
> >>>>>>> to) permit(ting) parallel accesses, with serialization perhaps done
> >>>>>>> only at device level. See our own pci_config_lock, which applies to
> >>>>>>> only method 1 accesses; we don't look to be serializing MMCFG
> >>>>>>> accesses at all.
> >>>>>>>
> >>>>>>>> We could alternatively do a per-domain rwlock for vpci and special
> >>>>>>>> case the MSI-X area to also have a per-device specific lock. At
> >>>>>>>> which point it becomes fairly similar to what you propose.
> >>>>>> @Jan, @Roger
> >>>>>>
> >>>>>> 1. d->vpci_lock - rwlock <- this protects vpci
> >>>>>> 2. pdev->vpci->msix_tbl_lock - rwlock <- this protects MSI-X tables
> >>>>>> or should it better be pdev->msix_tbl_lock as MSI-X tables don't
> >>>>>> really depend on vPCI?
> >>>>> If so, perhaps indeed better the latter. But as said in reply to Roger,
> >>>>> I'm not convinced (yet) that doing away with the per-device lock is a
> >>>>> good move. As said there - we're ourselves doing fully parallel MMCFG
> >>>>> accesses, so OSes ought to be fine to do so, too.
> >>>> But with pdev->vpci_lock we face ABBA...
> >>> I think it would be easier to start with a per-domain rwlock that
> >>> guarantees pdev->vpci cannot be removed under our feet. This would be
> >>> taken in read mode in vpci_{read,write} and in write mode when
> >>> removing a device from a domain.
> >>>
> >>> Then there are also other issues regarding vPCI locking that need to
> >>> be fixed, but that lock would likely be a start.
> >> Or let's see the problem at a different angle: this is the only place
> >> which breaks the use of pdev->vpci_lock. Because all other places
> >> do not try to acquire the lock of any two devices at a time.
> >> So, what if we re-work the offending piece of code instead?
> >> That way we do not break parallel access and have the lock per-device
> >> which might also be a plus.
> >>
> >> By re-work I mean, that instead of reading already mapped regions
> >> from tmp we can employ a d->pci_mapped_regions range set which
> >> will hold all the already mapped ranges. And when it is needed to access
> >> that range set we use pcidevs_lock which seems to be rare.
> >> So, modify_bars will rely on pdev->vpci_lock + pcidevs_lock and
> >> ABBA won't be possible at all.
> > Sadly that won't replace the usage of the loop in modify_bars. This is
> > not (exclusively) done in order to prevent mapping the same region
> > multiple times, but rather to prevent unmapping of regions as long as
> > there's an enabled BAR that's using it.
> >
> > If you wanted to use something like d->pci_mapped_regions it would
> > have to keep reference counts to regions, in order to know when a
> > mapping is no longer required by any BAR on the system with memory
> > decoding enabled.
> I missed this path, thank you
> 
> I tried to analyze the locking in pci/vpci.
> 
> First of all, some context to refresh the goal we are after:
> the rationale behind moving pdev->vpci->lock outside
> is to be able to dynamically create and destroy pdev->vpci.
> So, for that reason, the lock needs to be moved outside of pdev->vpci.
> 
> Some of the callers of the vPCI code and locking used:
> 
> ======================================
> vpci_mmio_read/vpci_mmcfg_read
> ======================================
>    - vpci_ecam_read
>    - vpci_read
>     !!!!!!!! pdev is acquired, then pdev->vpci_lock is used !!!!!!!!
>     - msix:
>      - control_read
>     - header:
>      - guest_bar_read
>     - msi:
>      - control_read
>      - address_read/address_hi_read
>      - data_read
>      - mask_read
> 
> ======================================
> vpci_mmio_write/vpci_mmcfg_write
> ======================================
>    - vpci_ecam_write
>    - vpci_write
>     !!!!!!!! pdev is acquired, then pdev->vpci_lock is used !!!!!!!!
>     - msix:
>      - control_write
>     - header:
>      - bar_write/guest_bar_write
>      - cmd_write/guest_cmd_write
>      - rom_write
>       - all write handlers may call modify_bars
>        modify_bars
>     - msi:
>      - control_write
>      - address_write/address_hi_write
>      - data_write
>      - mask_write
> 
> ======================================
> pci_add_device: locked with pcidevs_lock
> ======================================
>    - vpci_add_handlers
>     ++++++++ pdev->vpci_lock is used ++++++++
> 
> ======================================
> pci_remove_device: locked with pcidevs_lock
> ======================================
> - vpci_remove_device
>    ++++++++ pdev->vpci_lock is used ++++++++
> - pci_cleanup_msi
> - free_pdev
> 
> ======================================
> XEN_DOMCTL_assign_device: locked with pcidevs_lock
> ======================================
> - assign_device
>   - vpci_deassign_device
>   - pdev_msix_assign
>   - vpci_assign_device
>    - vpci_add_handlers
>      ++++++++ pdev->vpci_lock is used ++++++++
> 
> ======================================
> XEN_DOMCTL_deassign_device: locked with pcidevs_lock
> ======================================
> - deassign_device
>   - vpci_deassign_device
>     ++++++++ pdev->vpci_lock is used ++++++++
>    - vpci_remove_device
> 
> 
> ======================================
> modify_bars is a special case: it is the only function which tries to lock
> two pci_dev devices at a time, in order to check for overlaps with other
> BARs which may already have been mapped or unmapped.
>
> So, this is the only case which may deadlock because of pci_dev->vpci_lock.
> ======================================
> 
> Bottom line:
> ======================================
> 
> 1. vpci_{read|write} are not protected by pcidevs_lock and can run in
> parallel with pci_remove_device, which can remove pdev after vpci_{read|write}
> has acquired the pdev pointer. This may lead to a use-after-free when pdev is
> dereferenced.
>
> So, to protect the pdev dereference, vpci_{read|write} must also use
> pcidevs_lock.

We would like to take the pcidevs_lock only while fetching the device
(ie: pci_get_pdev_by_domain); afterwards it should be fine to lock the
device using a vpci-specific lock, so that calls to vpci_{read,write} can
be partially concurrent across multiple domains.
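
For illustration, a rough sketch of what the read path could look like
under that scheme (untested; vpci_read_locked() is just a made-up
placeholder for the existing per-register handler dispatch, and
pdev->vpci_lock is the per-device lock proposed by this series):

uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
{
    struct pci_dev *pdev;
    uint32_t data;

    /* Hold the global lock only while translating SBDF -> pdev. */
    pcidevs_lock();
    pdev = pci_get_pdev_by_domain(current->domain, sbdf.seg, sbdf.bus,
                                  sbdf.devfn);
    if ( pdev )
        spin_lock(&pdev->vpci_lock);
    pcidevs_unlock();

    if ( !pdev || !pdev->vpci )
    {
        if ( pdev )
            spin_unlock(&pdev->vpci_lock);
        /* No vPCI state: fall back to a plain hardware access. */
        return vpci_read_hw(sbdf, reg, size);
    }

    /* Placeholder for the existing handler lookup and read logic. */
    data = vpci_read_locked(pdev, reg, size);

    spin_unlock(&pdev->vpci_lock);

    return data;
}

That way accesses to different devices only contend on the global lock for
the duration of the lookup.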

In fact I think Jan had already pointed out that the pcidevs lock would
need to be taken while searching for the device in vpci_{read,write}.

It seems to me that if you implement option 3 below, taking the
per-domain rwlock in read mode in vpci_{read|write} will already
protect you from the device being removed, as long as the same lock
is taken in write mode in vpci_remove_device.
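
As a sketch only (assuming a new per-domain rwlock, named d->vpci_rwlock
here purely for illustration, and with vpci_read_one() standing in for the
existing handler dispatch):

uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
{
    struct domain *d = current->domain;
    const struct pci_dev *pdev;
    uint32_t data = ~(uint32_t)0;

    /* Readers only need to keep pdev->vpci from being freed under them. */
    read_lock(&d->vpci_rwlock);

    /* NB: the lookup itself may still want the pcidevs lock - see above. */
    pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn);
    if ( pdev && pdev->vpci )
        data = vpci_read_one(pdev, reg, size);

    read_unlock(&d->vpci_rwlock);

    /* (Fallback to a hardware access for unhandled cases omitted.) */
    return data;
}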

> 2. The only offending place which stands in the way of pci_dev->vpci_lock is
> modify_bars. If it can be re-worked to track already mapped and unmapped
> regions, then we can avoid the possible deadlock and can use
> pci_dev->vpci_lock (rangesets won't help here as we also need refcounting to
> be implemented).

I think a refcounting-based solution will be very complex to
implement. I'm however happy to be proven wrong.
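
Just to illustrate the kind of state such an approach would need to carry
(purely illustrative, not proposed code):

/* Per-domain tracking of p2m mappings established on behalf of BARs. */
struct vpci_mapped_region {
    struct list_head node;
    unsigned long start_gfn;
    unsigned long nr_pages;
    unsigned int refcnt;    /* enabled BARs still relying on this range */
};

Every BAR map/unmap (and device removal) would have to split, merge and
re-count such entries to handle partial overlaps, which is where I expect
most of the complexity to be.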

> If pcidevs_lock is used for vpci_{read|write} then no deadlock is possible,
> but the modify_bars code must be re-worked not to deadlock against itself
> (pdev->vpci_lock and tmp->vpci_lock when pdev == tmp; this is minor).

Taking the pcidevs lock (a global lock) is out of the picture IMO, as
it would serialize all calls of vpci_{read|write} and create too much
contention on that lock.

> 3. We may think about a per-domain rwlock combined with pdev->vpci_lock;
> this solves modify_bars's access to two pdevs. But it doesn't solve the
> possible pdev de-reference in vpci_{read|write} racing with pci_remove_device.

pci_remove_device will call vpci_remove_device, so as long as
vpci_remove_device takes the per-domain lock in write (exclusive) mode
it should be fine.
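
Roughly (again assuming the illustrative d->vpci_rwlock from the sketch
above; only the final free is shown):

void vpci_remove_device(struct pci_dev *pdev)
{
    struct vpci *vpci;

    /* Exclusive mode: no vpci_{read,write} can be using pdev->vpci now. */
    write_lock(&pdev->domain->vpci_rwlock);
    vpci = pdev->vpci;
    pdev->vpci = NULL;
    write_unlock(&pdev->domain->vpci_rwlock);

    if ( !vpci )
        return;

    /* Remaining teardown (handlers, MSI/MSI-X state) can now happen safely. */
    xfree(vpci);
}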

> @Roger, @Jan, I would like to hear what you think about the above analysis
> and how we can proceed with the locking re-work.

I think the per-domain rwlock seems like a good option. I would do
that as a pre-patch.

Thanks, Roger.