[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v7 01/12] vpci: introduce per-domain lock to protect vpci structure


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 5 Jul 2023 10:59:04 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=vG5yjm9LnzT5cClrs/a/AivTKV0lSsBI0fIE2N1cEy8=; b=PB3aAldRPgOpPN1v8IxxVPr7U/GGeUahz2nxsE8fd6TKf2QAoTSKU56x8lUG4jNNU77Hat7iiywymtAzNnZY1IIDZMD6FIMGM0NSfNj6kqO56Jy7SiIvMxje3nU1MpGRz+ujMgWty6zc/BRqYsMvMZKth8dOpMcbhCjRXFLWCEveQgKK+7qtImejijQ515dTQlqZajHUQtDERdmgZ9eU3dNV4ZKjzogZL+xo4zAXPrP8Np8+sG5SrDIKX+RVQJqqedCyeRZo/WNZsnAlytlqlrl5pXOE9LAcmFt6ttx3AX8FYqx6ea45Gorenh7RB6OI7WVRmCuAGWofQ7htjgV/oA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=ZB4PK9ZNgQSx/o7fYZHTzH7L0eTaUdUQaWZ4RGDseuAC3WRW7oHKxamDsFgo/yLUIyzAVKgR9IUYF4VlEbqwsFXPmBGgR/w9Pc+cy702rHJeC16Xi01fo34qLOHl4xpVZJVXkoayQhUNlQNgcfjS+cDuCbGanOgU8nJ6VSv2AETwmzcX0870xH16xV+s/X+LOaj6qvUgxIdVijQp8Q1svOmiOBy5UQPZebwEcWYMcLiPYEabWs+31UTPeP7+sHNwkDp3sNzgim5HPeMSaI/+rLLVuT4T4sEWIgbMcGBkBydHAeKWOdQ+Ix5KLMjkpcG8NhPG0Ucjck9IjBCOjOnU4w==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Wed, 05 Jul 2023 08:59:21 +0000
  • Ironport-data: A9a23:Yp8Sdq3exf7+DMztMvbD5UFwkn2cJEfYwER7XKvMYLTBsI5bpzdSx 2FJDDyPPP2IZ2X9e9hyaY2z/UkAvJHQzNNjHgFqpC1hF35El5HIVI+TRqvS04F+DeWYFR46s J9OAjXkBJppJpMJjk71atANlVEliefTAOK6ULWeUsxIbVcMYD87jh5+kPIOjIdtgNyoayuAo tq3qMDEULOf82cc3lk8teTb8XuDgNyo4GlD5gNmOKgS1LPjvyJ94Kw3dPnZw0TQGuG4LsbiL 87fwbew+H/u/htFIrtJRZ6iLyXm6paLVeS/oiI+t5qK23CulQRrukoPD9IOaF8/ttm8t4sZJ OOhF3CHYVxB0qXkwIzxWvTDes10FfUuFLTveRBTvSEPpqFvnrSFL/hGVSkL0YMkFulfKz5ip dMDCWs2awmj2eCMy56AeOB9v5F2RCXrFNt3VnBI6xj8VK5ja7acBqLA6JlfwSs6gd1IEbDGf c0FZDFzbRPGJRpSJlMQD5F4l+Ct7pX9W2QA9BTJ+uxouy6KlFYZPLvFabI5fvSQQspYhACAr 3/u9GXlGBAKcteYzFJp91r137KVx3qiCd96+LuQ3fhWmQyInVwpVRhOFmm9hfL+12S0cocKQ 6AT0m90xUQoz2SpRNTgWxyzoFafowURHdFXFoUS1gaJzabF5heDMUINRDVBdd8Oudc/QHoh0 Vrht8PkA3ljvaOYTVqZ96yItnWiNC4NN2gAaCQYCwwf7LHLh4U+jQnGSNp5J4ezgsfoAjH7w z2Mry8Wiq0aiIgA0KDT1U/DqyKhoN7OVAFdzgfKWmOo6CtpaYjjYJangXDA9upJJoudSliHv VAHltKY4eRICouC/ASVSfgJNKGk4bCCKjK0vLJ0N5wo9jDo8Xj8e4lVuWl6PB0wapxCfiL1a kjOvw8X/IVUIHahca5wZcS2Ftguyq/jU9/iU5g4c+ZzX3S4TyfflAkGWKJa9zqFfJQE+U3nB aqmTA==
  • Ironport-hdrordr: A9a23:18CFgKgA8G2uMG0vp3O01CnHFXBQXuUji2hC6mlwRA09TyVXrb HWoB17726NtN91YhsdcL+7Scy9qB/nhPxICMwqTNSftWrd2VdATrsSibcKqgeIc0bDH6xmtZ uIGJIOb+EYY2IK6/oSIzPVLz/j+rS6GWyT6ts2Bk0CcT1X
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Jul 05, 2023 at 09:11:10AM +0200, Jan Beulich wrote:
> On 04.07.2023 23:03, Volodymyr Babchuk wrote:
> > I am currently implementing your proposal (along with Jan's
> > suggestions), but I am facing ABBA deadlock with IOMMU's
> > reassign_device() call, which has this piece of code:
> > 
> >         list_move(&pdev->domain_list, &target->pdev_list);
> > 
> > My immediate change was:
> > 
> >         write_lock(&pdev->domain->pci_lock);
> >         list_del(&pdev->domain_list);
> >         write_unlock(&pdev->domain->pci_lock);
> > 
> >         write_lock(&target->pci_lock);
> >         list_add(&pdev->domain_list, &target->pdev_list);
> >         write_unlock(&target->pci_lock);
> > 
> > But this will not work because reassign_device is called from
> > pci_release_devices() which iterates over d->pdev_list, so we need to
> > take a d->pci_lock early.
> > 
> > Any suggestions on how to fix this? My idea is to remove a device from a
> > list one at time:
> > 
> > int pci_release_devices(struct domain *d)
> > {
> >     struct pci_dev *pdev;
> >     u8 bus, devfn;
> >     int ret;
> > 
> >     pcidevs_lock();
> >     write_lock(&d->pci_lock);
> >     ret = arch_pci_clean_pirqs(d);
> >     if ( ret )
> >     {
> >         pcidevs_unlock();
> >         write_unlock(&d->pci_lock);
> >         return ret;
> >     }
> > 
> >     while ( !list_empty(&d->pdev_list) )
> >     {
> >         pdev = list_entry(&d->pdev_list, struct pci_dev, domain_list);
> >         bus = pdev->bus;
> >         devfn = pdev->devfn;
> >         list_del(&pdev->domain_list);
> >         write_unlock(&d->pci_lock);
> >         ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret;
> >         write_lock(&d->pci_lock);
> 
> I think it needs doing almost like this, but with two more tweaks and
> no list_del() right here (first and foremost to avoid needing to
> figure whether removing early isn't going to subtly break anything;
> see below for an error case that would end up with changed behavior):
> 
>     while ( !list_empty(&d->pdev_list) )
>     {
>         const struct pci_dev *pdev = list_first_entry(&d->pdev_list, struct 
> pci_dev, domain_list);
>         uint16_t seg = pdev->seg;
>         uint8_t bus = pdev->bus;
>         uint8_t devfn = pdev->devfn;
> 
>         write_unlock(&d->pci_lock);

I think you need to remove the device from the pdev_list before
dropping the lock, or else release could race with other operations.

That's unlikely, but still if the lock is dropped and the routine
needs to operate on the device it is better remove such device from
the domain so other operations cannot get a reference to it.

Otherwise you could modify reassign_device() implementations so they
require the caller to hold the source->pci_lock when calling the
routine, but that's ugly because the lock would need to be dropped in
order to reassign the device from source to target domains.

Another option would be to move the whole d->pdev_list to a local
variable (so that d->pdev_list would be empty) and then iterate over
it without the d->pci_lock.  On failure you would take the lock and
add the failing device back into d->pdev_list.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.