[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v7 01/12] vpci: introduce per-domain lock to protect vpci structure


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Date: Sun, 9 Jul 2023 22:41:35 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=6aVYmq3QfeoquBHeNsvcVznrvM5fa/h/ux57WUL+Zb4=; b=Hanas4I0pCJBg4in52sPLMBMwoP6nN6BDacnaiASrktDnG37jvea4WvDNwOR6Uh5OGLIRPK8s5zYmpSB4D8oxiWAsHTgzHpp/3PXXEAiG1LAUShNfUSyaegsk8s4mAecq7RHqsE+5+pn0eT7jbGbqQkMFNBgL7G77CropG7nthSLEbCSTpN+KL+TAUz0WCFvotxQz66HrZ0QIQrFHx6n4+LPxI8VUGOtcdhWg3ioeTmIRRHRyCJD4VWeP9wy1/sbPGNOjRAAiSg2En/Lviadc1I2QiahwHfoUfy9V2RnnRiXDdhNcOqnwvzYS5Jw7RQ3fYqlIUn4xQDQrWkt339vQQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=ZYdRKwNmKTRwOJkBUYA896IQRLc1T1kpA2lrTW3hB+0bW565NDMno9bEueM+taMq6wcyLdXzk/punzL1sGBCgJmSKKUQcwBRmtzhGV9nhgb8a4tFejwPN9FzWY2132FRWMtbtTmhzhOOGEPc4g2y4MmZLsd4RbuwYbsir9qkRdkQpzsovJQs2a+boJxn0Hx62faTtpD34hmMHG+g5ALRVX7VTgmJXbq6+SYCkEYQh+dzbpEAJJriqzr2W6/AXqvYMXviCdeAhvrQjCgpT/7C6vEEQeiXKJ0oVyCQOs/LAzhKVn+zttPtfXC79YxbQ90ZKMtQMVKiHyt4SmUcnKTdIw==
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Sun, 09 Jul 2023 22:42:18 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHZneJZdDE+HOoSO0y6VB9l0x1hDq+No9OAgAgqUACAALl2AIAAzUWAgADPAYCAEhJlgIAArbcAgAdCCQA=
  • Thread-topic: [PATCH v7 01/12] vpci: introduce per-domain lock to protect vpci structure

Hi Jan,

Jan Beulich <jbeulich@xxxxxxxx> writes:

> On 04.07.2023 23:03, Volodymyr Babchuk wrote:
>> I am currently implementing your proposal (along with Jan's
>> suggestions), but I am facing ABBA deadlock with IOMMU's
>> reassign_device() call, which has this piece of code:
>> 
>>         list_move(&pdev->domain_list, &target->pdev_list);
>> 
>> My immediate change was:
>> 
>>         write_lock(&pdev->domain->pci_lock);
>>         list_del(&pdev->domain_list);
>>         write_unlock(&pdev->domain->pci_lock);
>> 
>>         write_lock(&target->pci_lock);
>>         list_add(&pdev->domain_list, &target->pdev_list);
>>         write_unlock(&target->pci_lock);
>> 
>> But this will not work because reassign_device is called from
>> pci_release_devices() which iterates over d->pdev_list, so we need to
>> take a d->pci_lock early.
>> 
>> Any suggestions on how to fix this? My idea is to remove a device from a
>> list one at time:
>> 
>> int pci_release_devices(struct domain *d)
>> {
>>     struct pci_dev *pdev;
>>     u8 bus, devfn;
>>     int ret;
>> 
>>     pcidevs_lock();
>>     write_lock(&d->pci_lock);
>>     ret = arch_pci_clean_pirqs(d);
>>     if ( ret )
>>     {
>>         pcidevs_unlock();
>>         write_unlock(&d->pci_lock);
>>         return ret;
>>     }
>> 
>>     while ( !list_empty(&d->pdev_list) )
>>     {
>>         pdev = list_entry(&d->pdev_list, struct pci_dev, domain_list);
>>         bus = pdev->bus;
>>         devfn = pdev->devfn;
>>         list_del(&pdev->domain_list);
>>         write_unlock(&d->pci_lock);
>>         ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret;
>>         write_lock(&d->pci_lock);
>
> I think it needs doing almost like this, but with two more tweaks and
> no list_del() right here (first and foremost to avoid needing to
> figure whether removing early isn't going to subtly break anything;
> see below for an error case that would end up with changed behavior):
>
>     while ( !list_empty(&d->pdev_list) )
>     {
>         const struct pci_dev *pdev = list_first_entry(&d->pdev_list, struct 
> pci_dev, domain_list);
>         uint16_t seg = pdev->seg;
>         uint8_t bus = pdev->bus;
>         uint8_t devfn = pdev->devfn;
>
>         write_unlock(&d->pci_lock);
>         ret = deassign_device(d, seg, bus, devfn) ?: ret;
>         write_lock(&d->pci_lock);
>     }
>
> One caveat though: The original list_for_each_entry_safe() guarantees
> the loop to complete; your use of list_del() would guarantee that too,
> but then the device wouldn't be on the list anymore if deassign_device()
> failed. Therefore I guess you will need another local list where you
> (temporarily) put all the devices which deassign_device() left on the
> list, and which you would then move back to d->pdev_list after the loop
> has finished.

Okay, taking into the account all that you wrote, I came with this
implementation:

int pci_release_devices(struct domain *d)
{
    int combined_ret;
    LIST_HEAD(failed_pdevs);

    pcidevs_lock();
    write_lock(&d->pci_lock);
    combined_ret = arch_pci_clean_pirqs(d);
    if ( combined_ret )
    {
        pcidevs_unlock();
        write_unlock(&d->pci_lock);
        return combined_ret;
    }

    while ( !list_empty(&d->pdev_list) )
    {
        struct pci_dev *pdev = list_first_entry(&d->pdev_list,
                                                struct pci_dev,
                                                domain_list);
        uint16_t seg = pdev->seg;
        uint8_t bus = pdev->bus;
        uint8_t devfn = pdev->devfn;
        int ret;

        write_unlock(&d->pci_lock);
        ret = deassign_device(d, seg, bus, devfn);
        write_lock(&d->pci_lock);
        if ( ret )
        {
            bool still_present = false;
            const struct pci_dev *tmp;

            for_each_pdev ( d, tmp )
            {
                if ( tmp == (const struct pci_dev*)pdev )
                {
                    still_present = true;
                    break;
                }
            }
            if ( still_present )
                list_move(&pdev->domain_list, &failed_pdevs);
            combined_ret = ret;
        }
    }

    list_splice(&failed_pdevs, &d->pdev_list);
    write_unlock(&d->pci_lock);
    pcidevs_unlock();

    return combined_ret;
}


> (Whether it is sufficient to inspect the list head to
> determine whether the pdev is still on the list needs careful checking.)

I am not sure that this is enough. We dropped d->pci_lock so we can
expect that the list was permutated in a random way.


-- 
WBR, Volodymyr


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.