[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 03/13] vpci: move lock outside of struct vpci



Hi, Jan!

On 04.02.22 11:15, Jan Beulich wrote:
> On 04.02.2022 09:58, Oleksandr Andrushchenko wrote:
>> On 04.02.22 09:52, Jan Beulich wrote:
>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>> @@ -285,6 +286,12 @@ static int modify_bars(const struct pci_dev *pdev, 
>>>> uint16_t cmd, bool rom_only)
>>>>                    continue;
>>>>            }
>>>>    
>>>> +        spin_lock(&tmp->vpci_lock);
>>>> +        if ( !tmp->vpci )
>>>> +        {
>>>> +            spin_unlock(&tmp->vpci_lock);
>>>> +            continue;
>>>> +        }
>>>>            for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
>>>>            {
>>>>                const struct vpci_bar *bar = &tmp->vpci->header.bars[i];
>>>> @@ -303,12 +310,14 @@ static int modify_bars(const struct pci_dev *pdev, 
>>>> uint16_t cmd, bool rom_only)
>>>>                rc = rangeset_remove_range(mem, start, end);
>>>>                if ( rc )
>>>>                {
>>>> +                spin_unlock(&tmp->vpci_lock);
>>>>                    printk(XENLOG_G_WARNING "Failed to remove [%lx, %lx]: 
>>>> %d\n",
>>>>                           start, end, rc);
>>>>                    rangeset_destroy(mem);
>>>>                    return rc;
>>>>                }
>>>>            }
>>>> +        spin_unlock(&tmp->vpci_lock);
>>>>        }
>>> At the first glance this simply looks like another unjustified (in the
>>> description) change, as you're not converting anything here but you
>>> actually add locking (and I realize this was there before, so I'm sorry
>>> for not pointing this out earlier).
>> Well, I thought that the description already has "...the lock can be
>> used (and in a few cases is used right away) to check whether vpci
>> is present" and this is enough for such uses as here.
>>>    But then I wonder whether you
>>> actually tested this, since I can't help getting the impression that
>>> you're introducing a live-lock: The function is called from cmd_write()
>>> and rom_write(), which in turn are called out of vpci_write(). Yet that
>>> function already holds the lock, and the lock is not (currently)
>>> recursive. (For the 3rd caller of the function - init_bars() - otoh
>>> the locking looks to be entirely unnecessary.)
>> Well, you are correct: if tmp != pdev then it is correct to acquire
>> the lock. But if tmp == pdev and rom_only == true
>> then we'll deadlock.
>>
>> It seems we need to have the locking conditional, e.g. only lock
>> if tmp != pdev
> Which will address the live-lock, but introduce ABBA deadlock potential
> between the two locks.
I am not sure I can suggest a better solution here
@Roger, @Jan, could you please help here?
>
>>>> @@ -222,10 +239,10 @@ static int msix_read(struct vcpu *v, unsigned long 
>>>> addr, unsigned int len,
>>>>                break;
>>>>            }
>>>>    
>>>> +        msix_put(msix);
>>>>            return X86EMUL_OKAY;
>>>>        }
>>>>    
>>>> -    spin_lock(&msix->pdev->vpci->lock);
>>>>        entry = get_entry(msix, addr);
>>>>        offset = addr & (PCI_MSIX_ENTRY_SIZE - 1);
>>> You're increasing the locked region quite a bit here. If this is really
>>> needed, it wants explaining. And if this is deemed acceptable as a
>>> "side effect", it wants justifying or at least stating imo. Same for
>>> msix_write() then, obviously.
>> Yes, I do increase the locking region here, but the msix variable needs
>> to be protected all the time, so it seems to be obvious that it remains
>> under the lock
> What does the msix variable have to do with the vPCI lock? If you see
> a need to grow the locked region here, then surely this is independent
> of your conversion of the lock, and hence wants to be a prereq fix
> (which may in fact want/need backporting).
First of all, the implementation of msix_get is wrong and needs to be:

/*
  * Note: if vpci_msix found, then this function returns with
  * pdev->vpci_lock held. Use msix_put to unlock.
  */
static struct vpci_msix *msix_get(const struct domain *d, unsigned long addr)
{
     struct vpci_msix *msix;

     list_for_each_entry ( msix, &d->arch.hvm.msix_tables, next )
     {
         const struct vpci_bar *bars;
         unsigned int i;

         spin_lock(&msix->pdev->vpci_lock);
         if ( !msix->pdev->vpci )
         {
             spin_unlock(&msix->pdev->vpci_lock);
             continue;
         }

         bars = msix->pdev->vpci->header.bars;
         for ( i = 0; i < ARRAY_SIZE(msix->tables); i++ )
             if ( bars[msix->tables[i] & PCI_MSIX_BIRMASK].enabled &&
                  VMSIX_ADDR_IN_RANGE(addr, msix->pdev->vpci, i) )
                 return msix;

         spin_unlock(&msix->pdev->vpci_lock);
     }

     return NULL;
}

Then, both msix_{read|write} can dereference msix->pdev->vpci early,
this is why Roger suggested we move to msix_{get|put} here.
And yes, we grow the locked region here and yes this might want a
prereq fix. Or just be fixed while at it.

>
>>>> @@ -327,7 +334,12 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, 
>>>> unsigned int size)
>>>>        if ( !pdev )
>>>>            return vpci_read_hw(sbdf, reg, size);
>>>>    
>>>> -    spin_lock(&pdev->vpci->lock);
>>>> +    spin_lock(&pdev->vpci_lock);
>>>> +    if ( !pdev->vpci )
>>>> +    {
>>>> +        spin_unlock(&pdev->vpci_lock);
>>>> +        return vpci_read_hw(sbdf, reg, size);
>>>> +    }
>>> Didn't you say you would add justification of this part of the change
>>> (and its vpci_write() counterpart) to the description?
>> Again, I am referring to the commit message as described above
> No, sorry - that part applies only to what inside the parentheses of
> if(). But on the intermediate version (post-v5 in a 4-patch series) I
> did say:
>
> "In this case as well as in its write counterpart it becomes even more
>   important to justify (in the description) the new behavior. It is not
>   obvious at all that the absence of a struct vpci should be taken as
>   an indication that the underlying device needs accessing instead.
>   This also cannot be inferred from the "!pdev" case visible in context.
>   In that case we have no record of a device at this SBDF, and hence the
>   fallback pretty clearly is a "just in case" one. Yet if we know of a
>   device, the absence of a struct vpci may mean various possible things."
>
> If it wasn't obvious: The comment was on the use of vpci_read_hw() on
> this path, not redundant with the earlier one regarding the added
> "is vpci non-NULL" in a few places.
Ok
>
> Jan
>
Thank you,
Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.