
RE: [Xen-devel] [PATCH] Re-enable MSI support



xen-devel-bounces@xxxxxxxxxxxxxxxxxxx <> wrote:
> [Yunhong Jiang]
>> Espen Skoglund <mailto:espen.skoglund@xxxxxxxxxxxxx> wrote:
>>> [Kevin Tian]
>>>>> From: Espen Skoglund
>>>>> Sent: Wednesday, December 10, 2008 7:34 PM
>>>>> 
>>>>> [Yunhong Jiang]
>>>>>> This patch tries to clean up these issues.
>>>>>> 1) The basic idea is to remove the pci_dev's lock and instead
>>>>>>    use the big pcidevs_lock to protect all pci_dev state,
>>>>>>    including pci_dev addition/removal as well as the assignment
>>>>>>    of devices. We checked the code and there seems to be no
>>>>>>    performance-critical path here; we tried fine-grained
>>>>>>    locking, but the code became very tricky.
>>>>>> 2) Split pci_enable_msi into two steps: first it just
>>>>>>    constructs the msi_desc through pci_enable_msi without
>>>>>>    holding the irq_desc lock, and then it sets up MSI through
>>>>>>    setup_msi_irq with the irq_desc lock held.
>>>>>> 3) Change the iommu->lock and hd->mapping_lock to be irq-safe
>>>>>>    (taken with spin_lock_irqsave).
>>>>>> 4) Fix some minor issues.
>>>>> 
>>>>>> Now the lock sequence is: pcidevs_lock -> domain's event_lock ->
>>>>>> iommu's lock -> hvm_iommu's mapping_lock. The irq_desc's lock
>>>>>> will always be the last lock to be held, for performance
>>>>>> reasons.
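
To make the intended split concrete, here is a rough sketch (the exact
signatures and the irq_desc array used below are only illustrative, not the
real code):

    /* Step 1 builds the msi_desc without any irq_desc lock; step 2 takes
     * the irq_desc lock as the innermost lock and programs the vector. */
    static int msi_assign_sketch(int irq, struct msi_info *msi)
    {
        struct msi_desc *desc;
        unsigned long flags;
        int rc;

        read_lock(&pcidevs_lock);            /* outermost: pci_dev state */

        rc = pci_enable_msi(msi, &desc);     /* no irq_desc lock held    */
        if ( rc )
            goto out;

        spin_lock_irqsave(&irq_desc[irq].lock, flags);
        rc = setup_msi_irq(irq, desc);       /* illustrative signature   */
        spin_unlock_irqrestore(&irq_desc[irq].lock, flags);

     out:
        read_unlock(&pcidevs_lock);
        return rc;
    }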
>>>>> 
>>>>> So what exactly is it that pcidevs_lock is supposed to "protect"
>>>>> now?  Does it indicate that someone is holding a reference to a
>>>>> pci_dev?  Does it indicate that someone will modify some pci_dev?
>>>>> Does it indicate that someone is messing around with interrupts
>>>>> or MSI descriptors?
>>> 
>>>> I think it protects all of the above.  As those operations are all
>>>> rare, such a big lock avoids complex lock/unlock sequences across the
>>>> different paths to the different resources of an assigned device.
>>> 
>>> Except that, since it is taken as a read_lock most of the time, it
>>> does not actually protect things the way a spinlock would.
> 
>> Ahh, yes, I didn't realize this.  So what do you think about changing
>> it to a spinlock, since it is not performance critical?  Or do you know
>> how much benefit the read_lock gives?
> 
> It may not be performance critical, but I don't particularly like
> seeing big global locks introduced.  By the looks of it this lock
> would also protect everything related to interrupt management.

The interrupt path should be protected by the irq_desc's lock.  The state
used from interrupt context, i.e. the mask register and the message
(data/address) registers, is always protected by the irq_desc lock; nothing
else is touched by the ISR, so the rest is interrupt safe and can be
protected by pcidevs_lock.

Or have you noticed anything in the code that violates this rule?
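
To make the rule concrete, a minimal sketch (the mask accessor below is made
up for illustration; only the locking pattern matters):

    /* ISR-visible MSI state: only changed under the irq_desc lock with
     * interrupts off. */
    static void msi_mask_sketch(struct irq_desc *desc,
                                struct msi_desc *msidesc, int mask)
    {
        unsigned long flags;

        spin_lock_irqsave(&desc->lock, flags);
        write_msi_mask_bit(msidesc, mask);     /* hypothetical helper */
        spin_unlock_irqrestore(&desc->lock, flags);
    }

    /* Everything else about the device: never read from the ISR, so the
     * plain (non irq-safe) pcidevs_lock is enough. */
    static void pdev_assign_sketch(struct pci_dev *pdev, struct domain *d)
    {
        write_lock(&pcidevs_lock);
        pdev->domain = d;
        write_unlock(&pcidevs_lock);
    }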

> 
> 
>> Also, as for the reference to pci_dev, do you have a plan to add such
>> support?  For example, I'm not sure if we can return failure from
>> pci_remove_device if there is still a reference to it.  Will dom0
>> handle such a failure?
> 
> The reason I opted for a pdev->spinlock rather than refcount was that
> removing a pdev with a refcount is kind of messy.  The pdev is not an
> anonymous struct that can be released by, e.g., an RCU like mechanism.
> It is intrinsically tied with a BDF, and we can not create a new pdev
> for that BDF until the previous pdev struct has been completely purged.
> 
> The idea with pcidevs_lock was to only protect the pci_dev lists
> themselves.  The lock order of pcidevs_lock and pdev->lock only matters
> when write_lock is involved.  So, the following two combinations are fine:
> 
>    read_lock(pcidevs_lock) => spin_lock(pdev)
>    spin_lock(pdev) => read_lock(pcidevs_lock)
> 
> The following combinations are not:
> 
>    read_lock(pcidevs_lock) => spin_lock(pdev)
>    spin_lock(pdev) => write_lock(pcidevs_lock)
> 
> There is only one place in Xen where the last case occurs (the
> location you identified in reassign_device_ownership).  However, this
> code can easily be fixed so that it is deadlock free.

As you stated in another mail, more deadlocks may exist around pcidevs_lock.
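
For reference, the interleaving that bites once the write_lock is involved
looks roughly like this (illustrative only):

    /*
     *   CPU A                              CPU B
     *   read_lock(&pcidevs_lock);          spin_lock(&pdev->lock);
     *   ...                                ...
     *   spin_lock(&pdev->lock);  <-blocks  write_lock(&pcidevs_lock); <-blocks
     *
     * B's writer waits for A to drop the read_lock, while A waits for B to
     * drop pdev->lock: a classic ABBA deadlock, so only the single order
     * "pcidevs_lock, then pdev->lock" is safe once writers exist.
     */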

> 
> The idea behind pdev->lock was to protect against concurrent updates
> and usages relating to a particular pdev --- including assignment to
> domains, IOMMU assignment, and MSI assignment.  My guess is that where
> things broke down was where the MSI-related pdev->locking interfered
> with the other interrupt-related locking.  I don't have a clear enough
> picture of the interrupt-related locks to pinpoint the problem more exactly.

I suspect it is more about the MSI-related pdev->locking.  Let me list how
the information is currently protected:
1) the device list, i.e. alldevs_list, which is currently protected by
   pcidevs_lock;
2) device assignment, which is protected by pci_dev->lock, if I understand
   correctly;
3) IOMMU assignment, i.e. domain->arch.pdev_list, which is currently
   protected by pcidevs_lock;
4) MSI assignment, i.e. pci_dev->msi_list, which is protected by
   pci_dev->lock.

I think the issues are with items 3 and 4.
For item 3, because the IOMMU assignment is protected by pcidevs_lock, it
can deadlock in reassign_device_ownership.  If you do want fine-grained
locking, maybe you could consider a per-domain lock such as
domain->arch.pdev_lock :-)
For item 4, I think the issue comes from multiple-vector MSI support.
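
Just to sketch what I mean by the per-domain lock (the field and the list
handling below are only illustrative; domain->arch.pdev_lock does not exist
today):

    static void reassign_sketch(struct domain *source, struct domain *target,
                                struct pci_dev *pdev)
    {
        /* Take the two per-domain locks in a fixed order (e.g. by domain
         * id) so concurrent reassignments cannot deadlock. */
        struct domain *first  = (source->domain_id < target->domain_id)
                                ? source : target;
        struct domain *second = (first == source) ? target : source;

        spin_lock(&first->arch.pdev_lock);
        spin_lock(&second->arch.pdev_lock);

        list_del(&pdev->domain_list);        /* leave source's pdev_list  */
        pdev->domain = target;
        list_add(&pdev->domain_list, &target->arch.pdev_list);

        spin_unlock(&second->arch.pdev_lock);
        spin_unlock(&first->arch.pdev_lock);
    }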

Thanks
Yunhong Jiang

> 
> 
>       eSk
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

