
RE: [Xen-devel] [PATCH] Re-enable MSI support

[Yunhong Jiang]
> Espen Skoglund <mailto:espen.skoglund@xxxxxxxxxxxxx> wrote:
>> [Kevin Tian]
>>>> From: Espen Skoglund
>>>> Sent: Wednesday, December 10, 2008 7:34 PM
>>>> [Yunhong Jiang]
>>>>> This patch tries to clean up these issues:
>>>>> 1) The basic idea is to remove the pci_dev's lock and instead
>>>>>    use the big pcidevs_lock to protect all pci_dev state,
>>>>>    including pci_dev adding/removing and device assignment. We
>>>>>    checked the code and there seems to be no performance-critical
>>>>>    path here; a fine-grained lock would make the code very
>>>>>    tricky.
>>>>> 2) Split pci_enable_msi into two steps: first construct the
>>>>>    msi_desc in pci_enable_msi without holding the irq_desc
>>>>>    lock, then set up MSI through setup_msi_irq with the
>>>>>    irq_desc lock held.
>>>>> 3) Change the iommu->lock and hd->mapping_lock to be irq_save.
>>>>> 4) Fix some minor issues.
>>>>> Now the lock order is: pcidevs_lock -> domain's event_lock ->
>>>>> iommu's lock -> hvm_iommu's mapping_lock. The irq_desc's lock
>>>>> will always be the last lock held, for performance reasons.
>>>> So what exactly is it that pcidevs_lock is supposed to "protect"
>>>> now?  Does it indicate that someone is holding a reference to a
>>>> pci_dev?  Does it indicate that someone will modify some pci_dev?
>>>> Does it indicate that someone is messing around with interrupts
>>>> or MSI descriptors?
>>> I think it protects all of the above. Since those operations are
>>> all rare, such a big lock avoids complex lock/unlock sequences
>>> across the different paths to the different resources of an
>>> assigned device.
>> Except that, since it is taken as a read_lock most of the time, it
>> does not actually protect things the way a spinlock would.

> Ahh, yes, I didn't realize this. So what do you think about changing
> it to a spinlock, since it is not performance critical? Or do you
> know how much benefit the read_lock gives?

It may not be performance critical, but I don't particularly like
seeing big global locks introduced.  By the looks of it this lock
would also protect everything related to interrupt management.
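
The two-step pci_enable_msi split described in point 2 of the quoted
patch summary can be sketched roughly as below.  All names here
(msi_desc, irq_desc, construct_msi_desc, setup_msi_irq's shape) are
simplified stand-ins for illustration, not the real Xen structures or
signatures; a pthread mutex stands in for the irq_desc spinlock.

```c
/* Sketch of the two-phase split: phase 1 allocates and fills an
 * msi_desc with no irq_desc lock held (so allocation is safe); phase 2
 * binds it in a short critical section under the irq_desc lock. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct msi_desc { unsigned vector; int bound; };
struct irq_desc { pthread_mutex_t lock; struct msi_desc *msi; };

/* Phase 1: no irq_desc lock held -- free to allocate. */
static struct msi_desc *construct_msi_desc(unsigned vector)
{
    struct msi_desc *d = malloc(sizeof(*d));
    if (d) { d->vector = vector; d->bound = 0; }
    return d;
}

/* Phase 2: bind the descriptor with the irq_desc lock held. */
static void setup_msi_irq(struct irq_desc *irq, struct msi_desc *d)
{
    pthread_mutex_lock(&irq->lock);
    irq->msi = d;
    d->bound = 1;
    pthread_mutex_unlock(&irq->lock);
}
```

The point of the split is that the allocation (which may be slow or
fail) happens outside the lock, keeping the irq_desc critical section
minimal.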

> Also, as for the reference to pci_dev, do you have a plan to add
> such support? For example, I'm not sure if we can return failure
> from pci_remove_device if there is still a reference to it. Would
> dom0 handle such a failure?

The reason I opted for a pdev->lock rather than a refcount was that
removing a pdev with a refcount is kind of messy.  The pdev is not an
anonymous struct that can be released by, e.g., an RCU-like mechanism.
It is intrinsically tied to a BDF, and we cannot create a new pdev
for that BDF until the previous pdev struct has been completely
released.
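
The refcount approach being argued against could be modelled with a
minimal get/put scheme like the one below.  The names (pdev_get,
pdev_put, pdev_alloc) are hypothetical, not Xen APIs; the sketch just
shows why the final put matters: the BDF identity is only reusable
once the struct is fully gone.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct pci_dev {
    uint16_t bdf;   /* bus/device/function: the identity the struct is
                     * tied to, which is why it is not anonymous */
    int refcnt;
};

static struct pci_dev *pdev_alloc(uint16_t bdf)
{
    struct pci_dev *p = malloc(sizeof(*p));
    if (p) { p->bdf = bdf; p->refcnt = 1; }
    return p;
}

static struct pci_dev *pdev_get(struct pci_dev *p)
{
    p->refcnt++;
    return p;
}

/* Returns 1 if this was the last reference and the struct was freed.
 * Only after that may a new pdev for the same BDF be created. */
static int pdev_put(struct pci_dev *p)
{
    if (--p->refcnt > 0)
        return 0;
    free(p);
    return 1;
}
```

The messy part the mail alludes to is exactly the window between
pci_remove_device being requested and the last reference dropping:
the BDF is in limbo until pdev_put returns 1.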

The idea with pcidevs_lock was to protect only the pci_dev list
itself.  The lock order of pcidevs_lock and pdev->lock only matters
when write_lock is involved.  So, the following two combinations are
deadlock free:

    read_lock(pcidevs_lock) => spin_lock(pdev)
    spin_lock(pdev) => read_lock(pcidevs_lock)

The following combinations are not:

    read_lock(pcidevs_lock) => spin_lock(pdev)
    spin_lock(pdev) => write_lock(pcidevs_lock)

There is only one place in Xen where the last case occurs (the
location you identified in reassign_device_ownership).  However, this
code can easily be fixed so that it is deadlock free.

The idea behind pdev->lock was to protect against concurrent updates
and uses relating to a particular pdev --- including assignment to
domains, IOMMU assignment, and MSI assignment.  My guess is that where
things broke down was where the MSI-related pdev->lock usage interfered
with the other interrupt-related locking.  I don't have a clear enough
picture of the interrupt-related locks to pinpoint the problem more
precisely.


Xen-devel mailing list


