
Re: [Xen-devel] [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue



On June 16, 2016 5:05 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
> >>> On 16.06.16 at 10:42, <quan.xu@xxxxxxxxx> wrote:
> > On June 02, 2016 7:07 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
> >> >>> On 01.06.16 at 11:05, <quan.xu@xxxxxxxxx> wrote:
> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
> >> > +                                         struct pci_ats_dev *ats_dev)
> >> > +{
> >> > +    struct domain *d = NULL;
> >> > +    struct pci_dev *pdev;
> >> > +
> >> > +    if ( test_bit(did, iommu->domid_bitmap) )
> >> > +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> >> > +
> >> > +    /*
> >> > +     * In case the domain has been freed or the IOMMU domid bitmap is
> >> > +     * not valid, the device no longer belongs to this domain.
> >> > +     */
> >> > +    if ( d == NULL )
> >> > +        return;
> >> > +
> >> > +    pcidevs_lock();
> >> > +
> >> > +    for_each_pdev(d, pdev)
> >> > +    {
> >> > +        if ( (pdev->seg == ats_dev->seg) &&
> >> > +             (pdev->bus == ats_dev->bus) &&
> >> > +             (pdev->devfn == ats_dev->devfn) )
> >> > +        {
> >> > +            ASSERT(pdev->domain);
> >> > +            list_del(&pdev->domain_list);
> >> > +            pdev->domain = NULL;
> >> > +            pci_hide_existing_device(pdev);
> >> > +            break;
> >> > +        }
> >> > +    }
> >> > +
> >> > +    pcidevs_unlock();
> >>
> >> ... this loop (and locking). (Of course such a change may better be
> >> done in another preparatory patch.)
> >>
> >
> > To eliminate the locking?  I am afraid the locking is still a must here
> > even without the loop; compare device_assigned().
> 
> If the entire loop gets eliminated, what would be left is
> 
>     pcidevs_lock();
>     pcidevs_unlock();
> 
> which I don't think does any good.
> 

Why? I don't follow that.
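
Even without the loop, the device hiding still has to happen under
pcidevs_lock().  A minimal sketch of "no loop, but still with the locking"
(purely illustrative; using pci_get_pdev_by_domain() as the lookup helper is
an assumption here, not something the patch already does):

    pcidevs_lock();

    /* Hypothetical direct lookup instead of walking d's device list. */
    pdev = pci_get_pdev_by_domain(d, ats_dev->seg, ats_dev->bus,
                                  ats_dev->devfn);
    if ( pdev )
    {
        ASSERT(pdev->domain);
        list_del(&pdev->domain_list);
        pdev->domain = NULL;
        pci_hide_existing_device(pdev);
    }

    pcidevs_unlock();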


> >> > +static int __must_check dev_invalidate_sync(struct iommu *iommu, u16 did,
> >> > +                                            struct pci_ats_dev *ats_dev)
> >> > +{
> >> > +    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> >> > +    int rc = 0;
> >> > +
> >> > +    if ( qi_ctrl->qinval_maddr )
> >> > +    {
> >> > +        rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
> >> > +
> >> > +        if ( rc == -ETIMEDOUT )
> >> > +            dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
> >> > +    }
> >> > +
> >> > +    return rc;
> >> > +}
> >>
> >> I've never really understood why invalidate_sync() returns success
> >> when it didn't do anything. Now that you copy this same behavior
> >> here, I really need to ask you to explain that.
> >>
> >
> > Returning success when it didn't do anything seemed acceptable to me,
> > though admittedly it deserves criticism.  It would be better as:
> > +    if ( qi_ctrl->qinval_maddr )
> > +        ...
> > +    else
> > +        rc = -ENOENT;
> 
> Right. And perhaps a separate patch to make invalidate_sync() do the same.

Agreed.
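
Concretely, dev_invalidate_sync() would then read roughly as below (a sketch
assembled from the code quoted above plus the -ENOENT branch, keeping the
conditional for now per the reply further down):

    static int __must_check dev_invalidate_sync(struct iommu *iommu, u16 did,
                                                struct pci_ats_dev *ats_dev)
    {
        struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
        int rc;

        if ( qi_ctrl->qinval_maddr )
        {
            rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);

            if ( rc == -ETIMEDOUT )
                dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
        }
        else
            rc = -ENOENT;

        return rc;
    }

The separate patch for invalidate_sync() would presumably follow the same
pattern.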

> Question is whether this really ought to be a conditional, or whether instead
> this code is unreachable when qinval is not in use, in which case these
> conditionals would imo better be converted to ASSERT()s.
> 

IMO, this should remain a conditional.  As mentioned below, strictly speaking
this is a bug, and we can't base an ASSERT() on behaviour that only holds
because of a bug.  A follow-up patch may fix it.


> > A question:
> > I find that the page behind qi_ctrl->qinval_maddr is never freed.
> > IMO, in disable_qinval() we need to:
> >      - free the page behind qi_ctrl->qinval_maddr;
> >      - set qi_ctrl->qinval_maddr = 0;
> 
> Well, that's a correct observation, but not a big problem imo: If this were
> a per-domain resource, it surely would need fixing. But if freeing a couple
> of these pages (one per IOMMU) causes synchronization headaches (e.g.
> because there may still be dangling references to them), then I think
> freeing them is not a must. But if freeing them is safe (as you seem to
> imply), then I'm certainly fine with you fixing this (not that my opinion
> would matter much here, as I'm not the maintainer of this code).
> 

Agreed, and thanks for the explanation. For now I will leave it as is in this
patch set.
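
For completeness, the kind of change that would be needed there (a sketch
only; it assumes the qinval page comes from alloc_pgtable_maddr() and that
free_pgtable_maddr() is the matching release function, and it leaves aside
the dangling-reference concern raised above):

    /* In disable_qinval(), once queued invalidation has been turned off: */
    if ( qi_ctrl->qinval_maddr )
    {
        free_pgtable_maddr(qi_ctrl->qinval_maddr);
        qi_ctrl->qinval_maddr = 0;
    }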

Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

