|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.
>>> On 23.12.15 at 09:25, <quan.xu@xxxxxxxxx> wrote:
> --- a/xen/drivers/passthrough/vtd/qinval.c
> +++ b/xen/drivers/passthrough/vtd/qinval.c
> @@ -190,9 +190,19 @@ static int queue_invalidate_wait(struct iommu *iommu,
> static int invalidate_sync(struct iommu *iommu)
> {
> struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> + int rc;
>
> if ( qi_ctrl->qinval_maddr )
> - return queue_invalidate_wait(iommu, 0, 1, 1);
> + {
> + rc = queue_invalidate_wait(iommu, 0, 1, 1);
> + if ( rc )
> + {
> + if ( rc == -ETIMEDOUT )
> + panic("Queue invalidate wait descriptor was timeout.\n");
Unless I'm overlooking something this doesn't replace and existing
panic(), and I think I had indicated quite clearly that this series
shouldn't add any new ones, unless the alternative of crashing
the owning domain is completely unfeasible.
> + panic("Queue invalidate wait descriptor was timeout.\n");
> + return rc;
> + }
> + }
> +
> return 0;
> }
Please avoid introducing multiple return points when this can be
trivially avoided.
> @@ -229,6 +239,63 @@ int qinval_device_iotlb(struct iommu *iommu,
> return 0;
> }
>
> +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
> + u16 seg, u8 bus, u8 devfn)
> +{
> + struct domain *d;
> + struct pci_dev *pdev;
> +
> + d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> + ASSERT(d);
Don't you need to acquire some lock in order to safely assert this?
Also note that unused slots hold zero, i.e. there's at least a
theoretical risk of problems here when you don't also look at
iommu->domid_bitmap.
> + for_each_pdev(d, pdev)
> + {
> + if ( (pdev->seg == seg) &&
> + (pdev->bus == bus) &&
> + (pdev->devfn == devfn) )
> + {
> + if ( pdev->domain )
If the device is on the domain's list, can this reasonably be false?
I.e. don't you rather mean ASSERT() here?
> + {
> + list_del(&pdev->domain_list);
This should happen under pcidevs_lock - you need to either acquire
it or ASSERT() it being held.
> + pdev->domain = NULL;
> + }
> +
> + if ( pci_hide_device(bus, devfn) )
> + {
> + printk(XENLOG_ERR
> + "IOMMU hide device %04x:%02x:%02x error.",
> + seg, bus, devfn);
> + break;
> + }
> +
> + break;
> + }
Redundant breaks.
> + }
> +
> + if ( !is_hardware_domain(d) )
> + domain_crash(d);
I wonder whether the device hiding shouldn't also be avoided if
the device is owned by hwdom.
> @@ -349,9 +416,18 @@ static int flush_iotlb_qi(
> queue_invalidate_iotlb(iommu,
> type >> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
> dw, did, size_order, 0, addr);
> +
> + /*
> + * Synchronize with hardware for invalidation request descriptors
> + * submitted before Device-TLB invalidate descriptor.
> + */
> + rc = invalidate_sync(iommu);
> + if ( rc )
> + return rc;
> +
> if ( flush_dev_iotlb )
> ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
> - rc = invalidate_sync(iommu);
> +
> if ( !ret )
> ret = rc;
> }
This change is because of ...?
> --- a/xen/drivers/passthrough/vtd/x86/ats.c
> +++ b/xen/drivers/passthrough/vtd/x86/ats.c
> @@ -162,6 +162,19 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
> return -EOPNOTSUPP;
> }
>
> + /*
> + * Synchronize with hardware for Device-TLB invalidate
> + * descriptor.
> + */
> + ret = dev_invalidate_iotlb_sync(iommu, did, pdev->seg,
> + pdev->bus, pdev->devfn);
> + if ( ret )
> + {
> + dprintk(XENLOG_WARNING VTDPREFIX,
> + "Device-TLB flush timeout.\n");
Are you aware that dprintk() compiles to nothing in non-debug builds?
Plus logging the error code and device coordinates might turn out
helpful in such cases. But first of all - if there was a timeout, we'd
make it here only for Dom0. Perhaps the printing should move into
an else to the domain_crash()? And if there was another error, the
message would be outright wrong.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |