[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.



>>> On 14.01.16 at 02:22, <kevin.tian@xxxxxxxxx> wrote:
>>  From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> Sent: Wednesday, January 13, 2016 7:03 PM
>> 
>> >> Jan,
>> >> Patch 2/3 and Patch 3/3 are based on v3 (Actually they are v3's patch 1/2
>> > and patch 2/2).
>> >> We have discussed how to hide a device with pci_hide_device() when 
> Device-TLB
>> > flush is
>> >> error.
>> >>
>> >> Now there are 2 concerns:
>> >>       1. Hide the PCI device may break the code path. (then the 
>> >> pdev->domain
>> > is
>> >> dom_xen)
>> >>       2. Is the blew logic right?
>> >>            --If Device-TLB flush is timeout, we'll hide the target ATS 
>> >> device
>> > and crash the
>> >> domain owning this ATS device.
>> >>              If impacted domain is hardware domain, just throw out a
>> > warning, instead of
>> >> crash the hardware domain.
>> >>             The hided Device will be disallowed to be further assigned to
>> > any domain.
>> >>
>> >> Kevin, correct me if I am wrong.
>> >>
>> >>
>> >
>> > for 2, yes it's the policy we agreed in previous discussion.
>> >
>> > for 1, after more thinking I think the concern is real. pci_hide_device
>> > is used once in early boot-up phase. At that time, it's simple to just
>> > have right owner configured. However in the path of normal device
>> > assign/deassign, there are tons of more state associated which may
>> > be related to the owner. Though we may do some special handling
>> > in related code paths to have dom_xen specially handled, it's way
>> > tricky and not easy to maintain.
>> 
>> I don't buy this without one of you pointing out the actual
>> difficulties: The domain is being crashed anyway, so there's no
>> true device de-assignment needed (IOMMU tables will get torn
>> down no matter what).
>> 
> 
> Here is one example of a quick glimpse:
> 
> static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev)
> {
>       [...]
> 
>     ret = domain_context_mapping(pdev->domain, devfn, pdev);
>     if ( ret )
>     {
>         dprintk(XENLOG_ERR VTDPREFIX, "d%d: context mapping failed\n",
>                 pdev->domain->domain_id);
>         return ret;
>     }
> 
> If pdev->domain is changed inside due to pci_hide_device, the error
> message will be inaccurate.

But that's a thing trivial to fix up. You had talked about tricky things...

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.