[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.



>>> On 21.01.16 at 17:16, <quan.xu@xxxxxxxxx> wrote:
>>  On January 20, 2016 at 7:29 pm,  <JBeulich@xxxxxxxx> wrote:
>> >>> On 20.01.16 at 11:26, <quan.xu@xxxxxxxxx> wrote:
>> >> On January 15, 2016 at 9:10, <JBeulich@xxxxxxxx> wrote:
>> >> >>> On 23.12.15 at 09:25, <quan.xu@xxxxxxxxx> wrote:
>> >> > @@ -229,6 +239,63 @@ int qinval_device_iotlb(struct iommu *iommu,
>> >> >      return 0;
>> >> >  }
>> >> >
>> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
>> >> > +                                         u16 seg, u8 bus, u8
>> >> > +devfn)
>> >> {
>> >> > +    struct domain *d;
>> >> > +    struct pci_dev *pdev;
>> >> > +
>> >> > +    d = rcu_lock_domain_by_id(iommu->domid_map[did]);
>> >> > +    ASSERT(d);
>> >>
>> >> Don't you need to acquire some lock in order to safely assert this?
>> >
>> > Referred to the other use case of 'rcu_lock_domain_by_id()', Xen
>> > checks whether The domain is 'NULL'.
>> > Could I also replace the 'ASSERT(d)' with
>> >           + If ( d == NULL )
>> >           +   return;
>> > ?
>> 
>> At a first glance this doesn't look right, but in the end that's something 
> you need
>> to answer.
>> 
> 
> Is it quite similar to whether the domain has been destroyed when Device-TLB 
> is flushing?

Going back to the original question I raised, you need to ask
yourself: Can d validly be NULL when you get here (e.g. due to
some other processor changing something in parallel)? If yes,
you can't ASSERT(), and the next question then is what exactly
it means that it got ripped off the table. The answer to that
determines whether simply returning would be okay.

>> >> Also note that unused slots hold zero, i.e. there's at least a
>> >> theoretical
>> > risk of
>> >> problems here when you don't also look at
>> >> iommu->domid_bitmap.
>> >>
>> > I am not clear how to fix this point. Do you have good idea?
>> > Add a lock to 'iommu->domid_bitmap'?
>> 
>> How would a lock help avoiding mistaking an unused slot to mean Dom0? As
>> already suggested - I think you simply need to consult the bitmap along with 
> the
>> domain ID array.
>> 
> 
> Try to get domain id with iommu->domid_map[did] ?

???

>> >> > +            {
>> >> > +                list_del(&pdev->domain_list);
>> >>
>> >> This should happen under pcidevs_lock - you need to either acquire it
>> >> or
>> >> ASSERT() it being held.
>> >>
>> >
>> > I should check whether it is under pcidevs_lock -- with
>> > spin_is_locked(&pcidevs_lock)
>> > If it is not under pcidevs_lock, I should acquire it.
>> > I think ASSERT() is not a good idea. Hypervisor acquires this lock and
>> > then remove the resource.
>> 
>> I don't understand this last sentence.
>> 
> For example: in 
> pci_remove_device()
> {
> ...
>     spin_lock(&pcidevs_lock);
>     ..
>     iommu_remove_device()..
>     ..
>     spin_unlock(&pcidevs_lock);
> }
> 
> Device-TLB is maybe flush error in iommu_remove_device()..
> Then it is under pcidevs_lock..
> In this case, I need to check whether it is under pcidevs_lock.
> If not, I need to acquire the pcidevs_lock.

Ah, okay. But you can't use spin_is_locked() for that purpose.

>> >> > @@ -349,9 +416,18 @@ static int flush_iotlb_qi(
>> >> >          queue_invalidate_iotlb(iommu,
>> >> >                                 type >>
>> >> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
>> >> >                                 dw, did, size_order, 0, addr);
>> >> > +
>> >> > +        /*
>> >> > +         * Synchronize with hardware for invalidation request
>> descriptors
>> >> > +         * submitted before Device-TLB invalidate descriptor.
>> >> > +         */
>> >> > +        rc = invalidate_sync(iommu);
>> >> > +        if ( rc )
>> >> > +             return rc;
>> >> > +
>> >> >          if ( flush_dev_iotlb )
>> >> >              ret = dev_invalidate_iotlb(iommu, did, addr, size_order,
>> type);
>> >> > -        rc = invalidate_sync(iommu);
>> >> > +
>> >> >          if ( !ret )
>> >> >              ret = rc;
>> >> >      }
>> >>
>> >> This change is because of ...?
>> >>
>> > As Kevin's comments,
>> > I add 'dev_invalidate_iotlb_sync()' to dev_invalidate_iotlb().
>> > Then i can know which device is flush timeout.
>> 
>> I don't understand: How is your reply related to you moving up the 
> invocation of
>> invalidate_sync()?
> 
> This invalidate_sync() is to synchronize with hardware for invalidation 
> request descriptors
> submitted before Device-TLB invalidate descriptor.
> 
> If I don't add this invalidate_sync().. 
> dev_invalidate_iotlb_timeout() is not clear whether the flush time out is 
> about Device-TLB flush error or the other previous iotlb/iec/context flush 
> error. 

This being anything but obvious, maybe a brief comment would
help?

>> >> Plus logging the error code and device coordinates might turn out
>> >> helpful in such cases. But first of all - if there was a timeout,
>> >> we'd make it here only for Dom0. Perhaps the printing should move into an
>> else to the domain_crash()?
>> >> And if there was another error, the message would be outright wrong.
>> >>
>> > IMO, the print for Dom0 is enough.
>> > I think it is not a good idea to move into an else to domain_crash().
>> > Domain_crash is quite a common part.
>> > Anyway I can improve it in low priority.
>> 
>> At the very least the message should not end up being actively misleading.
>> 
> What about the below?
> +                printk(XENLOG_ERR
> +                       "Device %04x:%02x:%02x is flush error.",
> +                       seg, bus, devfn);

Well, for one you once again omit the error code. And then (I think
others had pointed that out already) a device cannot be flush error.
Perhaps you mean "Flush error %d on device ...\n"? Plus you
absolutely should print PCI device coordinates in the canonical way,
so that grep-ing a log file for issues with a specific device will find
everything related to that device.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.