Re: [Xen-devel] VT-d spin loops



On 10/07/14 00:22, Tian, Kevin wrote:
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> Sent: Thursday, June 26, 2014 3:30 AM
>>
>> All,
>>
>> VT-d code currently has a number of cases where completion of certain
>> operations is being waited for by way of spinning. The various instances
>> can be identified relatively easily by grep-ing for all uses of
>> DMAR_OPERATION_TIMEOUT (the majority of instances use that
>> variable indirectly through IOMMU_WAIT_OP()), allowing for loops of
>> up to 1 second. While in many of the cases this _may_ be acceptable
>> (which would need to be proven for each individual case, also taking into
>> consideration how many of these spinning loops may be executed in a
>> row with no preemption/scheduling in between), the invalidation case
>> seems particularly problematic: Using DMAR_OPERATION_TIMEOUT is
>> a mistake here in the first place, as the timeout here isn't related to
>> response times by the IOMMU engine. Instead - with ATS in use - the
>> specification mandates a timeout of 1 _minute_ (with a 50% slack, the
>> meaning of which none of us [Andrew and Malcolm brought this issue
>> to my attention in private talks on the hackathon] was able to really
>> interpret in a sensible way).
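
For illustration, the pattern at issue looks roughly like this. This is a
simplified sketch of the IOMMU_WAIT_OP()/DMAR_OPERATION_TIMEOUT idiom, not
the exact Xen macro; now_ns(), dmar_readl() and cpu_relax() here are
stand-ins:

    #include <stdint.h>
    #include <stdbool.h>

    #define DMAR_OPERATION_TIMEOUT_NS (1000ULL * 1000 * 1000)  /* 1s */

    extern uint64_t now_ns(void);                  /* stand-in for NOW() */
    extern uint32_t dmar_readl(unsigned int off);  /* stand-in MMIO read */
    extern void cpu_relax(void);                   /* stand-in pause hint */

    /* Poll register 'off' until 'mask' reads back as 'want'; busy-waits
     * for up to a full second, with nothing scheduled in between. */
    static bool iommu_wait_op(unsigned int off, uint32_t mask, bool want)
    {
        uint64_t start = now_ns();

        for ( ; ; )
        {
            bool set = (dmar_readl(off) & mask) != 0;

            if ( set == want )
                return true;
            if ( now_ns() - start > DMAR_OPERATION_TIMEOUT_NS )
                return false;  /* Xen currently panics at this point */
            cpu_relax();
        }
    }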
> Yes, that's not a good design. Most of the waits happen during IOMMU
> initialization, where a 1s timeout is less of an issue. At runtime,
> cache/TLB flushes and root entry manipulation are definitely not good
> places for long spinning.
>
>> So there are two things that need doing rather sooner than later:
>>
>> First and foremost the ATS case needs to be taken into consideration
>> when doing invalidations. Obviously we can't spin for a minute, so
>> invalidation absolutely needs to be converted to a non-spinning model.
>> We realize this isn't going to be trivial, which is why we bring this up
>> here rather than coming forward with a patch right away.
> ATS should be fine. The device TLB can ONLY be invalidated through the
> qinval interface, which is asynchronous, so there is no need to consider
> the 1 minute timeout even in the current spinning model.

There are currently no asynchronous invalidations in Xen. ATS certainly
is a problem.
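
To make that concrete: even with qinval, completion is currently awaited
synchronously, by queueing a "wait descriptor" and spinning on the status
dword it asks the hardware to write. A minimal sketch of that pattern
(helper names are stand-ins, not Xen's actual qinval code):

    #include <stdint.h>
    #include <errno.h>

    /* Queue a wait descriptor asking the IOMMU to write 'value' into
     * '*status' once all earlier descriptors have completed. */
    extern void queue_wait_descriptor(volatile uint32_t *status,
                                      uint32_t value);   /* stand-in */
    extern uint64_t now_ns(void);                        /* stand-in */

    static int invalidate_sync(void)
    {
        volatile uint32_t status = 0;
        uint64_t start = now_ns();

        queue_wait_descriptor(&status, 1);

        /* For an ATS device-TLB flush the device is allowed up to a
         * minute (plus the 50% slack) to respond, so a 1s spin is both
         * too long to burn a CPU on and too short to be spec-compliant. */
        while ( status != 1 )
        {
            if ( now_ns() - start > 1000000000ULL /* 1s */ )
                return -ETIMEDOUT;
        }

        return 0;
    }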

>
>> Second, looking at Linux (which interestingly enough also spins, and
>> that even without any timeout) there are flags in the fault status
>> register that can be used to detect at least some loop abort conditions.
>> We should definitely make use of anything that can shorten these
>> spinning loops (as was already done in commit dd6d87a4 ["VT-d: drop
>> redundant calls to invalidate_sync()"] as a very tiny first step). The
>> main problem with trying to clone at least some of the functionality
>> from Linux is that I'm not convinced the replaying they do can
>> actually do anything good. Plus it's clear that - spinning or not - the
>> consequences of an invalidation request not completing successfully
>> need to be taken care of (and it's of no help that in all cases I looked
>> at so far errors passed up from the leaf functions sooner or later
>> get dropped on the floor or mis-interpreted).
>>
>> And finally, all other spinning instances need to be audited to make
>> sure they can't add up to multiple-second spins (iirc we can't
>> tolerate more than about 4s without running into time problems on
>> certain hardware).
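
The loop-abort flags Jan mentions live in the VT-d Fault Status Register;
checking them inside the wait loop lets software give up as soon as the
hardware has flagged the invalidation as failed, instead of spinning out
the full timeout. A sketch, with bit positions as per the VT-d
specification (read_fsts() is a stand-in):

    #include <stdint.h>
    #include <stdbool.h>

    #define FSTS_IQE (1u << 4)  /* Invalidation Queue Error */
    #define FSTS_ICE (1u << 5)  /* Invalidation Completion Error */
    #define FSTS_ITE (1u << 6)  /* Invalidation Time-out Error (ATS) */

    extern uint32_t read_fsts(void);  /* stand-in FSTS register read */

    /* True if the pending invalidation can no longer complete, i.e. the
     * wait loop should abort and propagate an error rather than spin. */
    static bool invalidation_faulted(void)
    {
        return (read_fsts() & (FSTS_IQE | FSTS_ICE | FSTS_ITE)) != 0;
    }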
>>
> In general, yes, a non-spinning model is better, but it requires
> non-trivial changes to make all IOMMU operations asynchronous. If ATS is
> not a concern, is it still worth changing, beyond auditing the existing
> usages?

Even if the invalidation completes at the IOMMU alone (no ATS device TLB
involved), the milliseconds spent spinning for completion would still be
better spent elsewhere, such as running VMs.

Do you have any numbers for typical completion times for invalidate
requests?
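
Independent of the numbers, for concreteness, a non-spinning model could
look something like the following: queue the invalidation, record a
completion token, and let a timer/softirq notice completion later instead
of burning the CPU. This is purely a sketch of one possible shape; every
name in it is hypothetical:

    #include <stdint.h>
    #include <stdbool.h>

    struct inval_req {
        volatile uint32_t status;  /* written by the hw wait descriptor */
        uint64_t deadline_ns;      /* e.g. 1 min + 50% slack for ATS */
        void (*on_done)(struct inval_req *, bool ok);  /* continuation */
    };

    extern void queue_wait_descriptor(volatile uint32_t *s, uint32_t v);
    extern uint64_t now_ns(void);

    /* Submit and return immediately -- no spinning on the submit path. */
    static void inval_submit(struct inval_req *req)
    {
        queue_wait_descriptor(&req->status, 1);
    }

    /* Run periodically from a timer/softirq rather than a busy loop;
     * crucially, the failure case is reported, not dropped on the floor. */
    static void inval_poll(struct inval_req *req)
    {
        if ( req->status == 1 )
            req->on_done(req, true);
        else if ( now_ns() > req->deadline_ns )
            req->on_done(req, false);
    }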

~Andrew
