Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device
Jan Beulich wrote on 2015-10-15:
>>>> On 15.10.15 at 03:03, <yang.z.zhang@xxxxxxxxx> wrote:
>> Jan Beulich wrote on 2015-10-14:
>>> As long as the multi-millisecond spins aren't going to go away by
>>> other means, I think conversion to async mode is ultimately unavoidable.
>>
>> I am not fully agreed. I think the time to spin is important. To me,
>> less than 1 ms is acceptable and if the hardware can guarantee it,
>> then sync mode also is ok.
>
> Okay, let me put the condition slightly differently - any spin on the
> order of what a WBINVD might take ought to be okay, provided both are

From the data we collected, the invalidation completes within several
microseconds. IMO, the time for WBINVD varies with the cache size and
the cache hierarchy, and in the worst case it may take more than
several microseconds.

> equally (in)accessible to guests. The whole discussion is really about
> limiting the impact misbehaving guests can have on the whole system.
> (Obviously any spin time reaching the order of a scheduling time slice
> is a problem.)

A misbehaving guest can only impact the system if the IOMMU is buggy
and takes a long time to complete the invalidation. In other words, if
all invalidations complete within several microseconds, why does the
spin time matter?

>
>> I remember the origin motivation to handle ATS problem is due to: 1.
>> ATS spec allow 60s timeout to completed the flush which Xen only
>> allows 1s, and 2. spin loop for 1s is not reasonable since it will
>> hurt the scheduler. For the former, as we discussed before, either
>> disable ATS support or only support some specific ATS
>> devices(complete the flush less than 10ms or 1ms) is acceptable. For
>> the latter, if spin loop for 1s is not acceptable, we can reduce the
>> timeout to 10ms or 1ms to eliminate the performance impaction.
>
> If we really can, why has it been chosen to be 1s in the first place?

What I can tell is that 1s is just the value the original author chose.
It has no special meaning. I have double-checked with our hardware
expert, and he suggests that we use a value as small as possible.
According to him, 10ms is sufficiently large.

>
>> Yes, I'd agree it would be best solution if Xen has the async mode.
>> But spin loop is used widely in iommu code: not only for
>> invalidations, lots of DMAR operations are using spin to sync
>> hardware's status. For those operations, it is hard to use async mode.
>> Or, even it is possible to use async mode, I don't see the benefit
>> considering the cost and complexity which means we either need a
>> timer or a softirq to do the check.
>
> Even if the cost is high, limited overall throughput by undue spinning
> is worth it imo even outside of misbehaving guest considerations. I'm
> surprised you're not getting similar pressure on this from the KVM
> folks (assuming the use of spinning is similar there).

Because no one has observed such an invalidation timeout issue so far.
What we have discussed is only theoretical. By the way, I have reported
the issue to the Linux IOMMU maintainer, but he didn't say anything
about it.

Best regards,
Yang

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
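To make the sync-mode proposal in this message concrete, here is a minimal sketch of a bounded spin-wait that gives up after 10ms instead of 1s. It is not Xen's VT-d code: `wait_flush_sync`, the two callback types, `FLUSH_TIMEOUT_NS`, and the simulated device (`struct sim_dev`, `sim_now`, `sim_done`) are all hypothetical names chosen for illustration, and the clock is simulated so the behaviour can be checked without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callbacks; these are NOT Xen's actual VT-d interfaces. */
typedef bool (*flush_done_fn)(void *ctx);    /* has the flush completed? */
typedef uint64_t (*now_ns_fn)(void *ctx);    /* current time in nanoseconds */

enum flush_result { FLUSH_OK = 0, FLUSH_TIMEOUT = -1 };

/* Assumed budget: the 10ms value suggested in the thread, not 1s. */
#define FLUSH_TIMEOUT_NS (10ULL * 1000 * 1000)

/* Spin until the flush completes or the time budget is exhausted. */
static enum flush_result wait_flush_sync(flush_done_fn done, now_ns_fn now,
                                         void *ctx, uint64_t timeout_ns)
{
    uint64_t start = now(ctx);

    for (;;) {
        if (done(ctx))
            return FLUSH_OK;
        if (now(ctx) - start >= timeout_ns)
            return FLUSH_TIMEOUT;   /* caller decides how to handle the device */
    }
}

/* Simulated device: every clock read advances time by tick_ns. */
struct sim_dev {
    uint64_t clock_ns;          /* simulated current time */
    uint64_t tick_ns;           /* time consumed per polling iteration */
    uint64_t completes_at_ns;   /* when the simulated flush finishes */
};

static uint64_t sim_now(void *ctx)
{
    struct sim_dev *d = ctx;
    d->clock_ns += d->tick_ns;
    return d->clock_ns;
}

static bool sim_done(void *ctx)
{
    struct sim_dev *d = ctx;
    return d->clock_ns >= d->completes_at_ns;
}
```

With a simulated device that completes after a few microseconds, `wait_flush_sync(sim_done, sim_now, &dev, FLUSH_TIMEOUT_NS)` returns `FLUSH_OK`; with one that takes the 60s the ATS spec permits, it returns `FLUSH_TIMEOUT` once the 10ms budget is spent. What to do with a device that times out (for example, refusing ATS for it) is the policy question discussed above and is left to the caller in this sketch.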