Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device
>>> On 15.10.15 at 10:52, <yang.z.zhang@xxxxxxxxx> wrote:
> Jan Beulich wrote on 2015-10-15:
>>>>> On 15.10.15 at 09:28, <yang.z.zhang@xxxxxxxxx> wrote:
>>> The premise for a misbehaving guest to impact the system is that the
>>> IOMMU is buggy which takes long time to complete the invalidation.
>>> In other words, if all invalidations are able to complete within
>>> several us, what's the matter to do with the spin time?
>>
>> The risk of exploits of such poorly behaving IOMMUs. I.e. if properly
>
> But this is not a software flaw. A guest has no way to know the underlying
> IOMMU is wrong and it cannot exploit it.

A guest doesn't need to know what IOMMU is there in order to try
some exploit. Plus - based other information it may be able to make
an educated guess.

>> operating IOMMUs only require several us, why spin for several ms?
>
> 10ms is just my suggestion. I don't know whether future hardware will need
> more time to complete the invalidation. So I think we need to have a large
> enough timeout here. Meanwhile, doesn't impact the scheduling.

It does, as explained further down in my previous reply.

>>>>> I remember the origin motivation to handle ATS problem is due to: 1.
>>>>> ATS spec allow 60s timeout to completed the flush which Xen only
>>>>> allows 1s, and 2. spin loop for 1s is not reasonable since it will
>>>>> hurt the scheduler. For the former, as we discussed before, either
>>>>> disable ATS support or only support some specific ATS
>>>>> devices(complete the flush less than 10ms or 1ms) is acceptable.
>>>>> For the latter, if spin loop for 1s is not acceptable, we can
>>>>> reduce the timeout to 10ms or 1ms
>>>> to eliminate the performance impaction.
>>>>
>>>> If we really can, why has it been chosen to be 1s in the first place?
>>>
>>> What I can tell is 1s is just the value the original author chooses.
>>> It has no special means. I have double check with our hardware
>>> expert and he suggests us to use the value as small as possible.
>>> According his comment, 10ms is sufficiently large.
>>
>> So here you talk about milliseconds again, while above you talked
>> about microsecond. Can we at least settle on an order of what is
>> required? 10ms is 10 times the minimum time slice credit1 allows,
>> i.e. awfully long.
>
> We can use an appropriate value which you think reasonable which can cover
> most of invalidation cases. For left cases, the vcpu can yield the CPU to
> others until a timer fired. In callback function, hypervisor can check
> whether the invalidation is completed. If yes, schedule in the vcpu.
> Otherwise, kill the guest due to unpredictable invalidation timeout.

Using a timer implies you again think about pausing the vCPU until
the invalidation completes. Which, as discussed before, has its own
problems and, even worse, won't cover the domain's other vCPU-s
or devices still possibly doing work involving the use of the being
invalidated entries. Or did you have something else in mind?

IOW - as soon as spinning time reaches the order of the scheduler
time slice, I think the only sane model is async operation with
proper refcounting.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
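
The "async operation with proper refcounting" model mentioned at the end of the message can be illustrated with a toy sketch. This is not Xen code and not the patch series' design: the names Page, Flush, FlushQueue and spin_budget_us are hypothetical, time is simulated in microseconds, and the only point shown is that a page whose invalidation exceeds the spin budget stays referenced until the flush is observed to complete, so it cannot be freed or reused while the device may still hold a stale translation.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical page whose device mapping is being invalidated."""
    refcount: int = 1      # the owner's reference
    freed: bool = False

    def get(self):
        self.refcount += 1

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True


@dataclass
class Flush:
    """One in-flight invalidation that completes at time done_at_us."""
    page: Page
    done_at_us: int


class FlushQueue:
    def __init__(self, spin_budget_us):
        # Assumed upper bound on how long the caller may spin (simulated).
        self.spin_budget_us = spin_budget_us
        self.pending = []

    def invalidate(self, page, now_us, latency_us):
        """Return "sync" if the flush fits the spin budget, else "async"."""
        if latency_us <= self.spin_budget_us:
            return "sync"
        # Too slow to spin on: pin the page and complete the flush later.
        page.get()
        self.pending.append(Flush(page, now_us + latency_us))
        return "async"

    def poll(self, now_us):
        """Drop the extra reference of every flush that has completed."""
        still_pending = []
        for flush in self.pending:
            if flush.done_at_us <= now_us:
                flush.page.put()
            else:
                still_pending.append(flush)
        self.pending = still_pending
```

In this model the caller never waits longer than the spin budget, and no vCPU is paused: the owner can drop its own reference right away, while the page itself is only released by the poll() call that observes completion.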