Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device
Jan Beulich wrote on 2015-10-12:
>>>> On 12.10.15 at 03:42, <yang.z.zhang@xxxxxxxxx> wrote:
>> According the discussion and suggestion you made in past several
>> weeks, obviously, it is not an easy task. So I am wondering whether
>> it is worth to do it since:
>> 1. ATS device is not popular. I only know one NIC from Myricom has
>> ATS capabilities.
>> 2. The issue is only in theory. Linux, Windows, VMware are all using
>> spin now as well as Xen, but none of them observed any problem so far.
>> 3. I know there is propose to modify the timeout value(maybe less in
>> 1 ms) in ATS spec to mitigate the problem. But the risk is how long
>> to achieve it.
>> 4. The most important concern is it is too complicated to fix it in
>> Xen since it needs to modify the core memory part. And I don't think
>> Quan and i have the enough knowledge to do it perfectly currently.
>> It may take long time, half of year or one year?(We have spent three
>> months so far). Yes, if Tim likes to take it. It will be much fast.
>> :)
>>
>> So, my suggestion is that we can rely on user to not assign the ATS
>> device if hypervisor says it cannot support such device. For
>> example, if hypervisor find the invalidation isn't completed in 1
>> second, then hypervisor can crash itself and tell the user this ATS
>> device needs more than 1 second invalidation time which is not
>> support by Xen.
>
> Crashing the hypervisor in such a case is a security issue, i.e. is not

Indeed. Crashing the guest is more reasonable.

> an acceptable thing (and the fact that we panic() on timeout expiry
> right now isn't really acceptable either). If crashing the offending
> guest was sufficient to contain the issue, that might be an option. Else

I think it should be sufficient (any concern from you?). Hypervisor can
crash the guest with hint that the device may need long time to complete
the invalidation or device maybe bad. And user should add the device to
a blacklist to disallow assignment again.

> ripping out ATS support (and limiting spin time below what there is
> currently) may be the only alternative to fixing it.

Yes, it is another solution considering ATS device is rare currently.
For spin time, 10ms should be enough in both two solutions. But if
solution 1 is acceptable, I prefer it since most of ATS devices are
still able to play with Xen.

Best regards,
Yang

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel