
Re: [Xen-devel] [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error.

>>> On 24.03.16 at 10:02, <quan.xu@xxxxxxxxx> wrote:
> On March 18, 2016 5:49pm, <JBeulich@xxxxxxxx> wrote:
>> That was taking only the flush timeout as an error source into account.
>> Now that we see that the lack of error handling pre-exists, we can't just
>> extend that intended model to also cover those other error reasons without
>> at least having given people a chance to object.
> For this abstract model, I assume we are on the same page about the
> precondition: if a Device-TLB flush times out, we hide the target ATS device
> and crash the domain owning this ATS device.
> If the impacted domain is the hardware domain, we just print a warning.
> Then IMO,
>    1. Try our best to return an error code.
>    2. Log the error and don't return an error value for hardware_domain init
>       or crashed-system shutdown.
>    3. For iommu_{,un}map_page(), we'd better handle it as a normal error, as
>       the error does not come only from the IOMMU flush, e.g. '-ENOMEM'.
>       So we need to {,un}map from the IOMMU, return an error, and roll back
>       the failed operation (e.g. unmap EPT).

Well, if that's possible in a provably correct way, then sure. But be
clear - when the failure occurs while unmapping, unmapping the
EPT entry obviously can't be the solution, you'd need a true
roll back. And of course you should keep in mind what happens to
the guest if such an operation fails: If you can be certain it'll crash
because of this later on anyway, you're likely better off crashing
it right away (such that the reason for the crash is at least obvious).
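
To illustrate the distinction, here is a toy sketch in C. It is not Xen
code: struct toy_domain, toy_map_page(), toy_unmap_page() and the
fault-injection fields are all hypothetical names made up for this example.
It only shows that a failed map can be undone by restoring the previous EPT
entry, while a failed unmap needs the old entry put back (a true roll back),
and that where this is not possible the domain is marked crashed right away.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_GFNS     8
#define INVALID_MFN (~0UL)

/* Hypothetical stand-in for a domain; not Xen's struct domain. */
struct toy_domain {
    unsigned long ept[NR_GFNS];    /* CPU-side (EPT-like) mappings */
    unsigned long iommu[NR_GFNS];  /* IOMMU-side mappings */
    bool crashed;                  /* set when we give up on the domain */
    int fail_iommu_map;            /* errno to inject on IOMMU map, 0 = none */
    int fail_iommu_unmap;          /* errno to inject on IOMMU unmap, 0 = none */
};

static void toy_domain_init(struct toy_domain *d)
{
    for (size_t i = 0; i < NR_GFNS; i++)
        d->ept[i] = d->iommu[i] = INVALID_MFN;
    d->crashed = false;
    d->fail_iommu_map = d->fail_iommu_unmap = 0;
}

/* Simulated IOMMU update; fails with the injected error, if any. */
static int toy_iommu_map(struct toy_domain *d, unsigned long gfn,
                         unsigned long mfn)
{
    if (d->fail_iommu_map)
        return -d->fail_iommu_map;
    d->iommu[gfn] = mfn;
    return 0;
}

static int toy_iommu_unmap(struct toy_domain *d, unsigned long gfn)
{
    if (d->fail_iommu_unmap)
        return -d->fail_iommu_unmap;
    d->iommu[gfn] = INVALID_MFN;
    return 0;
}

/* Map: on IOMMU failure, restore the previous EPT entry and return the error. */
static int toy_map_page(struct toy_domain *d, unsigned long gfn,
                        unsigned long mfn)
{
    unsigned long old;
    int rc;

    assert(gfn < NR_GFNS);
    old = d->ept[gfn];
    d->ept[gfn] = mfn;

    rc = toy_iommu_map(d, gfn, mfn);
    if (rc)
        d->ept[gfn] = old;         /* roll back the CPU-side change */
    return rc;
}

/*
 * Unmap: clearing the EPT entry again cannot undo a failed unmap. Either the
 * old entry is put back, or the domain is crashed immediately so the reason
 * for the crash is obvious.
 */
static int toy_unmap_page(struct toy_domain *d, unsigned long gfn,
                          bool can_roll_back)
{
    unsigned long old;
    int rc;

    assert(gfn < NR_GFNS);
    old = d->ept[gfn];
    d->ept[gfn] = INVALID_MFN;

    rc = toy_iommu_unmap(d, gfn);
    if (!rc)
        return 0;

    if (can_roll_back)
        d->ept[gfn] = old;         /* true roll back: mapping is as before */
    else
        d->crashed = true;         /* crash right away, not later on */
    return rc;
}
```

In this sketch the caller decides via can_roll_back whether restoring the
old entry is safe. In a real hypervisor that decision would depend on the
call site, which is what would need analysing case by case.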


>    4. For the rest, we may return an error but not roll back the failed
>       operation, and we need to analyze the different conditions.
> Quan
