
Re: [Xen-devel] [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error.



On March 18, 2016 5:49pm, <JBeulich@xxxxxxxx> wrote:
> >>> On 18.03.16 at 10:38, <dario.faggioli@xxxxxxxxxx> wrote:
> > On Fri, 2016-03-18 at 03:29 -0600, Jan Beulich wrote:
> >> >
> >> Not sure what exactly you're asking for: As said, we first need to
> >> settle on an abstract model. Do we want IOMMU mapping failures to be
> >> fatal to the domain (perhaps with the exception of the hardware one)?
> >> I think we do, and for the hardware domain we'd do things on a best
> >> effort basis (always erring on the side of unmapping). Which would
> >> probably mean crashing the domain could be centralized in
> >> iommu_{,un}map_page(). How much roll back would then still be needed
> >> in callers of these functions for the hardware domain's sake would
> >> need to be seen.
> >>
> >> So before you start coding, give others (namely but not limited to
> >> VT-d, AMD IOMMU, other x86, and x86/mm maintainers) a chance to voice
> >> differing opinions.
> >>
> > FWIW, the behavior Jan described
> > (crashing the domain for all domains but the hardware domain) was
> > indeed the intended plan for this series, as far as I understood from
> > talking to people and looking at previous email conversations and
> > submissions.
> 
> That was taking only the flush timeout as an error source into account.
> Now that we see that the lack of error handling pre-exists, we can't just
> extend that intended model to also cover those other error reasons without
> at least having given people a chance to object.
> 

For this abstract model, I assume we agree on the precondition:
if a Device-TLB flush times out, we hide the target ATS device and crash
the domain that owns this ATS device.
If the impacted domain is the hardware domain, we only log a warning.
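
To make the precondition concrete, here is a minimal sketch of it in C. It is
not Xen code: struct domain, struct ats_device and
handle_dev_tlb_flush_timeout() are hypothetical stand-ins invented for this
illustration. It also assumes one reading of the hardware-domain case, namely
that the warning replaces both the hiding and the crash; the text above does
not say whether the device would still be hidden.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not Xen's real structures. */
struct domain {
    int  id;
    bool is_hardware;   /* models "this is the hardware domain" */
    bool crashed;       /* models the effect of crashing the domain */
};

struct ats_device {
    struct domain *owner;
    bool hidden;        /* models hiding the device from further use */
};

/*
 * Hypothetical handler invoked when a Device-TLB flush timed out.
 * Always reports the timeout to the caller as -ETIMEDOUT.
 */
static int handle_dev_tlb_flush_timeout(struct ats_device *dev)
{
    assert(dev != NULL && dev->owner != NULL);

    if (dev->owner->is_hardware) {
        /* Assumed reading: hardware domain only gets a warning. */
        fprintf(stderr, "warning: Device-TLB flush timed out (d%d)\n",
                dev->owner->id);
        return -ETIMEDOUT;
    }

    /* Any other domain: hide the ATS device and crash its owner. */
    dev->hidden = true;
    dev->owner->crashed = true;
    return -ETIMEDOUT;
}
```

In this sketch the caller still sees an error in both cases, so whether to
propagate it further is left to the caller.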

Then, IMO:
   1. Do our best to return an error code.
   2. Log the error, but don't return an error value, during hardware_domain
init or during shutdown of a crashed system.
   3. For iommu_{,un}map_page(), we'd better treat a failure as a normal
error, as the error does not only come from the IOMMU flush, e.g. '-ENOMEM'.
     So we need to {,un}map from the IOMMU, return an error, and roll back the
failed operation (e.g. unmap EPT).
   4. For the rest, we may return an error but not roll back the failed
operation, and we need to analyze the different conditions.
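
Point 3 could be sketched roughly as below. This is a toy model only, under
stated assumptions: struct toy_domain, toy_iommu_map_page() and
toy_set_entry() are hypothetical names made up for this mail, and the two
boolean arrays merely stand in for the EPT and the IOMMU page tables. The
iommu_err field is a test hook to inject a failure such as -ENOMEM.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 8

/* Hypothetical toy state; not Xen's real data structures. */
struct toy_domain {
    bool ept_mapped[NR_PAGES];    /* stands in for EPT entries */
    bool iommu_mapped[NR_PAGES];  /* stands in for IOMMU entries */
    int  iommu_err;               /* injected failure, 0 for success */
};

/* Hypothetical stand-in for an IOMMU map that can fail for any reason. */
static int toy_iommu_map_page(struct toy_domain *d, unsigned int gfn)
{
    if (d->iommu_err)
        return d->iommu_err;      /* e.g. -ENOMEM, not only a flush error */
    d->iommu_mapped[gfn] = true;
    return 0;
}

/* Map in the EPT first, then in the IOMMU; undo both on failure. */
static int toy_set_entry(struct toy_domain *d, unsigned int gfn)
{
    int rc;

    assert(d != NULL);
    if (gfn >= NR_PAGES)
        return -EINVAL;

    d->ept_mapped[gfn] = true;

    rc = toy_iommu_map_page(d, gfn);
    if (rc) {
        /* Err on the side of unmapping, then roll back the EPT change. */
        d->iommu_mapped[gfn] = false;
        d->ept_mapped[gfn] = false;
    }

    return rc;                    /* propagate the error to the caller */
}
```

The point of the sketch is only the ordering: the error is returned as a
normal error, and the caller's earlier change (the EPT entry here) is undone
so both tables stay consistent.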

Quan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

