[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain



> -----Original Message-----
> From: Jürgen Groß <jgross@xxxxxxxx>
> Sent: 10 December 2019 08:12
> To: Jan Beulich <jbeulich@xxxxxxxx>; Tian, Kevin <kevin.tian@xxxxxxxxx>;
> Durrant, Paul <pdurrant@xxxxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; xen-
> devel@xxxxxxxxxxxxxxxxxxxx; Roger Pau Monné <roger.pau@xxxxxxxxxx>; Wei
> Liu <wl@xxxxxxx>
> Subject: Re: [PATCH v2] x86 / iommu: set up a scratch page in the
> quarantine domain
> 
> On 10.12.19 09:05, Jan Beulich wrote:
> > On 10.12.2019 08:16, Tian, Kevin wrote:
> >>> From: Jan Beulich <jbeulich@xxxxxxxx>
> >>> Sent: Tuesday, December 3, 2019 5:36 PM
> >>>
> >>> On 28.11.2019 12:32, Jürgen Groß wrote:
> >>>> On 28.11.19 12:17, Jan Beulich wrote:
> >>>>> On 27.11.2019 18:11, Paul Durrant wrote:
> >>>>>> This patch introduces a new iommu_op to facilitate a per-
> >>> implementation
> >>>>>> quarantine set up, and then further code for x86 implementations
> >>>>>> (amd and vtd) to set up a read-only scratch page to serve as the
> source
> >>>>>> for DMA reads whilst a device is assigned to dom_io. DMA writes
> will
> >>>>>> continue to fault as before.
> >>>>>>
> >>>>>> The reason for doing this is that some hardware may continue to re-
> try
> >>>>>> DMA (despite FLR) in the event of an error, or even BME being
> cleared,
> >>> and
> >>>>>> will fail to deal with DMA read faults gracefully. Having a scratch
> page
> >>>>>> mapped will allow pending DMA reads to complete and thus such buggy
> >>>>>> hardware will eventually be quiesced.
> >>>>>>
> >>>>>> NOTE: These modifications are restricted to x86 implementations
> only as
> >>>>>>         the buggy h/w I am aware of is only used with Xen in an x86
> >>>>>>         environment. ARM may require similar code but, since I am
> not
> >>>>>>         aware of the need, this patch does not modify any ARM
> >>> implementation.
> >>>>>>
> >>>>>> Signed-off-by: Paul Durrant <pdurrant@xxxxxxxxxx>
> >>>>>
> >>>>> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
> >>>>>
> >>>>>> There is still the open question of whether use of a scratch page
> ought
> >>>>>> to be gated on something, either are run-time or compile-time.
> >>>>>
> >>>>> I have no clear opinion either way here. The workaround seems low
> >>>>> overhead enough that there may not be a need to have an admin (or
> >>>>> build time) control for this.
> >>>>>
> >>>>> As to 4.13: The quarantining as a whole is pretty fresh. While it
> >>>>> has been backported to security maintained trees, I'd still consider
> >>>>> it a new feature in 4.13, and hence this workaround at least
> eligible
> >>>>> for consideration.
> >>>>
> >>>> I agree.
> >>>>
> >>>> Release-acked-by: Juergen Gross <jgross@xxxxxxxx>
> >>>
> >>> I notice this has been committed meanwhile. I had specifically not
> >>> done so due to the still missing VT-d ack, seeing that this wasn't
> >>> an entirely "trivial" change.
> >>>
> >>
> >> While the quarantine idea sounds good overall, I'm still not convinced
> >> to have it the only way in place just for handling some known-buggy
> >> device. It kills the possibility of identifying a new buggy device and
> then
> >> deciding not to use it in the first space... I thought about whether it
> >> will get better when future IOMMU implements A/D bit - by checking
> >> access bit being set then we'll know some buggy device exists, but,
> >> the scratch page is shared by all devices then we cannot rely on this
> >> feature to find out the actual buggy one.
> >
> > Thinking about it - yes, I think I agree. This (as with so many
> > workarounds) would better be an off-by-default one. The main issue
> > I understand this would have is that buggy systems then might hang
> > without even having managed to get a log message out - Paul?
> >

Yes, that's the concern. Some systems will wedge hard on the first fault with 
no logging. I have no objection to a command line parameter but IMO it ought to 
default on, at least in non-debug mode.

  Paul


> > Jürgen - would you be amenable to an almost last minute refinement
> > here (would then also need to still be backported to 4.12.2, or
> > the original backport reverted, to avoid giving the impression of
> > a regression)?
> 
> So what is your suggestion here? To have a boot option (defaulting to
> off) for enabling the scratch page?
> 
> 
> Juergen
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.