
Re: [Xen-devel] [PATCH] iommu: leave IOMMU enabled by default during kexec crash transition



>>> On 19.02.19 at 13:31, <igor.druzhinin@xxxxxxxxxx> wrote:
> On 19/02/2019 07:43, Jan Beulich wrote:
>>>>> On 18.02.19 at 19:30, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 18/02/2019 16:21, Igor Druzhinin wrote:
>>>> It's unsafe to disable the IOMMU on a live system, which is the case
>>>> if we're crashing: remapping hardware doesn't usually know what to do
>>>> with ongoing bus transactions and frequently raises NMI/MCE/SMI, etc.
>>>> (depending on the firmware configuration) to signal these abnormalities.
>>>> This, in turn, doesn't play well with the kexec transition process, as
>>>> there is currently no handling available for this kind of event,
>>>> resulting in failures to enter the kernel.
>>>>
>>>> Modern Linux kernels are taught to copy all the necessary DMAR/IR
>>>> tables across kexec from the previous kernel (Xen in our case) - so
>>>> it's now normal to keep the IOMMU enabled. Existing users would only
>>>> need to change the crash kernel command line to enable the IOMMU
>>>> drivers.
>> 
>> Is this the normal option ("intel_iommu=on" in the Intel case), or
>> rather something special? Considering that you explicitly talk about
>> Linux here anyway, mentioning the option(s) explicitly would seem
>> to make sense.
>> 
> 
> Yes, it's intel_iommu / amd_iommu (which is enabled by default now).

Not as far as I can tell, at least in the Intel case: there's still
CONFIG_INTEL_IOMMU_DEFAULT_ON. (As an aside, while unlikely, it's
also still possible to build kernels without IOMMU support in the first
place.)

> I'll give an example in the commit message.

Thanks.
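For reference, the additions would presumably look something like the
following (illustrative only - option names as of current mainline Linux;
check Documentation/admin-guide/kernel-parameters.txt for the kernel
version actually in use):

```shell
# Crash kernel command line additions to keep the IOMMU drivers active
# after kexec (illustrative sketch, not taken from the patch itself):
#
# Intel VT-d: enable DMA remapping explicitly, since
# CONFIG_INTEL_IOMMU_DEFAULT_ON may not be set in the kernel config:
#     intel_iommu=on
#
# AMD-Vi: the driver is enabled by default when built in; to request it
# explicitly:
#     amd_iommu=on
```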

>>>> An option is left for compatibility with ancient crash kernels which
>>>> didn't like to have the IOMMU active under their feet on boot.
>>>>
>>>> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
>>>
>>> To provide a bit of extra background, it turns out that in hindsight,
>>> turning off the IOMMU in a crash usually makes things worse rather than
>>> better.
>> 
>> For an unknown definition of "usually". Corrupted (IOMMU) page
>> tables are not really an impossible crash reason.
>> 
> 
> Well, that still falls under the definition of "usually". Having IOMMU
> page tables corrupted, given proper isolation of devices, sounds unlikely.

Everything, absolutely everything is possible as a cause for a crash.
I don't see why device isolation would matter here at all. Page table
corruption (be it IOMMU or CPU one) can be caused by
malfunctioning kernel code, entirely unrelated to what devices do.

>>> In particular, any guest with a PCI device which happens to allocate a
>>> DMA buffer in GFN space which matches the crash region in MFN space will
>>> end up corrupting the crash kernel when DMA remapping gets turned off.
>> 
>> Indeed, but that's only PVH Dom0 (unsupported as of yet) or PV
>> Dom0 using PV IOMMU functionality (not even in tree as of yet). So
>> the question is how applicable this change really is at this point in
>> time. I notice it hasn't been tagged for 4.12, so please don't take
>> this as objection to it going in - I'm only trying to better understand
>> the implications.
>> 
> 
> I think the more prevalent case I wanted to cover here is the one
> described in the commit message: if there are ongoing DMAs anywhere
> in the system while the IOMMU is enabled, the hardware will raise an
> event (that we're unable to handle) as soon as you turn IOMMU
> translation off. This happens commonly enough now that we cannot
> simply tolerate it happening sporadically.

Well, as validly pointed out by Andrew, I did wrongly limit my thinking
to Dom0 only here. So please disregard this part of my reply.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

