
Re: [Xen-devel] Woes of NMIs and MCEs, and possibly how to fix



>>> On 30.11.12 at 18:34, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
> 1) Faults on the NMI path will re-enable NMIs before the handler
> returns, leading to reentrant behaviour.  We should audit the NMI path
> to try and remove any needless cases which might fault, but getting a
> fault-free path will be hard (and is not going to solve the reentrant
> behaviour itself).
> 
> 2) Faults on the MCE path will re-enable NMIs, as will the iret of the
> MCE itself if an MCE interrupts an NMI.

As the thread apparently converged on later - we just need to
exclude the potential for faults inside the NMI and MCE paths. The
only reason I could see this needing to change would be if we
intended to add an extensive NMI producer like native Linux's perf
subsystem.

> 3) SMM mode executing an iret will re-enable NMIs.  There is nothing we
> can do to prevent this, and as an SMI can interrupt NMIs and MCEs, no
> way to predict if/when it may happen.  The best we can do is accept that
> it might happen, and try to deal with the after effects.

I don't see us needing to deal with that in any way. SMM using IRET
carelessly is just plain wrong. IIRC SMM (just like VMEXIT) has a save/
restore field for the NMI mask, so if the SMM code makes proper use
of it, there should be no problem.

> 4) "Fake NMIs" can be caused by hardware with access to the INTR pin
> (very unlikely in modern systems with the LAPIC supporting virtual wire
> mode), or by software executing an `int $0x2`.  This can cause the NMI
> handler to run on the NMI stack, but without the normal hardware NMI
> cessation logic being triggered.
> 
> 5) "Fake MCEs" can be caused by software executing `int $0x18`, and by
> any MSI/IOMMU/IOAPIC programmed to deliver vector 0x18.  Normally, this
> could only be caused by a bug in Xen, although it is also possible on a
> system without interrupt remapping. (Where the host administrator has
> accepted the documented security issue, and decided still to pass through
> a device to a trusted VM, and the VM in question has a buggy driver for
> the passed-through hardware)

Fake exceptions, as others have already said, are a Xen or
hardware bug and hence shouldn't need extra precautions either.

> 9) The NMI handler when returning to ring3 will leave NMIs latched, as
> it uses the sysret path.

This is a little imprecise: the problem arises only when entering the
scheduler on the way out of an NMI, and resuming an unaware
PV vCPU on the given pCPU. Apart from forcing an early IRET in
that case (we can't be on the special NMI stack then, as the
NMI entry path switches to the normal stack when entered
from PV guest context, entry from VMX context happens on the
normal stack anyway, and entry from hypervisor context [which
includes the SVM case] doesn't end up handling softirqs on the
exit path), another option would be to clear the TRAP_syscall
flag when resuming a PV vCPU in the scheduler.

But the early IRET solution has other benefits (keeping the NMI
disabled window short), so would be preferable imo.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel