[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] x86/boot: Make alternative patching NMI-safe
On 05/02/18 14:09, Jan Beulich wrote: >>>> On 05.02.18 at 11:24, <andrew.cooper3@xxxxxxxxxx> wrote: >> During patching, there is a very slim risk that an NMI or MCE interrupt in >> the >> middle of altering the code in the NMI/MCE paths, in which case bad things >> will happen. >> >> The NMI risk can be eliminated by running the patching loop in NMI context, >> at >> which point the CPU will defer further NMIs until patching is complete. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > So you continue to think that the risk of hitting an #MC here is > acceptable, despite there being a solution to the problem. To be > honest, I find this a little strange. (I do agree that there's no > good solution to the similar live patching problem.) The risk is already sufficiently tiny that in 3 years, it hasn't been observed, nor do I think it is likely to be observed in the future. At this point on boot, there is nothing interesting set up, which further reduces the risk of an MCE. Furthermore, whether or not Xen survives the MCE (and don't believe I've ever seen Xen successfully recover from an MCE), the hardware is faulty and needs replacing (modulo cosmic rays, but the chances of those really are astronomical). Irrespective of that, there is no way I'm aware of for generating MCEs on demand, and therefore, no way of testing the logic. For that reason alone, I don't think it is wise to be taking complicated invasive logic. > >> @@ -232,13 +254,14 @@ void __init alternative_instructions(void) >> */ >> ASSERT(!local_irq_is_enabled()); >> >> - /* Disable WP to allow application of alternatives to read-only pages. >> */ >> - write_cr0(cr0 & ~X86_CR0_WP); >> - >> - apply_alternatives(__alt_instructions, __alt_instructions_end); >> + /* >> + * As soon as the callback is set up, the next NMI will trigger >> patching, >> + * even an NMI ahead of our explicit self-NMI. >> + */ >> + saved_nmi_callback = set_nmi_callback(nmi_apply_alternatives); >> >> - /* Reinstate WP. */ >> - write_cr0(cr0); >> + /* Send ourselves an NMI to trigger the callback. */ >> + self_nmi(); >> >> set_nmi_callback(saved_nmi_callback); >> } > Hmm, you restore the old callback and return without having waited > for the patching to actually occur? Or are you implying that the > delivery of the NMI is guaranteed to be fully synchronous on all > hardware? I'm not aware of the LAPIC spec saying anything like this. I hadn't even considered that. In practice, NMIs always appear synchronously following the ICR write. I expect this probably wasn't the case for off-chip APICs. In this case, I can move the once boolean to being at file scope and check for that. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |