Xen project Mailing List

Re: [Xen-devel] Interrupt issues with hvm_emulate_one_vm_event()

To: "Razvan Cojocaru" <rcojocaru@xxxxxxxxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Mon, 29 May 2017 05:05:42 -0600

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Tamas K Lengyel <tamas@xxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Mon, 29 May 2017 11:06:11 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

>>> On 29.05.17 at 11:20, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> On 05/26/17 18:11, Andrew Cooper wrote:
>> On 26/05/17 15:29, Jan Beulich wrote:
>>>>>> On 25.05.17 at 11:40, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>> I've noticed that, with pages marked NX and vm_event emulation, we can
>>>> end up emulating an ud2, for which hvm_emulate_one() returns
>>>> X86EMUL_EXCEPTION in hvm_emulate_one_vm_event().
>>> Could you explain what would lead to emulation of UD2?
>>>
>>>> This, in turn, causes a hvm_inject_event() call in the context of
>>>> hvm_do_resume(), which can, if there's already a pending event there,
>>>> cause a 101 BSOD (timer-related, if I understand correctly) or loss of
>>>> input (mouse frozen, keyboard unresponsive).
>>>>
>>>> After much trial and error, I've been able to confirm this by leaving a
>>>> guest on for almost a full day with this change:
>>>>
>>>>      case X86EMUL_EXCEPTION:
>>>> -        hvm_inject_event(&ctx.ctxt.event);
>>>> +        if ( !hvm_event_pending(current) )
>>>> +            hvm_inject_event(&ctx.ctxt.event);
>>>>
>>>> and checking that there's been no BSOD or loss of input.
>>>>
>>>> However, just losing the event here, while fine to prove that this is
>>>> indeed the problem, is not OK. But I'm not sure what an elegant / robust
>>>> way of fixing this is.
>>> Much depends on what the other event is: If it's an interrupt, I'd
>>> assume there to be an ordering problem (interrupts shouldn't be
>>> injected when there is a pending exception, their delivery instead
>>> should be attempted on the first instruction of the exception
>>> handler [if interrupts remain on] or whenever interrupts get
>>> re-enabled).
>>
>> I suspect it is an ordering issue, and something has processed an
>> interrupt before the emulation occurs as part of the vm_event reply.
>>
>> The interrupt ordering spec indicates that external interrupts take
>> precedence over faults raised from executing an instruction, on the basis
>> that once the interrupt handler returns, the instruction will generate
>> the same fault again.  However, it's not obvious how this is intended to
>> interact with interrupt windows and vmexits.  I expect we can get away
>> with ensuring that external interrupts are the final thing considered
>> for injection on the return-to-guest path.
>>
>> It might be an idea to leave an assert in vmx_inject_event() that an
>> event is not already pending, but in the short term, this probably also
>> wants debugging by trying to identify what sequence of actions is
>> leading us to inject two events in this case (if indeed this is what is
>> happening).
>
> With some patience, I've been able to catch the problem: "(XEN)
> vmx_inject_event(3, 14) but 0, 225 pending".
>
> /*
>  * x86 event types. This enumeration is valid for:
>  *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
>  *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
>  */
> enum x86_event_type {
>     X86_EVENTTYPE_EXT_INTR,         /* External interrupt */
>     X86_EVENTTYPE_NMI = 2,          /* NMI */
>     X86_EVENTTYPE_HW_EXCEPTION,     /* Hardware exception */
>     X86_EVENTTYPE_SW_INTERRUPT,     /* Software interrupt (CD nn) */
>     X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
>     X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
> };
>
> so an X86_EVENTTYPE_EXT_INTR is pending when we're trying to inject an
> X86_EVENTTYPE_HW_EXCEPTION, as a result of the code quoted above.

So this confirms our suspicion, but doesn't move us closer to a
solution. The question after all is why an external interrupt is
being delivered prior to or while emulating.
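
For readers following along, the two number pairs in the debug message can be read against the quoted enumeration. The following is only an illustrative sketch: `event_type_name()` and `describe_event()` are hypothetical helpers written for this note, not Xen functions, and the assumption is that each pair in the message is (event type, vector), as the surrounding discussion implies.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copied from the enumeration quoted in the thread. */
enum x86_event_type {
    X86_EVENTTYPE_EXT_INTR,         /* External interrupt */
    X86_EVENTTYPE_NMI = 2,          /* NMI */
    X86_EVENTTYPE_HW_EXCEPTION,     /* Hardware exception */
    X86_EVENTTYPE_SW_INTERRUPT,     /* Software interrupt (CD nn) */
    X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
    X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
};

/* Hypothetical helper (not part of Xen): name an event type value. */
static const char *event_type_name(unsigned int type)
{
    switch ( type )
    {
    case X86_EVENTTYPE_EXT_INTR:         return "X86_EVENTTYPE_EXT_INTR";
    case X86_EVENTTYPE_NMI:              return "X86_EVENTTYPE_NMI";
    case X86_EVENTTYPE_HW_EXCEPTION:     return "X86_EVENTTYPE_HW_EXCEPTION";
    case X86_EVENTTYPE_SW_INTERRUPT:     return "X86_EVENTTYPE_SW_INTERRUPT";
    case X86_EVENTTYPE_PRI_SW_EXCEPTION: return "X86_EVENTTYPE_PRI_SW_EXCEPTION";
    case X86_EVENTTYPE_SW_EXCEPTION:     return "X86_EVENTTYPE_SW_EXCEPTION";
    default:                             return "unknown"; /* e.g. value 1 */
    }
}

/*
 * Hypothetical helper: render a (type, vector) pair from the debug
 * message as readable text. Returns what snprintf() returns.
 */
static int describe_event(char *buf, size_t len,
                          unsigned int type, unsigned int vector)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s, vector %u", event_type_name(type), vector);
}
```

Passing (3, 14) to `describe_event()` gives a hardware exception with vector 14, which on x86 is the page fault vector; passing (0, 225) gives an external interrupt with vector 225. That matches the reading in the quoted message.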
As Andrew did explain, proper behavior would be to either check for
external interrupts and not enter emulation if one is pending, or
not check for external interrupts until the _next_ instruction
boundary. Correct architectural behavior will result either way;
the second variant merely must not continuously defer interrupts
(i.e. there need to be instruction boundaries at which hardware or
software does check for them).

I'm not that familiar with the sequence of steps when dealing with
emulation requests from an introspection agent, so I would hope
you could go through those code paths to see where external
interrupts are being checked for.

Or wait - isn't your problem that you invoke emulation out of
hvm_do_resume() (via hvm_vm_event_do_resume() ->
hvm_emulate_one_vm_event()), which happens after all other
"normal" processing of a VM exit? Perhaps emulation should be
skipped there if an event is already pending injection, as
emulation not having started means we still are on an instruction
boundary?

Jan
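
The suggestion of skipping emulation while an event is pending can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not Xen code: `struct toy_vcpu` and the `toy_*` functions are hypothetical stand-ins invented here, there is a single injection slot per vCPU, and what happens to a skipped emulation request afterwards is deliberately left outside the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for per-vCPU state; this is NOT Xen's struct vcpu. */
struct toy_vcpu {
    bool event_pending;   /* an event is already queued for the next VM entry */
    int  pending_type;    /* x86 event type of the queued event */
    int  pending_vector;  /* vector of the queued event */
};

/*
 * Toy injection into the single slot. Returns false instead of
 * overwriting an event that is already pending.
 */
static bool toy_inject_event(struct toy_vcpu *v, int type, int vector)
{
    if ( v->event_pending )
        return false;

    v->event_pending  = true;
    v->pending_type   = type;
    v->pending_vector = vector;
    return true;
}

/* Toy VM entry: the queued event is delivered and the slot is freed. */
static void toy_vm_entry(struct toy_vcpu *v)
{
    v->event_pending = false;
}

/*
 * Toy resume path. Emulation is skipped when an event is already
 * pending, on the reasoning that the vCPU is then still on an
 * instruction boundary. Returns true if emulation ran.
 * 'emulation_faults' models the emulator reporting an exception.
 */
static bool toy_resume_emulate(struct toy_vcpu *v, bool emulation_faults)
{
    if ( v->event_pending )
        return false;               /* skip: nothing has been emulated yet */

    if ( emulation_faults )
    {
        /* Type 3 / vector 14 mirrors the pair seen in the debug message. */
        bool ok = toy_inject_event(v, 3, 14);

        assert(ok);                 /* the slot was checked to be free above */
        (void)ok;
    }

    return true;
}
```

In this model, queuing an external interrupt (type 0, vector 225) and then calling `toy_resume_emulate()` leaves the interrupt untouched and reports that emulation was skipped; after `toy_vm_entry()` frees the slot, a second call emulates and queues the exception. Unlike the experimental patch quoted earlier, no exception is dropped, because emulation never started while the slot was occupied.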
