[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
Automated testing on Xen-4.3 testing tip found an interesting issue (XEN) *** DOUBLE FAULT *** (XEN) ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]---- (XEN) CPU: 3 (XEN) RIP: e008:[<ffff82c4c01003d0>] __bitmap_and+0/0x3f (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (XEN) rax: 0000000000000000 rbx: 0000000000000020 rcx: 0000000000000100 (XEN) rdx: ffff82c4c032dfc0 rsi: ffff83043f2c6068 rdi: ffff83043f2c6008 (XEN) rbp: ffff83043f2c6048 rsp: ffff83043f2c6000 r8: 0000000000000001 (XEN) r9: 0000000000000000 r10: ffff83043f2c76f0 r11: 0000000000000000 (XEN) r12: ffff83043f2c6008 r13: 7fffffffffffffff r14: ffff83043f2c6068 (XEN) r15: 000003343036797b cr0: 0000000080050033 cr4: 00000000000026f0 (XEN) cr3: 0000000403c40000 cr2: ffff83043f2c5ff8 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 (XEN) Valid stack range: ffff83043f2c6000-ffff83043f2c8000, sp=ffff83043f2c6000, tss.esp0=ffff83043f2c7fc0 (XEN) Xen stack overflow (dumping trace ffff83043f2c6000-ffff83043f2c8000): (XEN) ffff83043f2c6008: [<ffff82c4c01aa73d>] cpuidle_wakeup_mwait+0x2d/0xba (XEN) ffff83043f2c6058: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268 (XEN) ffff83043f2c60c8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40 (XEN) ffff83043f2c60d8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f (XEN) ffff83043f2c61a8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70 (XEN) ffff83043f2c6230: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) ffff83043f2c6268: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268 (XEN) ffff83043f2c62d8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40 (XEN) ffff83043f2c62e8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f (XEN) ffff83043f2c63b8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70 (XEN) ffff83043f2c6440: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) ffff83043f2c6478: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268 (XEN) ffff83043f2c6498: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) ffff83043f2c64e8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40 (XEN) ffff83043f2c64f8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f (XEN) ffff83043f2c6550: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65 (XEN) ffff83043f2c6568: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f (XEN) ffff83043f2c65c8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70 (XEN) ffff83043f2c6650: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) ffff83043f2c6688: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268 (XEN) ffff83043f2c66f8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40 (XEN) ffff83043f2c6708: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f (XEN) ffff83043f2c67d8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70 (XEN) ffff83043f2c6860: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) ffff83043f2c6898: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268 (XEN) ffff83043f2c6908: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40 (XEN) ffff83043f2c6918: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f (XEN) ffff83043f2c6938: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) ffff83043f2c6988: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268 (XEN) ffff83043f2c69e8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70 (XEN) ffff83043f2c6a70: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) ffff83043f2c6aa8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268 (XEN) ffff83043f2c6b18: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40 (XEN) ffff83043f2c6b28: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f (XEN) ffff83043f2c6bf8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70 (XEN) ffff83043f2c6c80: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) ffff83043f2c6cb8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268 (XEN) ffff83043f2c6d28: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40 (XEN) ffff83043f2c6d38: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f (XEN) ffff83043f2c6e08: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70 (XEN) ffff83043f2c6e90: [<ffff82c4c012a577>] _spin_unlock_irq+0x40/0x41 (XEN) ffff83043f2c6eb8: [<ffff82c4c01704d6>] do_IRQ+0x970/0xa4f (XEN) ffff83043f2c6ed8: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) ffff83043f2c6f28: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268 (XEN) ffff83043f2c6f88: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70 (XEN) ffff83043f2c7010: [<ffff82c4c0164f94>] unmap_domain_page+0x6/0x32d (XEN) ffff83043f2c7048: [<ffff82c4c01ef69d>] ept_next_level+0x9c/0xde (XEN) ffff83043f2c7078: [<ffff82c4c01f0ab3>] ept_get_entry+0xb3/0x239 (XEN) ffff83043f2c7108: [<ffff82c4c01e9497>] __get_gfn_type_access+0x12b/0x20e (XEN) ffff83043f2c7158: [<ffff82c4c01e9cc2>] get_page_from_gfn_p2m+0xc8/0x25d (XEN) ffff83043f2c71c8: [<ffff82c4c01f4660>] map_domain_gfn_3_levels+0x43/0x13a (XEN) ffff83043f2c7208: [<ffff82c4c01f4b6b>] guest_walk_tables_3_levels+0x414/0x489 (XEN) ffff83043f2c7288: [<ffff82c4c0223988>] hap_p2m_ga_to_gfn_3_levels+0x178/0x306 (XEN) ffff83043f2c7338: [<ffff82c4c0223b35>] hap_gva_to_gfn_3_levels+0x1f/0x2a (XEN) ffff83043f2c7348: [<ffff82c4c01ebc1e>] paging_gva_to_gfn+0xb6/0xcc (XEN) ffff83043f2c7398: [<ffff82c4c01bedf2>] __hvm_copy+0x57/0x36d (XEN) ffff83043f2c73c8: [<ffff82c4c01b6d34>] hvmemul_virtual_to_linear+0x102/0x153 (XEN) ffff83043f2c7408: [<ffff82c4c01c1538>] hvm_copy_from_guest_virt+0x15/0x17 (XEN) ffff83043f2c7418: [<ffff82c4c01b7cd3>] __hvmemul_read+0x12d/0x1c8 (XEN) ffff83043f2c7498: [<ffff82c4c01b7dc1>] hvmemul_read+0x12/0x14 (XEN) ffff83043f2c74a8: [<ffff82c4c01937e9>] read_ulong+0xe/0x10 (XEN) ffff83043f2c74b8: [<ffff82c4c0195924>] x86_emulate+0x169d/0x11309 (XEN) ffff83043f2c7558: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f (XEN) ffff83043f2c75c0: [<ffff82c4c012a100>] _spin_trylock_recursive+0x63/0x93 (XEN) ffff83043f2c75d8: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f (XEN) ffff83043f2c7618: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) ffff83043f2c7668: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268 (XEN) ffff83043f2c76c8: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde (XEN) ffff83043f2c7788: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde (XEN) ffff83043f2c77b8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239 (XEN) ffff83043f2c7848: [<ffff82c4c01775ef>] get_page+0x27/0xf2 (XEN) ffff83043f2c7898: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde (XEN) ffff83043f2c78c8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239 (XEN) ffff83043f2c7a98: [<ffff82c4c01b7f60>] hvm_emulate_one+0x127/0x1bf (XEN) ffff83043f2c7aa8: [<ffff82c4c01b6c1b>] hvmemul_get_seg_reg+0x49/0x60 (XEN) ffff83043f2c7ae8: [<ffff82c4c01c38c5>] handle_mmio+0x55/0x1f0 (XEN) ffff83043f2c7b38: [<ffff82c4c0108208>] do_event_channel_op+0/0x10cb (XEN) ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d (XEN) ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e (XEN) ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456 (XEN) ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba (XEN) ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7 (XEN) ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a (XEN) ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48 (XEN) ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839 (XEN) ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48 (XEN) ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85 (XEN) ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd (XEN) ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139 (XEN) ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7 (XEN) (XEN) **************************************** (XEN) Panic on CPU 3: (XEN) DOUBLE FAULT -- system shutdown (XEN) **************************************** (XEN) (XEN) Reboot in five seconds... The hpet interrupt handler runs with interrupts enabled, due to this the spin_unlock_irq() in: while ( desc->status & IRQ_PENDING ) { desc->status &= ~IRQ_PENDING; spin_unlock_irq(&desc->lock); tsc_in = tb_init_done ? get_cycles() : 0; action->handler(irq, action->dev_id, regs); TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles()); spin_lock_irq(&desc->lock); } in do_IRQ(). Clearly there are cases where the frequency of the HPET interrupt is faster than the time it takes to process handle_hpet_broadcast(), I presume in part because of the large amount of cpumask manipulation. One solution, presented in this patch is to disable interrupts while running the hpet event handler, but the patch is RFC because I dont really like it as it feels a little bit like a hack. handle_hpet_broadcast() is clearly too long as an interrupt handler, and perhaps effort should be made to reduce it if possible. Then again, interrupt handlers for this (and other Xen-consumed interrupts?) should probably be run with interrupts disabled, to prevent exactly these kinds of problems. Thoughts/comments? CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> CC: Keir Fraser <keir@xxxxxxx> CC: Jan Beulich <jbeulich@xxxxxxxx> CC: Tim Deegan <tim@xxxxxxx> --- xen/arch/x86/hpet.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c index 946d133..46e14a9 100644 --- a/xen/arch/x86/hpet.c +++ b/xen/arch/x86/hpet.c @@ -229,7 +229,9 @@ static void hpet_interrupt_handler(int irq, void *data, return; } + local_irq_disable(); ch->event_handler(ch); + local_irq_enable(); } static void hpet_msi_unmask(struct irq_desc *desc) -- 1.7.10.4 _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |