[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.



Automated testing on Xen-4.3 testing tip found an interesting issue

(XEN) *** DOUBLE FAULT ***
(XEN) ----[ Xen-4.3.0  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    3
(XEN) RIP:    e008:[<ffff82c4c01003d0>] __bitmap_and+0/0x3f
(XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: 0000000000000020   rcx: 0000000000000100
(XEN) rdx: ffff82c4c032dfc0   rsi: ffff83043f2c6068   rdi: ffff83043f2c6008
(XEN) rbp: ffff83043f2c6048   rsp: ffff83043f2c6000   r8:  0000000000000001
(XEN) r9:  0000000000000000   r10: ffff83043f2c76f0   r11: 0000000000000000
(XEN) r12: ffff83043f2c6008   r13: 7fffffffffffffff   r14: ffff83043f2c6068
(XEN) r15: 000003343036797b   cr0: 0000000080050033   cr4: 00000000000026f0
(XEN) cr3: 0000000403c40000   cr2: ffff83043f2c5ff8
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Valid stack range: ffff83043f2c6000-ffff83043f2c8000, 
sp=ffff83043f2c6000, tss.esp0=ffff83043f2c7fc0
(XEN) Xen stack overflow (dumping trace ffff83043f2c6000-ffff83043f2c8000):
(XEN)    ffff83043f2c6008: [<ffff82c4c01aa73d>] cpuidle_wakeup_mwait+0x2d/0xba
(XEN)    ffff83043f2c6058: [<ffff82c4c01a7a29>] 
handle_hpet_broadcast+0x1b0/0x268
(XEN)    ffff83043f2c60c8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN)    ffff83043f2c60d8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN)    ffff83043f2c61a8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN)    ffff83043f2c6230: [<ffff82c4c012a535>] 
_spin_unlock_irqrestore+0x40/0x42
(XEN)    ffff83043f2c6268: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN)    ffff83043f2c62d8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN)    ffff83043f2c62e8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN)    ffff83043f2c63b8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN)    ffff83043f2c6440: [<ffff82c4c012a535>] 
_spin_unlock_irqrestore+0x40/0x42
(XEN)    ffff83043f2c6478: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN)    ffff83043f2c6498: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN)    ffff83043f2c64e8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN)    ffff83043f2c64f8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN)    ffff83043f2c6550: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
(XEN)    ffff83043f2c6568: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN)    ffff83043f2c65c8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN)    ffff83043f2c6650: [<ffff82c4c012a535>] 
_spin_unlock_irqrestore+0x40/0x42
(XEN)    ffff83043f2c6688: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN)    ffff83043f2c66f8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN)    ffff83043f2c6708: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN)    ffff83043f2c67d8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN)    ffff83043f2c6860: [<ffff82c4c012a535>] 
_spin_unlock_irqrestore+0x40/0x42
(XEN)    ffff83043f2c6898: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN)    ffff83043f2c6908: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN)    ffff83043f2c6918: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN)    ffff83043f2c6938: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN)    ffff83043f2c6988: [<ffff82c4c01a7a29>] 
handle_hpet_broadcast+0x1b0/0x268
(XEN)    ffff83043f2c69e8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN)    ffff83043f2c6a70: [<ffff82c4c012a535>] 
_spin_unlock_irqrestore+0x40/0x42
(XEN)    ffff83043f2c6aa8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN)    ffff83043f2c6b18: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN)    ffff83043f2c6b28: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN)    ffff83043f2c6bf8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN)    ffff83043f2c6c80: [<ffff82c4c012a535>] 
_spin_unlock_irqrestore+0x40/0x42
(XEN)    ffff83043f2c6cb8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN)    ffff83043f2c6d28: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN)    ffff83043f2c6d38: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN)    ffff83043f2c6e08: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN)    ffff83043f2c6e90: [<ffff82c4c012a577>] _spin_unlock_irq+0x40/0x41
(XEN)    ffff83043f2c6eb8: [<ffff82c4c01704d6>] do_IRQ+0x970/0xa4f
(XEN)    ffff83043f2c6ed8: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN)    ffff83043f2c6f28: [<ffff82c4c01a7a29>] 
handle_hpet_broadcast+0x1b0/0x268
(XEN)    ffff83043f2c6f88: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN)    ffff83043f2c7010: [<ffff82c4c0164f94>] unmap_domain_page+0x6/0x32d
(XEN)    ffff83043f2c7048: [<ffff82c4c01ef69d>] ept_next_level+0x9c/0xde
(XEN)    ffff83043f2c7078: [<ffff82c4c01f0ab3>] ept_get_entry+0xb3/0x239
(XEN)    ffff83043f2c7108: [<ffff82c4c01e9497>] 
__get_gfn_type_access+0x12b/0x20e
(XEN)    ffff83043f2c7158: [<ffff82c4c01e9cc2>] get_page_from_gfn_p2m+0xc8/0x25d
(XEN)    ffff83043f2c71c8: [<ffff82c4c01f4660>] 
map_domain_gfn_3_levels+0x43/0x13a
(XEN)    ffff83043f2c7208: [<ffff82c4c01f4b6b>] 
guest_walk_tables_3_levels+0x414/0x489
(XEN)    ffff83043f2c7288: [<ffff82c4c0223988>] 
hap_p2m_ga_to_gfn_3_levels+0x178/0x306
(XEN)    ffff83043f2c7338: [<ffff82c4c0223b35>] 
hap_gva_to_gfn_3_levels+0x1f/0x2a
(XEN)    ffff83043f2c7348: [<ffff82c4c01ebc1e>] paging_gva_to_gfn+0xb6/0xcc
(XEN)    ffff83043f2c7398: [<ffff82c4c01bedf2>] __hvm_copy+0x57/0x36d
(XEN)    ffff83043f2c73c8: [<ffff82c4c01b6d34>] 
hvmemul_virtual_to_linear+0x102/0x153
(XEN)    ffff83043f2c7408: [<ffff82c4c01c1538>] 
hvm_copy_from_guest_virt+0x15/0x17
(XEN)    ffff83043f2c7418: [<ffff82c4c01b7cd3>] __hvmemul_read+0x12d/0x1c8
(XEN)    ffff83043f2c7498: [<ffff82c4c01b7dc1>] hvmemul_read+0x12/0x14
(XEN)    ffff83043f2c74a8: [<ffff82c4c01937e9>] read_ulong+0xe/0x10
(XEN)    ffff83043f2c74b8: [<ffff82c4c0195924>] x86_emulate+0x169d/0x11309
(XEN)    ffff83043f2c7558: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN)    ffff83043f2c75c0: [<ffff82c4c012a100>] 
_spin_trylock_recursive+0x63/0x93
(XEN)    ffff83043f2c75d8: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN)    ffff83043f2c7618: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN)    ffff83043f2c7668: [<ffff82c4c01a7a29>] 
handle_hpet_broadcast+0x1b0/0x268
(XEN)    ffff83043f2c76c8: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN)    ffff83043f2c7788: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN)    ffff83043f2c77b8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
(XEN)    ffff83043f2c7848: [<ffff82c4c01775ef>] get_page+0x27/0xf2
(XEN)    ffff83043f2c7898: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN)    ffff83043f2c78c8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
(XEN)    ffff83043f2c7a98: [<ffff82c4c01b7f60>] hvm_emulate_one+0x127/0x1bf
(XEN)    ffff83043f2c7aa8: [<ffff82c4c01b6c1b>] hvmemul_get_seg_reg+0x49/0x60
(XEN)    ffff83043f2c7ae8: [<ffff82c4c01c38c5>] handle_mmio+0x55/0x1f0
(XEN)    ffff83043f2c7b38: [<ffff82c4c0108208>] do_event_channel_op+0/0x10cb
(XEN)    ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
(XEN)    ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
(XEN)    ffff83043f2c7c98: [<ffff82c4c01bccf3>] 
hvm_hap_nested_page_fault+0x25d/0x456
(XEN)    ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
(XEN)    ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
(XEN)    ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
(XEN)    ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
(XEN)    ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
(XEN)    ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
(XEN)    ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
(XEN)    ffff83043f2c7e88: [<ffff82c4c01c50a7>] 
hvm_vcpu_has_pending_irq+0x9b/0xcd
(XEN)    ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
(XEN)    ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) DOUBLE FAULT -- system shutdown
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

The hpet interrupt handler runs with interrupts enabled, due to this the
spin_unlock_irq() in:

    while ( desc->status & IRQ_PENDING )
    {
        desc->status &= ~IRQ_PENDING;
        spin_unlock_irq(&desc->lock);
        tsc_in = tb_init_done ? get_cycles() : 0;
        action->handler(irq, action->dev_id, regs);
        TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
        spin_lock_irq(&desc->lock);
    }

in do_IRQ().

Clearly there are cases where the frequency of the HPET interrupt is faster
than the time it takes to process handle_hpet_broadcast(), I presume in part
because of the large amount of cpumask manipulation.

One solution, presented in this patch is to disable interrupts while running
the hpet event handler, but the patch is RFC because I dont really like it as
it feels a little bit like a hack.

handle_hpet_broadcast() is clearly too long as an interrupt handler, and
perhaps effort should be made to reduce it if possible.

Then again, interrupt handlers for this (and other Xen-consumed interrupts?)
should probably be run with interrupts disabled, to prevent exactly these
kinds of problems.

Thoughts/comments?

CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
CC: Keir Fraser <keir@xxxxxxx>
CC: Jan Beulich <jbeulich@xxxxxxxx>
CC: Tim Deegan <tim@xxxxxxx>
---
 xen/arch/x86/hpet.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index 946d133..46e14a9 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -229,7 +229,9 @@ static void hpet_interrupt_handler(int irq, void *data,
         return;
     }
 
+    local_irq_disable();
     ch->event_handler(ch);
+    local_irq_enable();
 }
 
 static void hpet_msi_unmask(struct irq_desc *desc)
-- 
1.7.10.4


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.