Re: [Xen-devel] [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
On 06/08/13 09:01, Jan Beulich wrote:
>>>> On 05.08.13 at 22:38, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>> Automated testing on Xen-4.3 testing tip found an interesting issue
>>
>> (XEN) *** DOUBLE FAULT ***
>> (XEN) ----[ Xen-4.3.0  x86_64  debug=y  Not tainted ]----
> The call trace is suspicious in ways beyond what Keir already
> pointed out - with debug=y, there shouldn't be bogus entries listed,
> yet ...

show_stack_overflow() doesn't have a debug case which follows frame
pointers.  I shall submit a patch for this presently, and put it into
XenServer in the hope of getting a better stack trace in the future.

<snip>

> And this one looks bogus too.  Question therefore is whether the
> problem you describe isn't a consequence of an earlier issue.

There is nothing apparently interesting preceding the crash; just some
spew from an HVM domain using the 0x39 debug port.

>
>> (XEN)    ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
>> (XEN)    ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
>> (XEN)    ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456
>> (XEN)    ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
>> (XEN)    ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
>> (XEN)    ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
>> (XEN)    ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
>> (XEN)    ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
>> (XEN)    ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
>> (XEN)    ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
>> (XEN)    ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd
>> (XEN)    ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
>> (XEN)    ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 3:
>> (XEN) DOUBLE FAULT -- system shutdown
>> (XEN) ****************************************
>> (XEN)
>> (XEN) Reboot in five seconds...
>>
>> The HPET interrupt handler runs with interrupts enabled, due to the
>> spin_unlock_irq() in this loop in do_IRQ():
>>
>>             while ( desc->status & IRQ_PENDING )
>>             {
>>                 desc->status &= ~IRQ_PENDING;
>>                 spin_unlock_irq(&desc->lock);
>>                 tsc_in = tb_init_done ? get_cycles() : 0;
>>                 action->handler(irq, action->dev_id, regs);
>>                 TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
>>                 spin_lock_irq(&desc->lock);
>>             }
>>
>> Clearly there are cases where the HPET interrupt fires faster than
>> handle_hpet_broadcast() can be processed; I presume this is in part
>> because of the large amount of cpumask manipulation.
> How many CPUs (and how many usable HPET channels) does the
> system have that this crash was observed on?
>
> Jan

The machine we found this crash on is a Dell R310: 4 CPUs, 16G RAM.

The full boot `xl dmesg' is attached, but it appears that there are 8
broadcast HPET channels.  This is further backed up by the 'i' debugkey
(also attached).

Keir: (merging your thread back here)

I see your point regarding IRQ_INPROGRESS, but even with 8 HPET
interrupts, there are rather more than 8 occurrences of
handle_hpet_broadcast() in the stack.

If the occurrences were just function pointers on the stack, I would
expect to see several instances of handle_hpet_broadcast()+0x0/0x268.

~Andrew
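On the show_stack_overflow() point above: a debug=y build keeps frame
pointers, so the missing debug case could in principle walk the saved
%rbp chain.  Below is a minimal sketch of such a walker; it is
illustrative only (the function name, parameters and bounds check are
mine, not the promised patch), and it assumes the standard x86-64 frame
layout in which *%rbp holds the caller's saved %rbp and *(%rbp + 8)
holds the return address:

    /*
     * Sketch only, not the actual patch: follow the frame-pointer
     * chain within [stack_bottom, stack_top) and print each return
     * address.  Assumes the compiler kept frame pointers, as in
     * debug=y builds.
     */
    static void sketch_walk_frame_pointers(unsigned long *rbp,
                                           unsigned long stack_bottom,
                                           unsigned long stack_top)
    {
        while ( (unsigned long)rbp >= stack_bottom &&
                (unsigned long)rbp <= stack_top - 2 * sizeof(unsigned long) )
        {
            printk("   [<%016lx>]\n", rbp[1]); /* saved return address */
            rbp = (unsigned long *)rbp[0];     /* caller's saved %rbp  */
        }
    }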
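To illustrate the subject line of this RFC: because do_IRQ() re-enables
interrupts around action->handler(), an HPET channel which fires again
before handle_hpet_broadcast() returns can nest a fresh interrupt frame
on the same stack each time, which would eventually overflow the stack
and is one plausible route to the double fault above.  A minimal sketch
of the idea (illustrative only, not the actual RFC patch) is to drop
desc->lock without the _irq suffix, so interrupts stay masked across
the handler:

    /*
     * Sketch only: interrupts are already disabled on entry to
     * do_IRQ(), so using spin_unlock() rather than spin_unlock_irq()
     * keeps them masked while the handler runs, preventing the HPET
     * interrupt from re-entering on this stack.
     */
    while ( desc->status & IRQ_PENDING )
    {
        desc->status &= ~IRQ_PENDING;
        spin_unlock(&desc->lock);
        tsc_in = tb_init_done ? get_cycles() : 0;
        action->handler(irq, action->dev_id, regs);
        TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
        spin_lock(&desc->lock);
    }

The obvious trade-off is that everything else on that CPU then sees the
full latency of handle_hpet_broadcast() with interrupts disabled.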
Attachment: xl-dmesg-boot
Attachment: xl-debugkeys-i