[Xen-devel] Re: Kernel BUG at arch/x86/mm/tlb.c:61
2011/4/14 MaoXiaoyun <tinnycloud@xxxxxxxxxxx>:
> Hi:
>
> I've run the test with "cpuidle=0 cpufreq=none"; two machines crashed.
>
> blktap_sysfs_destroy
> blktap_sysfs_destroy
> blktap_sysfs_create: adding attributes for dev ffff8800ad581000
> blktap_sysfs_create: adding attributes for dev ffff8800a48e3e00
> ------------[ cut here ]------------
> kernel BUG at arch/x86/mm/tlb.c:61!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/block/tapdeve/dev
> CPU 0
> Modules linked in: 8021q garp blktap xen_netback xen_blkback
> blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si
> ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output
> sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy bnx2
> serio_raw snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss
> snd_mixer_oss snd_pcm i2c_i801 snd_timer i2c_core snd iTCO_wdt pata_acpi
> soundcore iTCO_vendor_support ata_generic snd_page_alloc pcspkr ata_piix
> shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
> Pid: 8022, comm: khelper Not tainted 2.6.32.36xen #1 Tecal RH2285
> RIP: e030:[<ffffffff8103a3cb>] [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> RSP: e02b:ffff88002803ee48 EFLAGS: 00010046
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81675980
> RDX: ffff88002803ee78 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffff88002803ee48 R08: ffff8800a4929000 R09: dead000000200200
> R10: dead000000100100 R11: ffffffff81447292 R12: ffff88012ba07b80
> R13: ffff880028046020 R14: 00000000000004fb R15: 0000000000000000
> FS: 00007f410af416e0(0000) GS:ffff88002803b000(0000) knlGS:0000000000000000
> CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000469000 CR3: 00000000ad639000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process khelper (pid: 8022, threadinfo ffff8800a4846000, task ffff8800a9ed0000)
> Stack:
> ffff88002803ee68 ffffffff8100e4a4 0000000000000001 ffff880097de3b88
> <0> ffff88002803ee98 ffffffff81087224 ffff88002803ee78 ffff88002803ee78
> <0> ffff88015f808180 00000000000004fb ffff88002803eea8 ffffffff810100e8
> Call Trace:
> <IRQ>
> [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
> [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
> [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
> [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
> [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
> [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
> [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
> [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
> <EOI>
> [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
> [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> [<ffffffff81113f75>] ? flush_old_exec+0x3ac/0x500
> [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
> [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
> [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
> [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
> [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
> [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
> [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
> [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
> [<ffffffff81013daa>] ? child_rip+0xa/0x20
> [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
> [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
> [<ffffffff81013da0>] ? child_rip+0x0/0x20
> Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
> RIP [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> RSP <ffff88002803ee48>
> ---[ end trace 1522f17fdfc9162d ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Pid: 8022, comm: khelper Tainted: G D 2.6.32.36xen #1
> Call Trace:
> <IRQ> [<ffffffff8105682e>] panic+0xe0/0x19a
> [<ffffffff8144006a>] ? init_amd+0x296/0x37a

Hmmm... are both machines using AMD CPUs? Did you hit the same bug on Intel CPUs?

> [<ffffffff8100f169>] ? xen_force_evtchn_callback+0xd/0xf
> [<ffffffff8100f8c2>] ? check_events+0x12/0x20
> [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> [<ffffffff81056487>] ? print_oops_end_marker+0x23/0x25
> [<ffffffff81448165>] oops_end+0xb6/0xc6
> [<ffffffff810166e5>] die+0x5a/0x63
> [<ffffffff81447a3c>] do_trap+0x115/0x124
> [<ffffffff810148e6>] do_invalid_op+0x9c/0xa5
> [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
> [<ffffffff8100f6e6>] ? xen_clocksource_read+0x21/0x23
> [<ffffffff8100f258>] ? HYPERVISOR_vcpu_op+0xf/0x11
> [<ffffffff8100f753>] ? xen_vcpuop_set_next_event+0x52/0x67
> [<ffffffff81013b3b>] invalid_op+0x1b/0x20
> [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
> [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
> [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
> [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
> [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
> [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
> [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
> [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
> [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
> [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
> <EOI> [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
> [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> [<ffffffff81113f75>] ? flush_old_exec+0x3ac/0x500
> [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
> [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
> [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
> [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
> [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
> [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
> [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
> [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
> [<ffffffff81013daa>] ? child_rip+0xa/0x20
> [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
> [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
> [<ffffffff81013da0>] ? child_rip+0x0/0x20
> (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
>
>> Date: Tue, 12 Apr 2011 06:00:00 -0400
>> From: konrad.wilk@xxxxxxxxxx
>> To: tinnycloud@xxxxxxxxxxx
>> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; giamteckchoon@xxxxxxxxx;
>> jeremy@xxxxxxxx
>> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
>>
>> On Tue, Apr 12, 2011 at 05:11:51PM +0800, MaoXiaoyun wrote:
>> >
>> > Hi:
>> >
>> > We are using pvops kernel 2.6.32.36 + xen 4.0.1, but have hit a kernel
>> > panic bug.
>> >
>> > 2.6.32.36 Kernel:
>> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
>> > Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183
>> >
>> > Our test is simple: 24 HVMs (Win2003) on a single host, each HVM
>> > restarting in a loop every 15 minutes.
>>
>> What is the storage that you are using for your guests? AoE? Local disks?
>>
>> > About 17 machines are involved in the test; after a 10-hour run, one
>> > hit a crash at arch/x86/mm/tlb.c:61
>> >
>> > Currently I am trying "cpuidle=0 cpufreq=none" tests based on Teck's
>> > suggestion.
>> >
>> > Any comments, thanks.
>> >
>> > ===============crash log==========================
>> > INIT: Id "s0" respawning too fast: disabled for 5 minutes
>> > __ratelimit: 14 callbacks suppressed
>> > blktap_sysfs_destroy
>> > blktap_sysfs_destroy
>> > ------------[ cut here ]------------
>> > kernel BUG at arch/x86/mm/tlb.c:61!
>> > invalid opcode: 0000 [#1] SMP
>> > last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
>> > CPU 1
>> > Modules linked in: 8021q garp xen_netback xen_blkback blktap
>> > blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si
>> > ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output
>> > sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss
>> > snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss
>> > snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc
>> > i2c_i801 iTCO_vendor_support i2c_core pcspkr pata_acpi ata_generic ata_piix
>> > shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
>> > Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
>> > RIP: e030:[<ffffffff8103a3cb>] [<ffffffff8103a3cb>] leave_mm+0x15/0x46
>> > RSP: e02b:ffff88002805be48 EFLAGS: 00010046
>> > RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
>> > RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
>> > RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
>> > R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
>> > R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
>> > FS: 00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
>> > CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> > CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
>> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> > Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
>> > Stack:
>> > ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
>> > <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
>> > <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
>> > Call Trace:
>> > <IRQ>
>> > [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
>> > [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
>> > [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
>> > [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
>> > [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
>> > [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
>> > [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
>> > [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
>> > <EOI>
>> > [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
>> > [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
>> > [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
>> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>> > [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
>> > [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
>> > [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
>> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>> > [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
>> > [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
>> > [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
>> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>> > [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
>> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>> > [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
>> > [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
>> > [<ffffffff81013daa>] ? child_rip+0xa/0x20
>> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>> > [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
>> > [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
>> > [<ffffffff81013da0>] ? child_rip+0x0/0x20
>> > Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
>> > RIP [<ffffffff8103a3cb>] leave_mm+0x15/0x46
>> > RSP <ffff88002805be48>
>> > ---[ end trace ce9cee6832a9c503 ]---
>> > Kernel panic - not syncing: Fatal exception in interrupt
>> > Pid: 25581, comm: khelper Tainted: G D 2.6.32.36fixxen #1
>> > Call Trace:
>> > <IRQ> [<ffffffff8105682e>] panic+0xe0/0x19a
>> > [<ffffffff8144008a>] ? init_amd+0x296/0x37a
>> > [<ffffffff8100f17d>] ? xen_force_evtchn_callback+0xd/0xf
>> > [<ffffffff8100f8e2>] ? check_events+0x12/0x20
>> > [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
>> > [<ffffffff81056487>] ? print_oops_end_marker+0x23/0x25
>> > [<ffffffff81448185>] oops_end+0xb6/0xc6
>> > [<ffffffff810166e5>] die+0x5a/0x63
>> > [<ffffffff81447a5c>] do_trap+0x115/0x124
>> > [<ffffffff810148e6>] do_invalid_op+0x9c/0xa5
>> > [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
>> > [<ffffffff8100f6fa>] ? xen_clocksource_read+0x21/0x23
>> > [<ffffffff8100f26c>] ? HYPERVISOR_vcpu_op+0xf/0x11
>> > [<ffffffff8100f767>] ? xen_vcpuop_set_next_event+0x52/0x67
>> > [<ffffffff81080bfa>] ? clockevents_program_event+0x78/0x81
>> > [<ffffffff81013b3b>] invalid_op+0x1b/0x20
>> > [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
>> > [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
>> > [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
>> > [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
>> > [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
>> > [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
>> > [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
>> > [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
>> > [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
>> > [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
>> > <EOI> [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
>> > [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
>> > [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
>> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>> > [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
>> > [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
>> > [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
>> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>> > [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
>> > [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
>> > [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
>> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>> > [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
>> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>> > [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
>> > [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
>> > [<ffffffff81013daa>] ? child_rip+0xa/0x20
>> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>> > [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
>> > [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
>> > [<ffffffff81013da0>] ? child_rip+0x0/0x20
>> >
>> >
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel