
[Xen-devel] v3.10-rc0 regressions. HELP!



I am not able to see these with v3.9, but with v3.10 I can easily see them.

And I can only see them when I build the kernel with these options:

CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y

Attached is the full serial log, but here are the excerpts:

(XEN) HVM1: 130MB medium detected
(XEN) HVM1: Booting from 0000:7c00
[  182.836965] BUG: scheduling while atomic: qemu-dm/3621/0x00000101
[  182.863930] no locks held by qemu-dm/3621.
[  182.888475] Modules linked in: dm_multipath dm_mod xen_evtchn 
iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi libcrc32c 
crc32c nouveau mxm_wmi radeon ttm sg sr_mod sd_mod cdrom ahci libahci mperf 
crc32c_intel libata scsi_mod fbcon tilebli xen_blkfront xen_netfront 
fb_sys_fops sysimgblt sysfillrect syscopyarea xenfs xen_privcmd
[  183.012005] CPU: 0 PID: 3621 Comm: qemu-dm Not tainted 
3.9.0upstream-10936-g51a26ae #1
[  183.042583] Hardware name: LENOVO ThinkServer TS130/        , BIOS 9HKT47AUS 
01/10/2012
[  183.073531]  0000000000000000 ffff88007fa03c38 ffffffff8169d092 
ffff88007fa03c58
[  183.104037]  ffffffff810c23d5 ffff88007fa14b00 ffff88007fa14b00 
ffff88007fa03ce8
[  183.134392]  ffffffff8169f16f 000000010e4341c0 ffff880012405fd8 
ffff880012404000
[  183.164498] Call Trace:
[  183.189376]  <IRQ>  [<ffffffff8169d092>] dump_stack+0x19/0x1b
[  183.217888]  [<ffffffff810c23d5>] __schedule_bug+0x65/0x90
[  183.246280]  [<ffffffff8169f16f>] __schedule+0x81f/0x840
[  183.274147]  [<ffffffff8169f254>] schedule+0x24/0x70
[  183.301306]  [<ffffffff8169dfb0>] schedule_hrtimeout_range_clock+0xc0/0x160
[  183.330515]  [<ffffffff810b98f0>] ? update_rmtp+0x80/0x80
[  183.357663]  [<ffffffff810baaff>] ? hrtimer_start_range_ns+0xf/0x20
[  183.385601]  [<ffffffff8169e05e>] schedule_hrtimeout_range+0xe/0x10
[  183.413258]  [<ffffffff8109e18b>] usleep_range+0x3b/0x40
[  183.439494]  [<ffffffffa007fc6d>] e1000_irq_enable+0x1ad/0x1e0 [e1000e]
[  183.467222]  [<ffffffffa007fe18>] e1000e_poll+0x178/0x2e0 [e1000e]
[  183.494288]  [<ffffffff81540b78>] ? net_rx_action+0xd8/0x280
[  183.520433]  [<ffffffff81540bd5>] net_rx_action+0x135/0x280
[  183.546316]  [<ffffffff81096bd9>] __do_softirq+0x119/0x2d0
[  183.571792]  [<ffffffff81096efd>] irq_exit+0xed/0x100
[  183.596388]  [<ffffffff813b742f>] xen_evtchn_do_upcall+0x2f/0x40
[  183.621833]  [<ffffffff816aac1e>] xen_do_hypervisor_callback+0x1e/0x30
[  183.647781]  <EOI>  [<ffffffff8100122a>] ? xen_hypercall_xen_version+0xa/0x20
[  183.674269]  [<ffffffff8100122a>] ? xen_hypercall_xen_version+0xa/0x20
[  183.699930]  [<ffffffff810420ed>] ? xen_force_evtchn_callback+0xd/0x10
[  183.725964]  [<ffffffff81042a22>] ? check_events+0x12/0x20
[  183.750676]  [<ffffffff810429c9>] ? xen_irq_enable_direct_rel[  183.776451]  
[<ffffffff816a970c>] ? system_call_after_swapgs+0x19/0x60
[  183.802194] NOHZ: local_softirq_pending 282
[  183.827712] sh (3751) used greatest stack depth: 2344 [  184.035913] BUG: 
scheduling while atomic: qemu-dm/3621/0x00000101
[  184.035916] BUG: scheduling while atomic: sshd/3582/0x00000604
[  184.035918] 7 locks held by sshd/3582:
[  184.035924]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff8159de57>] 
tcp_sendmsg[  184.035927]  #1:  (rcu_read_lock){.+.+..}, at: 
[<ffffffff815916d0>] ip_queue_xmit+0x0/0x510
[  184.035930]  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81590ecb>] 
ip_finish_output2+0x7b/0x3e0
[  184.035933]  #3:  (r..}, at: [<ffffffff815418b0>] dev_queue_xmit+0x0/0x690
[  184.035937]  #4:  (rcu_read_lock){.+.+..}, at: [<ffffffff81649640>] 
br_dev_xmit+0x0/0x1b0
[  184.035939]  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815418b0>] 
dev_queue_xmit+0x0/0x690
[  184.035943]  #6:  (_xmit_ETHER#2){+.-...}, at: [<ffffffff815607b7>] 
sch_direct_xmit+0xb7/0x280
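
For anyone reading along, the last field of each "scheduling while atomic"
line is the task's preempt_count in hex. Below is a minimal sketch of how it
can be split into its counters. It assumes the layout used by kernels of this
era (bits 0-7 preempt depth, bits 8-15 softirq count, bits 16-25 hardirq
count); the function names are made up for illustration and are not part of
any kernel tool.

```python
import re

# Assumed preempt_count layout (not taken from the log itself):
# bits 0-7 preempt depth, bits 8-15 softirq count, bits 16-25 hardirq count.
PREEMPT_MASK = 0xFF
SOFTIRQ_SHIFT, SOFTIRQ_MASK = 8, 0xFF
HARDIRQ_SHIFT, HARDIRQ_MASK = 16, 0x3FF

# Matches e.g. "BUG: scheduling while atomic: qemu-dm/3621/0x00000101"
BUG_RE = re.compile(
    r"BUG: scheduling while atomic: (\S+)/(\d+)/0x([0-9a-fA-F]+)"
)


def decode_preempt_count(value):
    """Split a raw preempt_count value into its three counters."""
    return {
        "preempt": value & PREEMPT_MASK,
        "softirq": (value >> SOFTIRQ_SHIFT) & SOFTIRQ_MASK,
        "hardirq": (value >> HARDIRQ_SHIFT) & HARDIRQ_MASK,
    }


def parse_bug_line(line):
    """Return (comm, pid, counters) for a BUG line, or None if no match."""
    m = BUG_RE.search(line)
    if m is None:
        return None
    comm, pid, raw = m.group(1), int(m.group(2)), int(m.group(3), 16)
    return comm, pid, decode_preempt_count(raw)


if __name__ == "__main__":
    for text in (
        "[  182.836965] BUG: scheduling while atomic: qemu-dm/3621/0x00000101",
        "[  184.035916] BUG: scheduling while atomic: sshd/3582/0x00000604",
    ):
        print(parse_bug_line(text))
```

Under that assumption both values have a non-zero softirq field and a zero
hardirq field, which would be consistent with the first trace, where
usleep_range() is reached from e1000e_poll() under __do_softirq().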

And so on. It keeps on happening when QEMU runs and at some point the kernel
crashes due to corruption:

[  204.049337]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff811ca4fb>] 
fget_light+0x3b/0x150
[  204.072019] BUG: unable to handle kernel paging request at 00000002e66c9780
[  204.093663] IP: [<ffffffff810bed42>] task_curr+0x12/0x30
[  204.113615] PGD 69dac067 PUD 0 
[  204.131150] Thread overran stack, or stack corrupted
[  204.150870] Oops: 0000 [#1] SMP 
[  204.168495] Modules linked in: dm_multipath dm_mod xen_evtchn 
iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transportd_mod cdrom ahci 
libahci mperf crc32c_intel libata scsi_mod fbcon tileblit font bitblit i915 
softcursor e1000e drm_kms_helper video tpm_tis wmi xen_blkfront xen_netfront 
fb_sys_fops sysimgblt sysfillrec syscopyarea xenfs xen_privcmd
[  204.270133] CPU: 0 PID: 3621 Comm: qemu-dm Tainted: G        W    
3.9.0upstream-10936-g51a26ae #1
[  204.296935] Hardware name: LENOVO ThinkServer TS130/        , BIOS 9HKT47AUS 
01/10/2012
[  204.323074] task: ffff88006c942200 ti: ffff880012404000 task.ti: 
ffff880012404000
[  204.348978] RIP: e030:[<ffffffff810bed[  204.375336] RSP: 
e02b:ffff880012404240  EFLAGS: 00010046
[  204.399162] RAX: 0000000000014b00 RBX: ffff88006c942200 RCX: 000000000000000d
[  204.425264] RDX: 000000006c942200 RSI: ffff88006c942200 RDI: ffff88006c942200
[  204.451354] RBP: ffff880012404240 R08: ffced0004a5b006a R09: 0000000000000001
[  204.477580] R10: 0000000000000001 R11: 0000000000000001 R12: 000000000000000e
[  204.503954] R13: 000000000000000e R14: 0000000000000001 R15: ffff88001268e0c0
[  204.530235] FS:  00007f983fc23700(0000) GS:ffff88007fa00000(0000) 
knlGS:0000000000000000
[  204.557773] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[  204.582781] CR2: 00000002e66c9780 CR3: 000000006ba6e000 CR4: 0000000000042660
[  204.609561] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  204.636335] DR3: 0000000000000000 DR6: 00000ffff0ff0 DR7: 0000000000000400
[  204.662967] Stack:
[  204.683881]  ffff8800124042a0 ffffffff810a2766 ffffffff00000001 
00ff88000000000d
[  204.710994]  0000000000000000 ffff88006
[  204.738283]  ffff88006c942200 000000000000000e 0000000000000001 
0000000000000000
[  204.765580] Call Trace:
[  204.787773]  [<ffffffff810a2766>] complete_signal+0x146/0x220
[  204.813972]  [<ffffffff810a5c1b>] send_sigqueue+0xcb/0x1e0
[  204.839795]  [<ffffffff810b513f>] posix_timer_event+0x7f/0xc0
[  204.865905]  [<ffffffff810b50c0>] ? posix_timers_register_clock+0xe0/0xe0
[  204.893286]  [<ffffffff810ef570>] ? lock_release+0xf0/0x250
[  204.919256]  [<ffffffff810b51d6>] posix_timer_fn+0x56/0xe0
[  204.945028]  [<ffffffff810b9f9f>] __run_hrtimer+0x6f/0x220
[  204.970792]  [<ffffffff810b5180>] ? posix_timer_event+0xc0/0xc0
[  204.997134]  [<ffffffff810ba42e>] hrtimer_interrupt+0x10e/0x290
[  205.023520]  [<ffffffff8104261f>] xen_timer_interrupt+0x2f/0x1b0
[  205.049996]  [<ffffffff8111da7c>] handle_irq_event_percpu+0x7c/0x240
[  205.077010]  [<ffffffff81120cd9>] handle_percpu_irq+0x49/0x70
[  205.103378]  [<ffffffff813b73dd>] __xen_evtchn_do_upcall+0x38d/0x3a0
[  205.130601]  [<ffffffff810e998d>] ? trace_hardirqs_off+0xd/0x10
[  205.157356]  [<ffffffff810c8b37>] ? irqtime_account_irq+0xe7/0x100
[  205.184300]  [<ffffffff813b742a>] xen_evtchn_do_upcall+0x2a/0x40
[  205.211032]  [<ffffffff816aac1e>] xen_do_hypervisor_callback+0x1e/0x30
[  205.238212]  [<ffffffff8100122a>] ? xen_hypercall_xen_version+0xa/0x20
[  205.265131]  [<ffffffff8100122a>] ? xen_hypercall_xen_version+0xa/0x20
[  205.291516]  [<ffffffff810420ed>] ? xen_force_evtchn_callback+0xd/0x10
[  205.317653]  [<ffffffff81042a22>] ? check_events+0x12/0x20
[  205.342515]  [<ffffffff81042a0f>] ? xen_restore_fl_direct_reloc+0x4/0x4
[  205.368739]  [<ffffffff81090ac1>] ? vprintk_emit+0x251/0x520
[  205.393899]  [<ffffffff81042a01>] ? xen_restore_fl_direct+0
[  205.419209]  [<ffffffff8169cf43>] ? printk+0x48/0x4a
[  205.442402]  [<ffffffff811ca4fb>] ? fget_light+0x3b/0x150
[  205.465431]  [<ffffffff811ca4fb>] ? fget_light+0x3b/0x150
[  205.487665]  [<ffffffff810eb465>] ? print_lock+0x55/0xb0
[  205.509345]  [<ffffffff810eb53f>] ? lockdep_print_held_locks+0x[  
205.532297]  [<ffffffff810eb795>] ? debug_show_held_locks+0x15/0x30
[  205.554631]  [<ffffffff810c23bf>] ? __schedule_bug+0x4f/0x90
[  205.575915]  [<ffffffff8169f16f>] ? __schedule+0x81f/0x840
[  205.596715]  [<ffffffff8169f254>] ? schedule+0x24/0x70
[  205.616849]  [<ffffffff8169dfb0>] ? schedule_hrtimeout_range_clock+0xc0/0x160
[  205.639418]  [<ffffffff810b98f0>] ? update_rmtp+0x80/0x80
[  205.660007]  [<ffffffff810baaff>] ? hrtimer_start_range_ns+0xf/0x20
[  205.681604]  [<ffffffff8169e05e>] ? schedule_hrtimeout_range+0xe/0x10
[  205.703486]  [<ffffffff8109e18b>] ? usleep_range+0[  205.724082]  
[<ffffffffa007baf5>] ? e1000e_update_tdt_wa+0x55/0xe0 [e1000e]
[  205.746269]  [<ffffffffa007cc28>] ? e1000_xm[  205.768163]  
[<ffffffff8153e8f2>] ? dev_queue_xmit_nit+0x202/0x280
[  205.789490]  [<ffffffff8153e6f0>] ? net_tx_action+0x2[  205.810349]  
[<ffffffff8153ec78>] ? dev_hard_start_xmit+0x308/0x5a0
[  205.831671]  [<ffffffff815607fe>] ? sch_direct_xmit+[  205.852275]  
[<ffffffff81541a39>] ? dev_queue_xmit+0x189/0x690
[  205.872734]  [<ffffffff815418b0>] ? dev_loopback_xmit+0x1e0/0x1e0
[  205.893410]  [<ffffffff8164b0b5>] ? br_dev_queue_push_xmit+0x55/0x70
[  205.914143]  [<ffffffff8164b20d>] ? br_forward_finish+0x1d/0x60
[  205.934283]  [<ffffffff81649640>] ? br_netpoll_setup+0x90/0x90
[  205.954247]  [<ffffffff8164b290>] ? __br_deliver+0x40/0x1
[  205.973967]  [<ffffffff8164b3cd>] ? br_deliver+0x3d/0x50
[  205.993208]  [<ffffffff816497ce>] ? br_dev_xmit+0x18e/0x1b0
[  206.012589]  [<ffffffff81649640>] ? br_netpoll_setup+0x90/0x90
[  206.032322]  [<ffffffff8153ec78>] ? dev_hard_start_xmit+0[  206.052573]  
[<ffffffff81541b87>] ? dev_queue_xmit+0x2d7/0x690
[  206.072298]  [<ffffffff815418b0>] ? dev_loopback_xmit+0x1e0/0x1e0
[  206.092421]  [<ffffffff81591020>] ? ip_finish_output2+0x1d0/0x3e0
[  206.112370]  [<ffffffff81590ecb>] ? ip_finish_output2+[  206.132015]  
[<ffffffff8156dc04>] ? nf_hook_slow+0x134/0x190
[  206.151307]  [<ffffffff81592890>] ? ip_fragment+0x8a0/0x8a0[  206.170420]  
[<ffffffff8159293e>] ? ip_finish_output+0xae/0x200
[  206.189796]  [<ffffffff81592ae4>] ? ip_output+0x54/0xe0[  206.208336]  
[<ffffffff81591258>] ? ip_local_out+0x28/0x80
[  206.227053]  [<ffffffff8159185b>] ? ip_queue_xmit+0x18b/0x510[  206.245981]  
[<ffffffff815916d0>] ? ip_send_unicast_reply+0x390/0x390
[  206.265688]  [<ffffffff815a83c5>] ? tcp_transmit_skb+0x465/0x880
[  206.285137]  [<ffffffff815a96fc>] ? tcp_send_ack+0xec/0x120
[  206.304062]  [<ffffffff815a0c09>] ? __tcp_ack_snd_check+0x59[  206.323558]  
[<ffffffff815a737c>] ? tcp_rcv_established+0x22c/0x810
[  206.343227]  [<ffffffff815b23ec>] ? tcp_v4_do_rcv+0x[  206.362232]  
[<ffffffff815b3011>] ? tcp_v4_rcv+0x5e1/0x7f0
[  206.380800]  [<ffffffff810ef0b0>] ? lock_acquire+0xb0/0x120
[  206.399462]  [<ffffffff8158b753>] ? ip_local_deliver_finish+0x43/0x350
[  206.419330]  [<ffffffff8158b710>] ? ip_local_deliver+0x80/0x80
[  206.438526]  [<ffffffff8158b808>] ? ip_local_deliver_finish+0xf8/0x350
[  206.458487]  [<ffffffff8158b753>] ? ip_local_deliver_finish+0x43/0x350
[  206.478270]  [<ffffffff8158b6d2>] ? ip_local_deliver+0x42/0x80
[  206.497093]  [<ffffffff8158bbec>] ? ip_rcv_finish+0x18c/0x4b0
[  206.515770]  [<ffffffff8158b599>] ? ip_rcv+0x219/0x310
[  206.533777]  [<ffffffff8153ff1a>] ? __netif_receive_skb_core+0x6ca/0x850
[  206.553680]  [<ffffffff8153f951>] ? __netif_receive_skb_core+0x101/0x850
[  206.573385]  [<ffffffff815400bd>] ? __netif_receive_skb+0x1d/0x70
[  206.592123]  [<ffffffff81540310>] ? netif_receive_skb+[  206.610742]  
[<ffffffff8164c2cd>] ? br_handle_frame_finish+0x1cd/0x2c0
[  206.629932]  [<ffffffff810ef0b0>] ? lock_acquire+[  206.648109]  
[<ffffffff8164c01a>] ? br_handle_frame+0x1aa/0x290
[  206.666609]  [<ffffffff8164be70>] ? br_handle_local_finish+0x40/0x40
[  206.685485]  [<ffffffff8153fb49>] ? __netif_receive_skb_core+0x2f9/0x850
[  206.704771]  [<ffffffff8153f951>] ? __netif_rec[  206.723887]  
[<ffffffff815c65b0>] ? inet_gso_send_check+0x160/0x160
[  206.742506]  [<ffffffff815400bd>] ? __netif_receive_[  206.761026]  
[<ffffffff81540310>] ? netif_receive_skb+0x20/0x120
[  206.779387]  [<ffffffff815c66a3>] ? inet_gro_complete+0xf3/0x140
[  206.797749]  [<ffffffff815c65b0>] ? inet_gso_send_check+0x160/0x160
[  206.816318]  [<ffffffff810ef570>] ? lock_release+0xf0/0x250
[  206.834163]  [<ffffffff8154052c>] ? napi_gro_complete+0x11c/
[  206.852682]  [<ffffffff81540430>] ? napi_gro_complete+0x20/0x140
[  206.871004]  [<ffffffff810ef570>] ? lock_release+0xf0/0x250
[  206.888823]  [<ffffffff81540826>] ? dev_gro_receive+0x2d6/0x430
[  206.907056]  [<ffffffff81540748>] ? dev_gro_receive+0x1f8/0x430
[  206.925189]  [<ffffffff8119d993>] ? kmem_cache_free+0x123/0x370
[  206.943303]  [<ffffffff810ed400>] ? trace_hardirqs_on_ca[  206.962279]  
[<ffffffff81541026>] ? napi_gro_receive+0x56/0x150
[  206.980487]  [<ffffffffa007a8a5>] ? e1000_receive_skb+0x75/0xf0 [e1000e]
[  206.999791]  [<ffffffffa007d7a8>] ? e1000_clean_rx_irq+0x298/0x4a0 [e1000e]
[  207.019536]  [<ffffffffa007fd28>] ? e1000e_poll+0x88/0x2e0 [e1000e]
[  207.038513]  [<ffffffff81540b78>] ? net_rx_action+0xd8/0x280
[  207.056877]  [<ffffffff81540bd5>] ? net_rx_action+0x135/0x280
[  207.075231]  [<ffffffff81096bd9>] ? __do_softirq+0x119/0x2d0
[  207.093466]  [<ffffffff81096efd>] ? irq_exit+0xed/0x100
[  207.111271]  [<ffffffff813b742f>] ? xen_evtchn_do_upcall+0x2f/0x[  
207.130320]  [<ffffffff816aac1e>] ? xen_do_hypervisor_callback+0x1e/0x30
[  207.149980]  [<ffffffff811ca561>] ? fget_light+[  207.168130]  
[<ffffffff811ca531>] ? fget_light+0x71/0x150
[  207.186070]  [<ffffffff811ca4fb>] ? fget_light+0x3b/0x150
[  207.203830]  [<ffffffff811c067e>] ? do_select+0x36e/0x6e0
[  207.221524]  [<ffffffff811c0310>] ? select_estimate_accuracy+007.240538]  
[<ffffffff811c00d0>] ? poll_freewait+0x90/0x90
[  207.258534]  [<ffffffff811c01c0>] ? __pollwait+0xf0/0xf0
[  207.276217]  [<ffffffff815392bd>] ? net_rps_action_and_irq_enable+0x8d/0xa0
[  207.295849]  [<ffffffff8111d882>] ? __irq_get_desc_lock+0x62/0xb0
[  207.314629]  [<ffffffff810c122d>] ? __wake_up+0x2d/0x7
[  207.332497]  [<ffffffff810edfde>] ? __lock_acquire+0x7be/0x17e0
[  207.351329]  [<ffffffff810eda39>] ? __lock_acquire+0x219[  207.370039]  
[<ffffffff810ef1c8>] ? lock_release_non_nested+0xa8/0x360
[  207.389539]  [<ffffffff8131668e>] ? do_raw_spin_u[  207.408549]  
[<ffffffff81178b6e>] ? might_fault+0x4e/0xa0
[  207.426757]  [<ffffffff81178b6e>] ? might_fault+0x4e/0xa0
[  207.444973]  [<ffffffff81178b6e>] ? might_fault+0x4e/0xa0
[  207.463071]  [<ffffffff810ef570>] ? lock_release+0xf0/0x250
[  207.481346]  [<ffffffff811c143c>] ? core_sys_select+0x21c/0x350
[  207.499957]  [<ffffffff811c1268>] ? core_sys_select+0x48/0x350
[  207.518281]  [<ffffffff810427d9>] ? xen_clocksource_read+0x39/0x50
[  207.536978]  [<ffffffff810ef1c8>] ? lock_release_non_nested+0xa8/0x360
[  207.556248]  [<ffffffff810e998d>] ? trace_hardirqs_off+0xd/0x10
[  207.574735]  [<ffffffff81125367>] ? rcu_irq_exit+0x87/0xe0
[  207.592900]  [<ffffffff81178b6e>] ? might_fault+0x4e/0xa0
[  207.610858]  [<ffffffff810427d9>] ? xen_clocksource_read+0x39/0x50
[  207.629792]  [<ffffffff810429a9>] ? xen_clocksource_g[  207.649195]  
[<ffffffff810dfa77>] ? ktime_get_ts+0x47/0xf0
[  207.667330]  [<ffffffff811c17d2>] ? SyS_select+0x42/0x110
[  207.685402]  [<ffffffff816a9769>] ? system_call_fastpath+0x16/0x1b
[  207.704362] Code: 5f c9 c3 66 0f 1f 44 00 00 55 31 c000 48 89 e5 8b 52 18 
<48> 8b 14 d5 80 87 cb 81 48 39 bc 10 98 08 00 00 c9 0f 94 c0 0f 
[  207.749634] RIP  [<ffffffff810bed42>[  207.768941]  RSP <ffff880012404240>
[  207.786147] CR2: 00000002e66c9780
[  207.803299] ---[ end trace f347e5b235e48095 ]---
[  207.821933] Kernel panic - not syncing: Fatal exception in interrupt
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.


If anybody has time to run a git bisect to help identify the culprit,
it would be very welcome.
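
As a rough guide to the effort involved, here is a small sketch that estimates
the number of bisect steps from the kernel release string in the log. It
assumes the "-10936-g51a26ae" suffix is the usual git-describe style suffix
(commits since the last tag, then the abbreviated commit id) and that a bisect
roughly halves the range on every step; real history is not linear, so treat
the result as an approximation. The helper names are made up for illustration.

```python
import math
import re


def parse_release_suffix(release):
    """Return (commits_since_tag, short_commit_id) from a release string.

    Assumes a trailing "-<count>-g<hex id>" suffix in git-describe style.
    """
    m = re.search(r"-(\d+)-g([0-9a-f]+)$", release)
    if m is None:
        raise ValueError("no git-describe style suffix in %r" % release)
    return int(m.group(1)), m.group(2)


def estimated_bisect_steps(n_commits):
    """Approximate build-and-test rounds, assuming each one halves the range."""
    if n_commits < 1:
        raise ValueError("need at least one candidate commit")
    return max(1, math.ceil(math.log2(n_commits)))


if __name__ == "__main__":
    count, commit = parse_release_suffix("3.9.0upstream-10936-g51a26ae")
    steps = estimated_bisect_steps(count)
    print("bad commit %s, %d candidates, about %d steps" % (commit, count, steps))
```

With the release string from the log this comes out at about 14 builds, with
v3.9 as the known-good end and 51a26ae as the known-bad end.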

Attachment: tst035
Description: Text document
