Re: [Xen-users] BUG: soft lockup
- To: Dana Rawding <dana@xxxxxxxxxxx>
- From: alex <alex.faq8@xxxxxxxxx>
- Date: Tue, 2 Feb 2010 23:58:07 +0300
- Cc: Xen List <xen-users@xxxxxxxxxxxxxxxxxxx>
- Delivery-date: Tue, 02 Feb 2010 12:59:32 -0800
- List-id: Xen user discussion <xen-users.lists.xensource.com>
I have this problem too, on Xen 3.3.1 with Debian Lenny. The load average
on the server climbs to 10-15, all domUs freeze, and I can't do anything.
Please test this: I fixed the problem with xm sched-credit -d 0 -w 512 .
[787717.425090] BUG: soft lockup - CPU#0 stuck for 61s! [watchdog/0:5]
[787717.425090] Modules linked in: xt_tcpudp xt_physdev iptable_filter
ip_tables x_tables tun bridge ipv6 nfsd auth_rpcgss exportfs nfs lockd
nfs_acl sunrpc loop joydev igb psmouse pcspkr i2c_i801 serio_raw button
i2c_core evdev dca ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod
sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock
sd_mod ide_pci_generic ide_core ehci_hcd uhci_hcd 3w_9xxx scsi_mod
thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[787717.432148] CPU 0:
[787717.432148] Modules linked in: xt_tcpudp xt_physdev iptable_filter
ip_tables x_tables tun bridge ipv6 nfsd auth_rpcgss exportfs nfs lockd
nfs_acl sunrpc loop joydev igb psmouse pcspkr i2c_i801 serio_raw button
i2c_core evdev dca ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod
sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock
sd_mod ide_pci_generic ide_core ehci_hcd uhci_hcd 3w_9xxx scsi_mod
thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[787717.436173] Pid: 5, comm: watchdog/0 Not tainted 2.6.26-1-xen-amd64 #1
[787717.436173] RIP: e030:[<ffffffff8025ed13>] [<ffffffff8025ed13>] watchdog+0xbe/0x1cf
[787717.436173] RSP: e02b:ffff880bce0d9ef0 EFLAGS: 00000207
[787717.436173] RAX: 0000000000000001 RBX: ffff880bcb4e5400 RCX: 0002cc64939f91fe
[787717.436173] RDX: ffff880081656000 RSI: ffffffff804fe460 RDI: ffffffff8053a000
[787717.436173] RBP: ffff880bcb4e5400 R08: ffff880001be3040 R09: ffff880bce0d9e30
[787717.436173] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000399
[787717.436173] R13: 00000000000b3192 R14: 0000000000000000 R15: 0000000000000000
[787717.436173] FS: 00007f0cfbb3e6e0(0000) GS:ffffffff80539000(0000) knlGS:0000000000000000
[787717.436173] CS: e033 DS: 0000 ES: 0000
[787717.436173] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[787717.436173] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[787717.436173]
[787717.436173] Call Trace:
[787717.436173] [<ffffffff8025ec55>] ? watchdog+0x0/0x1cf
[787717.436173] [<ffffffff8023f56b>] ? kthread+0x47/0x74
[787717.436173] [<ffffffff8022839f>] ? schedule_tail+0x27/0x5c
[787717.436173] [<ffffffff8020be28>] ? child_rip+0xa/0x12
[787717.436173] [<ffffffff8023f524>] ? kthread+0x0/0x74
[787717.436173] [<ffffffff8020be1e>] ? child_rip+0x0/0x12
[787717.436173]
I fixed this problem with xm sched-credit -d 0 -w 512 , which raises dom0's credit-scheduler weight to 512 (the default is 256).
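If you want to apply the command above from a script, here is a minimal sketch in Python. The helper names sched_credit_args and set_dom0_weight are made up for illustration; the only xm options used are -d and -w, as in the command above. It assumes it runs as root in dom0 with xm on the PATH, and I have only tried the command by hand, not this wrapper.

```python
import subprocess


def sched_credit_args(domain, weight=None):
    """Build the argument list for `xm sched-credit`.

    With weight=None the list only names the domain, which asks xm to
    report the current settings instead of changing them.
    """
    args = ["xm", "sched-credit", "-d", str(domain)]
    if weight is not None:
        args += ["-w", str(weight)]
    return args


def set_dom0_weight(weight=512):
    """Give dom0 (domain 0) the requested credit-scheduler weight.

    Assumption: called as root in dom0. Raises CalledProcessError if
    xm exits with a non-zero status.
    """
    subprocess.check_call(sched_credit_args(0, weight))
```

Calling set_dom0_weight() is then equivalent to typing xm sched-credit -d 0 -w 512 at the dom0 shell.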
2010/1/31 Dana Rawding <dana@xxxxxxxxxxx>
Hi all,
I've been experiencing a rash of CPU lockups on a number of domUs recently. It's been happening on two different servers. About a year ago I had this problem every once in a while, but it was not frequent. I was running Ubuntu with Xen 3.1 and kernel 2.6.24-18 back then. I'm now running Xen 3.3 and kernel 2.6.24-26.
What I have noticed is that just prior to the lockups the domUs had high CPU loads. The domU that I have the most problems with is a Zimbra server. My guess is that a rash of spam comes through and CPU load gets high, then the CPUs lock up. Originally I had it running with 1 CPU but have since upped it to 2 and then 3 CPUs.
I have been collecting the lockup messages and have posted a few below. Any ideas? Recommendations?
Thanks,
Dana
[138077.172283] =======================
[138075.147398] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:97]
[138075.147411]
[138075.147419] Pid: 97, comm: kswapd0 Tainted: G D (2.6.24-26-xen #1)
[138075.147426] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 0
[138075.147441] EIP is at _spin_lock+0x7/0x10
[138075.147447] EAX: c1da48ec EBX: 00000000 ECX: 220c7000 EDX: 00000000
[138075.147453] ESI: 8b804067 EDI: c1da48ec EBP: 00000f28 ESP: ed707dec
[138075.147459] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[138075.147471] CR0: 8005003b CR2: 080f0010 CR3: 2213b000 CR4: 00000660
[138075.147482] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[138075.147488] DR6: ffff0ff0 DR7: 00000400
[138075.147495] [<c01773cb>] page_check_address+0x1cb/0x3c0
[138075.147514] [<c0119868>] xen_invlpg_mask+0x38/0x40
[138075.147529] [<c017762e>] page_referenced_one+0x6e/0x190
[138075.147541] [<c017875c>] page_referenced+0xec/0x130
[138075.147552] [<c01671cf>] shrink_active_list+0x18f/0x5c0
[138075.147567] [<c016826d>] shrink_zone+0xdd/0x100
[138075.147578] [<c01688cc>] kswapd+0x44c/0x490
[138075.147589] [<c013bb00>] autoremove_wake_function+0x0/0x40
[138075.147603] [<c011e270>] complete+0x40/0x60
[138075.147614] [<c0168480>] kswapd+0x0/0x490
[138075.147625] [<c013b842>] kthread+0x42/0x70
[138075.147635] [<c013b800>] kthread+0x0/0x70
[138075.147646] [<c0105bb7>] kernel_thread_helper+0x7/0x10
[138075.147658] =======================
[138088.987826] BUG: soft lockup - CPU#1 stuck for 11s! [java:23215]
[138088.987841]
[138088.987846] Pid: 23215, comm: java Tainted: G D (2.6.24-26-xen #1)
[138088.987850] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 1
[138088.987862] EIP is at _spin_lock+0x7/0x10
[138088.987866] EAX: c1da48ec EBX: 00000000 ECX: c1da48e0 EDX: 00000ca8
[138088.987870] ESI: 8b804067 EDI: 00000000 EBP: e20c7ca8 ESP: e226be04
[138088.987873] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[138088.987883] CR0: 80050033 CR2: 940ef020 CR3: 2211f000 CR4: 00000660
[138088.987891] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[138088.987896] DR6: ffff0ff0 DR7: 00000400
[138088.987901] [<c016d88d>] unmap_vmas+0x43d/0xae0
[138088.987922] [<c011959c>] kmap_atomic+0x1c/0x30
[138088.987941] [<c01192fd>] kunmap_atomic+0x3d/0x60
[138088.987957] [<c0173ee8>] vma_adjust+0x1c8/0x440
[138088.987967] [<c0173765>] unmap_region+0x95/0x120
[138088.987975] [<c0174387>] do_munmap+0x147/0x1f0
[138088.987983] [<c0174c90>] mmap_region+0x70/0x450
[138088.987991] [<c01db3b7>] security_file_mmap+0x27/0x30
[138088.988001] [<c0175472>] do_mmap_pgoff+0x312/0x330
[138088.988008] [<c010a02b>] sys_mmap2+0xbb/0xd0
[138088.988016] [<c0105832>] syscall_call+0x7/0xb
[138088.988023] [<c0320000>] svc_accept+0x150/0x410
[138088.988032] =======================
[66916.451144] BUG: soft lockup - CPU#0 stuck for 11s! [java:2758]
[66928.193453] BUG: soft lockup - CPU#1 stuck for 11s! [java:3419]
[336990.703192] BUG: soft lockup - CPU#1 stuck for 11s! [ps:32586]
[336990.703206]
[336990.703214] Pid: 32586, comm: ps Tainted: G D (2.6.24-26-xen #1)
[336990.703221] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 1
[336990.703235] EIP is at _spin_lock+0x7/0x10
[336990.703241] EAX: c1dbc72c EBX: 00000000 ECX: c1dbc720 EDX: 00000007
[336990.703247] ESI: 57b51067 EDI: 00000001 EBP: e2cb93c8 ESP: e2033e4c
[336990.703253] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[336990.703266] CR0: 80050033 CR2: 08079004 CR3: 23651000 CR4: 00000660
[336990.703275] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[336990.703282] DR6: ffff0ff0 DR7: 00000400
[336990.703288] [<c0171646>] handle_mm_fault+0xae6/0x1360
[336990.703307] [<c020e057>] rb_insert_color+0x77/0xe0
[336990.703325] [<c032a27e>] do_page_fault+0x35e/0xe70
[336990.703337] [<c01745d4>] vma_merge+0x144/0x1d0
[336990.703349] [<c0174b75>] do_brk+0x195/0x240
[336990.703362] [<c0175126>] sys_brk+0xb6/0xf0
[336990.703374] [<c0329f20>] do_page_fault+0x0/0xe70
[336990.703384] [<c0328bc5>] error_code+0x35/0x40
[336990.703396] =======================
[337005.938292] BUG: soft lockup - CPU#2 stuck for 11s! [zmlocalconfig:11371]
[337005.938306]
[337005.938312] Pid: 11371, comm: zmlocalconfig Tainted: G D (2.6.24-26-xen #1)
[337005.938318] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 2
[337005.938330] EIP is at _spin_lock+0x7/0x10
[337005.938335] EAX: ec64a870 EBX: ec64a870 ECX: 00000002 EDX: ec64a871
[337005.938339] ESI: 00000000 EDI: c03fe800 EBP: c1261e38 ESP: c1261d7c
[337005.938343] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[337005.938357] CR0: 8005003b CR2: 08128000 CR3: 25d8e000 CR4: 00000660
[337005.938364] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[337005.938370] DR6: ffff0ff0 DR7: 00000400
[337005.938376] [<c01771f0>] page_lock_anon_vma+0x20/0x30
[337005.938391] [<c01786fd>] page_referenced+0x8d/0x130
[337005.938401] [<c01671cf>] shrink_active_list+0x18f/0x5c0
[337005.938411] [<c0164286>] get_dirty_limits+0x16/0x200
[337005.938421] [<ee04b38e>] mb_cache_shrink_fn+0x1e/0x100 [mbcache]
[337005.938435] [<c016826d>] shrink_zone+0xdd/0x100
[337005.938444] [<c0168d72>] try_to_free_pages+0x152/0x250
[337005.938453] [<c0162fcb>] __alloc_pages+0x14b/0x390
[337005.938463] [<c01855c5>] do_sync_read+0xd5/0x120
[337005.938475] [<c0163247>] __get_free_pages+0x37/0x50
[337005.938483] [<c0124496>] copy_process+0xa6/0x1210
[337005.938493] [<c0197c34>] d_alloc+0x114/0x1a0
[337005.938503] [<c0125830>] do_fork+0x40/0x260
[337005.938511] [<c0210f00>] copy_to_user+0x30/0x60
[337005.938523] [<c0103226>] sys_clone+0x36/0x40
[337005.938530] [<c0105832>] syscall_call+0x7/0xb
[337005.938542] =======================
[336990.803889] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:103]
[336990.803907]
[336990.803915] Pid: 103, comm: kswapd0 Tainted: G D (2.6.24-26-xen #1)
[336990.803922] EIP: 0061:[<c03286ea>] EFLAGS: 00000286 CPU: 0
[336990.803940] EIP is at _spin_lock+0xa/0x10
[336990.803948] EAX: c1dbc86c EBX: 00000000 ECX: 22cc3000 EDX: 00000000
[336990.803955] ESI: 57b47067 EDI: c1dbc86c EBP: 00000ff0 ESP: ed725dec
[336990.803961] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[336990.803976] CR0: 8005003b CR2: b791e6d9 CR3: 23e3b000 CR4: 00000660
[336990.803986] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[336990.803992] DR6: ffff0ff0 DR7: 00000400
[336990.804001] [<c01773cb>] page_check_address+0x1cb/0x3c0
[336990.804026] [<c017762e>] page_referenced_one+0x6e/0x190
[336990.804039] [<c017875c>] page_referenced+0xec/0x130
[336990.804049] [<c01671cf>] shrink_active_list+0x18f/0x5c0
[336990.804064] [<c0210556>] memmove+0x36/0x40
[336990.804079] [<c0164286>] get_dirty_limits+0x16/0x200
[336990.804089] [<c0139857>] call_rcu+0x97/0xa0
[336990.804102] [<ee04b38e>] mb_cache_shrink_fn+0x1e/0x100 [mbcache]
[336990.804120] [<c016826d>] shrink_zone+0xdd/0x100
[336990.804132] [<c01688cc>] kswapd+0x44c/0x490
[336990.804145] [<c013bb00>] autoremove_wake_function+0x0/0x40
[336990.804160] [<c011e270>] complete+0x40/0x60
[336990.804172] [<c0168480>] kswapd+0x0/0x490
[336990.804183] [<c013b842>] kthread+0x42/0x70
[336990.804194] [<c013b800>] kthread+0x0/0x70
[336990.804206] [<c0105bb7>] kernel_thread_helper+0x7/0x10
[336990.804218] =======================
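When collecting many of these messages it can help to tally them instead of reading each trace. The following is a rough sketch in Python, not a tested tool: summarize_lockups is a hypothetical helper, the regular expressions are written against the 32-bit message format shown above ("BUG: soft lockup" and "EIP is at"), and the 64-bit "RIP:" format from the first trace is not handled.

```python
import re
from collections import Counter

# Header line, e.g.
# "[138075.147398] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:97]"
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)
# 32-bit instruction pointer line, e.g. "EIP is at _spin_lock+0x7/0x10"
EIP_RE = re.compile(r"EIP is at (?P<sym>\w+)\+0x")


def summarize_lockups(log_text):
    """Count soft lockups per process name and per stuck kernel symbol."""
    by_comm = Counter()
    by_symbol = Counter()
    for line in log_text.splitlines():
        m = LOCKUP_RE.search(line)
        if m:
            by_comm[m.group("comm")] += 1
            continue
        m = EIP_RE.search(line)
        if m:
            by_symbol[m.group("sym")] += 1
    return by_comm, by_symbol


# Four lines copied from the traces above, used as sample input.
SAMPLE = """\
[138075.147398] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:97]
[138075.147441] EIP is at _spin_lock+0x7/0x10
[138088.987826] BUG: soft lockup - CPU#1 stuck for 11s! [java:23215]
[138088.987862] EIP is at _spin_lock+0x7/0x10
"""

by_comm, by_symbol = summarize_lockups(SAMPLE)
print(dict(by_comm))
print(dict(by_symbol))
```

Run against a saved dmesg or kern.log file, this would show at a glance whether every lockup is spinning in the same place, as the traces above all are (_spin_lock).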
-- Best Regards, alex.faq8@xxxxxxxxx
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users