
[Xen-users] Re: CPU soft lockup XEN 4.1rc (Solved)



All,

Ha - finally - solved. Guess Google is not the answer; searching the
mailing list is. After much frustration I found the following:

http://wiki.debian.org/Xen#A.27clocksource.2BAC8-0.3ATimewentbackwards.27

based on a post by Marco Marongiu

http://my.opera.com/marcomarongiu/blog/2010/08/18/debugging-ntp-again-part-4-and-last

For me lockup solution #2 worked:

# DomU and Dom0
# in /etc/sysctl.conf
clocksource=jiffies
independent_wallclock=0
# then sysctl -p

# in /etc/xen/*.conf
extra="clocksource=jiffies"
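
One way to confirm which clocksource is actually active after a reboot is
to read the kernel's sysfs files. The sketch below assumes the standard
Linux location /sys/devices/system/clocksource/clocksource0; the function
name show_clocksource and its optional directory argument are illustrative
only and do not come from the wiki page or from Xen itself.

```shell
# Sketch: print the active and the available kernel clocksources.
# The optional first argument (a directory) is an assumption added here so
# the function can also be pointed at a copy of the sysfs files.
show_clocksource() {
    dir="${1:-/sys/devices/system/clocksource/clocksource0}"
    for name in current_clocksource available_clocksource; do
        if [ -r "$dir/$name" ]; then
            printf '%s: %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s: (not readable)\n' "$name"
        fi
    done
}
```

Running show_clocksource in Dom0 and in each DomU should report jiffies as
the current clocksource if the settings above were picked up.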

And voila - no more lockups, and nothing to do with the motherboards (which
I did not think were the cause, given the success with non-Xen configurations).

Not sure if this is a kernel or XEN problem though.

Hope this helps others

On 8/31/2011 2:42 PM, Mark Brown wrote:
> Hello,
> 
> Similar to others I have freeze-ups on the system; they are consistent with
> high IO load. If the system just runs, even with multiple XenU guests, it
> does not happen. But I can consistently force the situation to occur.
> 
> Running 4 dd processes, each dumping 20GB onto an LVM/mdadm software RAID5
> volume, consistently crashes the system in a DomU. Running without Xen I
> do not see the problem at all (e.g. nothing happened after about 3TB of
> read/write).
> 
> Any suggestion would be very welcome.
> 
> Marc
> 
> [ .. more .. ]
> It is very unpredictable when it actually occurs; here are a few
> examples. Kind of odd that on Aug 29th it always happened on the
> same second ;-{.
> 
>> syslog.2:Aug 29 17:35:47 nwsc-xen-Q45 kernel: [ 2698.560009] BUG: soft 
>> lockup - CPU#0 stuck for 146s! [events/0:9]
>> syslog.2:Aug 29 17:35:47 nwsc-xen-Q45 kernel: [ 2698.561016] BUG: soft 
>> lockup - CPU#1 stuck for 146s! [rsyslogd:2024]
>> syslog.2:Aug 29 22:57:27 nwsc-xen-Q45 kernel: [ 4198.404353] BUG: soft 
>> lockup - CPU#0 stuck for 122s! [md1_raid5:1243]
>> syslog.2:Aug 29 23:07:27 nwsc-xen-Q45 kernel: [ 4798.336110] BUG: soft 
>> lockup - CPU#0 stuck for 101s! [xend:2583]
>> syslog.2:Aug 29 23:07:27 nwsc-xen-Q45 kernel: [ 4798.337007] BUG: soft 
>> lockup - CPU#1 stuck for 101s! [bdi-default:19]
>> syslog.2:Aug 29 23:12:27 nwsc-xen-Q45 kernel: [ 5098.304013] BUG: soft 
>> lockup - CPU#0 stuck for 136s! [blkback.5.xvdd1:7226]
>> syslog.2:Aug 29 23:12:27 nwsc-xen-Q45 kernel: [ 5098.305010] BUG: soft 
>> lockup - CPU#1 stuck for 136s! [sh:7262]
>> syslog.6:Aug 17 12:07:08 nwsc-xen-Q45 kernel: [ 2998.596016] BUG: soft 
>> lockup - CPU#0 stuck for 73s! [xend:2506]
>> syslog.6:Aug 17 12:07:08 nwsc-xen-Q45 kernel: [ 2998.597555] BUG: soft 
>> lockup - CPU#1 stuck for 73s! [md0_raid5:598]
>> syslog.6:Aug 17 12:17:08 nwsc-xen-Q45 kernel: [ 3598.534068] BUG: soft 
>> lockup - CPU#1 stuck for 150s! [xend:2506]
> 
> It does not appear to relate to a specific process. (Those above are
> from Xen 4.0.1 with Debian 2.6.32-5-xen-amd64).
> 
> This one is with Xen 4.1.2-rc2-pre/Debian 2.6.32-5-xen-amd64. Both are
> on Intel DQ45CB board with 4GB ram.
> 
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348062] BUG: soft lockup - CPU#0 
>> stuck for 79s! [xend:2767]
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348073] Modules linked in: 
>> xt_tcpudp xt_physdev iptable_filter ip_tables x_tables ext4 jbd2 crc16 
>> sata_sil24 hid_apple sky2 via_velocity crc_ccitt usb_storage raid456 
>> md_mod async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy 
>> async_tx dm_mod ext3 jbd mbcache firewire_sbp2 loop sr_mod cdrom sg xenfs 
>> xen_evtchn bridge stp 3w_9xxx usbhid hid sd_mod crc_t10dif 
>> snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss 
>> snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq 
>> snd_timer snd_seq_device firewire_ohci psmouse i2c_i801 video 
>> firewire_core uhci_hcd ata_piix snd crc_itu_t output serio_raw evdev ahci 
>> pcspkr ehci_hcd i2c_core usbcore nls_base e1000e button ata_generic 
>> soundcore snd_page_alloc libata thermal scsi_mod processor thermal_sys 
>> acpi_processor
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348219] CPU 0:
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348222] Modules linked in: 
>> xt_tcpudp xt_physdev iptable_filter ip_tables x_tables ext4 jbd2 crc16 
>> sata_sil24 hid_apple sky2 via_velocity crc_ccitt usb_storage raid456 
>> md_mod async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy 
>> async_tx dm_mod ext3 jbd mbcache firewire_sbp2 loop sr_mod cdrom sg xenfs 
>> xen_evtchn bridge stp 3w_9xxx usbhid hid sd_mod crc_t10dif 
>> snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss 
>> snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq 
>> snd_timer snd_seq_device firewire_ohci psmouse i2c_i801 video 
>> firewire_core uhci_hcd ata_piix snd crc_itu_t output serio_raw evdev ahci 
>> pcspkr ehci_hcd i2c_core usbcore nls_base e1000e button ata_generic 
>> soundcore snd_page_alloc libata thermal scsi_mod processor thermal_sys 
>> acpi_processor
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348318] Pid: 2767, comm: xend 
>> Not tainted 2.6.32-5-xen-amd64 #1
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348322] RIP: 
>> e033:[<00007fa4064c0289>]  [<00007fa4064c0289>] 0x7fa4064c0289
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348330] RSP: 
>> e02b:00007fa402ee54a0  EFLAGS: 00000206
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348334] RAX: 0000000001c3a320 
>> RBX: 0000000001f8ace0 RCX: 00007fa40650f844
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348338] RDX: ffffffffffffffe0 
>> RSI: 0000000000000000 RDI: 00007fa4067a9e40
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348341] RBP: 0000000000000000 
>> R08: 0000000000000008 R09: 0000000000000001
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348345] R10: 0000000000000000 
>> R11: 0000000000000246 R12: 00007fa4067a9e40
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348349] R13: 00007fa402ee555c 
>> R14: 00007fa402ee5548 R15: 00000000ffffffff
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348356] FS:  
>> 00007fa402ee6700(0000) GS:ffff880002995000(0000) knlGS:0000000000000000
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348360] CS:  e033 DS: 0000 ES: 
>> 0000 CR0: 000000008005003b
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348363] CR2: 00007fb2ed832e28 
>> CR3: 00000000bba8e000 CR4: 0000000000002660
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348367] DR0: 0000000000000000 
>> DR1: 0000000000000000 DR2: 0000000000000000
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348371] DR3: 0000000000000000 
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348375] Call Trace:
>>
>> Aug 31 13:07:51 nwsc-xen-Q45 init: Id "T1" respawning too fast: disabled for 
>> 5 minutes
> 
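
To check whether lockups like the ones quoted above still occur after the
change, the syslog files can be scanned for the "BUG: soft lockup" message.
This is only a rough sketch: the function name summarize_soft_lockups is
made up for illustration, and it assumes each kernel message sits on a
single line in the log file (the line breaks in the excerpts above come
from the mail client, not from syslog).

```shell
# Sketch: count "soft lockup" reports per CPU in the given log files.
# Pass one or more log files as arguments; with no arguments grep reads
# standard input instead.
summarize_soft_lockups() {
    grep -h 'BUG: soft lockup' "$@" |
        awk '{
            # The CPU appears as its own field, e.g. "CPU#0".
            for (i = 1; i <= NF; i++)
                if ($i ~ /^CPU#[0-9]+$/) count[$i]++
        }
        END { for (cpu in count) print cpu, count[cpu] }' |
        sort
}
```

For example, summarize_soft_lockups /var/log/syslog prints one line per
CPU with the number of reports found; no output means no lockups were
logged in that file.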




 

