
[Xen-users] how to debug hardware lockups?


  • To: xen-users <xen-users@xxxxxxxxxxxxxxxxxxx>
  • From: "Rudi Ahlers" <rudiahlers@xxxxxxxxx>
  • Date: Sat, 15 Nov 2008 10:23:07 +0200
  • Delivery-date: Sat, 15 Nov 2008 00:23:47 -0800
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

Hi,

We have a server that has locked up about once a week for the past 3
weeks, without any warning. The only way to recover it is to reset the
server, which causes unwanted downtime and often software loss as
well.

How do I debug the server, which runs CentOS 5.2, to see why it locks
up? The CPU is an Intel Core 2 Quad Q9300, with 8 GB RAM, on an Intel
motherboard.

The last few log entries before the server froze are:


Nov 15 07:15:20 saturn snmpd[2527]: Connection from UDP: [127.0.0.1]:59008
Nov 15 07:15:20 saturn snmpd[2527]: Received SNMP packet(s) from UDP: [127.0.0.1]:59008
Nov 15 07:15:20 saturn snmpd[2527]: Connection from UDP: [127.0.0.1]:47729
Nov 15 07:15:20 saturn snmpd[2527]: Received SNMP packet(s) from UDP: [127.0.0.1]:47729
Nov 15 07:15:20 saturn snmpd[2527]: Connection from UDP: [127.0.0.1]:47890
Nov 15 07:15:20 saturn snmpd[2527]: Received SNMP packet(s) from UDP: [127.0.0.1]:47890
Nov 15 07:15:20 saturn snmpd[2527]: Connection from UDP: [127.0.0.1]:50023
Nov 15 07:15:20 saturn snmpd[2527]: Received SNMP packet(s) from UDP: [127.0.0.1]:50023
Nov 15 07:15:20 saturn snmpd[2527]: Connection from UDP: [127.0.0.1]:58459
Nov 15 07:15:20 saturn snmpd[2527]: Received SNMP packet(s) from UDP: [127.0.0.1]:58459
Nov 15 10:10:10 saturn syslogd 1.4.1: restart.
Nov 15 10:10:11 saturn kernel: klogd 1.4.1, log source = /proc/kmsg started.
Nov 15 10:10:11 saturn kernel: Bootdata ok (command line is ro root=/dev/System/root)
Nov 15 10:10:11 saturn kernel: Linux version 2.6.18-92.1.17.el5xen (mockbuild@xxxxxxxxxxxxxxxxxxxx) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-42)) #1 SMP Tue Nov 4 14:13:09 EST 2008
Nov 15 10:10:11 saturn kernel: BIOS-provided physical RAM map:
Nov 15 10:10:11 saturn kernel:  Xen: 0000000000000000 - 00000001ef958000 (usable)
Nov 15 10:10:11 saturn kernel: DMI 2.4 present.
Nov 15 10:10:11 saturn kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Nov 15 10:10:11 saturn kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
Nov 15 10:10:11 saturn kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Nov 15 10:10:11 saturn kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
Nov 15 10:10:11 saturn kernel: ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
Nov 15 10:10:11 saturn kernel: ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
Nov 15 10:10:11 saturn kernel: ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
Nov 15 10:10:11 saturn kernel: IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
Nov 15 10:10:11 saturn kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
Nov 15 10:10:11 saturn kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
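
The excerpt above shows nothing logged between the last snmpd entry and
the syslogd restart, so one way to narrow down when each lockup began
is to scan the log for long silences. Below is a minimal sketch in
Python, not a definitive tool. The names parse_ts and find_gaps, the
30-minute threshold, and the hard-coded year are all assumptions made
for illustration (syslog timestamps of this style carry no year), and
it assumes English month names in the log.

```python
from datetime import datetime, timedelta


def parse_ts(line, year=2008):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp.

    Returns None for lines without one, such as wrapped continuation
    lines. The year is an assumption: the log itself does not record it.
    """
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_gaps(lines, threshold=timedelta(minutes=30)):
    """Return (last_seen, resumed, gap) for each silence above threshold."""
    gaps = []
    prev = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # skip lines that do not start with a timestamp
        if prev is not None and ts - prev > threshold:
            gaps.append((prev, ts, ts - prev))
        prev = ts
    return gaps


# Two entries taken from the excerpt above, plus one wrapped line.
sample = [
    "Nov 15 07:15:20 saturn snmpd[2527]: Received SNMP packet(s) from UDP:",
    "[127.0.0.1]:58459",
    "Nov 15 10:10:10 saturn syslogd 1.4.1: restart.",
]

for last_seen, resumed, gap in find_gaps(sample):
    print(f"silent from {last_seen} until {resumed} ({gap})")
```

On the two entries above this reports a silence of 2:54:50. To run it
against a whole log, pass an open file instead of the sample list, for
example the /var/log/messages file that CentOS normally writes to
(assumed here, since the post does not name the log file). A gap only
bounds the time of the freeze; it does not explain the cause.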

-- 

Kind Regards
Rudi Ahlers

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

