
[Xen-users] Lockup problem, watchdog, other ways to debug?


  • To: "xen-users@xxxxxxxxxxxxxxxxxxx" <xen-users@xxxxxxxxxxxxxxxxxxx>
  • From: "Andrew Lyon" <andrew.lyon@xxxxxxxxx>
  • Date: Tue, 21 Oct 2008 19:36:29 +0100
  • Delivery-date: Tue, 21 Oct 2008 11:37:09 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

Hi,

I have been trying to figure out why my Supermicro X7DWA-N with dual
Xeon E5420s locks up occasionally when running Xen. I've tried Xen
3.2.1 and 3.3 with the Xensource 2.6.18 kernel, with Gentoo 2.6.21,
and with 2.6.25 and 2.6.27 kernels patched with the SUSE Xen patches.
Every single combination suffers from occasional lockups :(

I am reasonably confident that the hardware is OK: memtest86+ runs for
days at a time without a single error, and I've also installed Windows
Server 2008 and openSUSE 11 (non-Xen kernel) on the bare metal and run
an extremely intensive set of tests for several days without a single
problem. The lockups happen more often under load, but not exclusively;
I've seen one happen minutes after bootup with no domUs running.

The kernel is usually tainted by the NVIDIA binary driver, but the
lockups still happen in text mode with no nvidia module loaded. I have
tried loading iTCO_wdt and/or hangcheck_timer in dom0 to get the system
to reboot automatically when it locks up, but neither of them has ever
rebooted the system after a lockup. Is there a watchdog timer in Xen
that I could use?
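
One thing worth ruling out on the iTCO_wdt side: with the standard
Linux watchdog interface, loading the driver does not normally start
the countdown. The timer is armed when a process opens /dev/watchdog
and must then be written to periodically. Below is a minimal sketch of
such a heartbeat process, assuming the driver exposes the usual
/dev/watchdog character device. The function name pet_watchdog, the
10-second interval, and the beats and disarm parameters are
illustrative assumptions, not part of any Xen or kernel tooling.

```python
import time


def pet_watchdog(device="/dev/watchdog", interval=10.0, beats=None, disarm=True):
    """Write a heartbeat to a watchdog device every `interval` seconds.

    beats=None loops until interrupted; an integer stops after that many
    heartbeats. The defaults are assumptions chosen for illustration.
    """
    count = 0
    # Opening the device is what arms the timer on standard drivers.
    # buffering=0 makes every write reach the driver immediately.
    with open(device, "wb", buffering=0) as wd:
        try:
            while beats is None or count < beats:
                wd.write(b"\0")  # any write resets the countdown
                count += 1
                time.sleep(interval)
        finally:
            if disarm:
                # "Magic close": drivers that support it stop the timer
                # when they see 'V' just before the device is closed.
                wd.write(b"V")
    return count
```

Run as root in dom0 with pet_watchdog() and left running, this should
let the hardware timer expire and reset the machine if dom0 stops
scheduling the process. The interval has to be shorter than the
driver's timeout, and disarm=False keeps the timer running even if the
script itself dies.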

I set up a serial console and can access the dom0 SysRq and the Xen
console, but neither is responsive after a lockup.

The X7DWA-N board has a header for an NMI button. Could that be used
to force a crash in order to get some debug info?

I read that FireWire supports DMA and can be used to debug a system
after a crash or lockup. Is this possible with Xen?

I think I have tried everything but any suggestions would be appreciated.

Thanks in advance.
Andy
