[Xen-users] Lockup problem, watchdog, other ways to debug?
Hi,

I have been trying to figure out why my Supermicro X7DWA-N with dual Xeon E5420s occasionally locks up when running Xen. I've tried Xen 3.2.1 and 3.3 with the Xensource 2.6.18 kernel, with the Gentoo 2.6.21 kernel, and with 2.6.25 and 2.6.27 kernels patched with the SUSE Xen patches. Every single combination suffers from occasional lockups :(

I am reasonably confident that the hardware is OK, because memtest86+ runs for days at a time without a single error. I've also installed Windows Server 2008 and openSUSE 11 (non-Xen kernel) on the bare metal and run an extremely intensive set of tests for several days without a single problem.

The lockup happens more often under load, but not exclusively. I've seen it happen minutes after bootup with no domUs running. The kernel is usually tainted by the nvidia binary drivers, but the lockup still happens in text mode with no nvidia module loaded.

I have tried loading iTCO_wdt and/or hangcheck_timer in the dom0 to get the system to reboot automatically if it locks up, but neither of them has ever rebooted the system once it has locked up. Is there a watchdog timer in Xen that I could load?

I set up a serial console and can access the dom0 sysrq and the Xen console, but neither is responsive after a lockup.

The X7DWA-N board has a connector for an NMI button. Could that be used to force a crash in order to get some debug info?

I read that FireWire supports DMA and can be used to debug a system after a crash or lockup. Is this possible with Xen?

I think I have tried everything, but any suggestions would be appreciated.

Thanks in advance.

Andy

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users