[Xen-users] dom0 freezing
Hello everyone,

I have an extremely annoying freeze problem with Xen that I can't get fixed or even debugged. It's a bit of a long story.

In the middle of last year I ordered an x86_64-based colo server to run Xen and a couple of personal domUs on it. The box kept freezing all the time. I tried a lot of things to debug it, but I could not get a handle on it. This setup is described in http://thread.gmane.org/gmane.comp.emulators.xen.user/25347/focus=25500 and http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1007 .

Shortly after those mails (middle of July), after my hoster had swapped each and every part in this box, they finally replaced the previous VIA-based board with one using an AMD/ATI chipset, and suddenly the box was rock stable. During the last 10 months I did not have a single crash. It first ran a self-compiled 3.1.0, was then switched to a Debian lenny userland and hypervisor, got a self-compiled dom0 kernel based on Ubuntu Gutsy in January, and the fresh Debian 3.2.1 hypervisor at the end of May. No problems whatsoever.

A few days ago the box crashed and did not come back online, even after issuing a hardware reset command. The IP-KVM my hoster connected showed the box waiting for a keypress in the BIOS, which reported that a previous POST had been interrupted, possibly because of overclocking (definitely not in use). After a keypress the box booted fine but crashed within minutes, again dying in the BIOS. Definitely a hardware defect.

After almost all parts were replaced (CPU, RAM, power supply, fans) the box no longer crashed in the BIOS, but suddenly started to experience the dom0 hangs again. The software setup had not been changed since January (the Gutsy kernel installation) and had been rebooted a couple of times after that for maintenance, so it should definitely be fine. I thought that maybe the board was faulty and had it changed to another one, an nForce 560 based MSI-K9N NEO-F V3. Still, the same crashes.
Except for the hard disk, the hardware has been completely replaced. I tried changing the dom0 kernel to the Ubuntu Hardy 2.6.24-18-xen distribution kernel, and I tried numerous boot options for the hypervisor (noacpi, nolapic, watchdog) and the dom0 kernel (swiotlb, now trying acpi=off and noapic). The problem is always the same: after some hours the box freezes. There are no error messages in the log or on the console, nothing. I still cannot send the 3*Ctrl-a to the box using the IP-KVM, so I can't tell whether dom0 or the hypervisor crashed, but I can tell that nothing whatsoever responds anymore.

Does anyone have any idea how to debug this further? Any options I might try to at least better understand this issue?

svr01:~# dpkg -l | grep xen
ii  libxenstore3.0               3.2.1-1
ii  linux-image-2.6.24-18-xen    2.6.24-18.32   Linux
ii  xen-hypervisor-3.2-1-amd64   3.2.1-1        The Xen
ii  xen-tools                    3.9-3          Tools
ii  xen-utils-3.2-1              3.2.1-1        XEN
ii  xen-utils-common             3.2.0-2        XEN
ii  xenstore-utils               3.2.1-1

Bernhard