RE: [Xen-users] Shutting down domU causes "hda: interrupt lost" on dom0 and freezes the box.
I'm not too worried about screwing up the filesystem, but thanks. :-) We're moving offices and I have about three weeks to get this working, so I can nuke-n-reload as much as needed. Our new telecom provider is able to hand off our voice trunks using SIP, so I'm hoping to use this one physical server as an all-in-one edge device: Asterisk, SER (a SIP proxy), firewall, and possibly a caching Squid proxy. Using Xen will also make it easier to replicate this config at our other offices.

I started doing some more testing along the lines you suggested:

1. Issuing a "poweroff" from within an SSH session on the domU works fine.
2. Executing "xm save <domU> filename" causes the same HDD errors I previously mentioned.
3. Executing "xm destroy <domU>" causes 'irq 16: nobody cared (try booting with the "irqpoll" option)' to be written to the console, followed by a dom0 kernel panic. Yay!

I started to think "resource conflict." I edited grub.conf to pass "acpi=off" to the dom0 kernel, hoping maybe the problem was with how ACPI handled the IRQs, but no luck. I then took a look at the IRQ assignments in the BIOS and discovered that both the RAID controller and the Digium card were assigned to the same IRQ. Trying to change the IRQ on either the card or the RAID controller just caused the other to change as well; I couldn't assign them individually.

So I poked around under "Integrated Devices" in the BIOS and changed "System Interrupts Assignment" from "Standard" to "Distributed." Bam! It now worked. I wasn't getting IRQ conflicts between the storage subsystem and the telephony card.

However, I was now getting a kernel panic at boot in the domU when it was trying to load the zaptel drivers (needed for the telephony card). So I mounted the LVM volume in the Xen image using kpartx and manually blew away the zaptel drivers in /lib/modules. I unmounted the image and booted the domU. This time it booted fine, so I recompiled and reinstalled the zaptel stuff.
Rebooted the domU again, and still got a kernel panic. Mainly out of frustration, I then just decided to reboot the physical server. For some reason, that fixed that problem.

It now works. No conflicts with the storage controller, no problems shutting down domU, and the card still works fine in Asterisk. Problem solved.

The very first computer problem I ever solved was changing the IRQ selection jumpers on an internal modem because it was conflicting with the IO card in my 286. I can't believe the same problem is still stumping me almost 20 years later.

-----Original Message-----
From: M.A. Williamson [mailto:maw48@xxxxxxxxxxxxxxxx] On Behalf Of Mark Williamson
Sent: Tuesday, April 29, 2008 2:21 PM
To: xen-users@xxxxxxxxxxxxxxxxxxx
Cc: Jamie J. Begin
Subject: Re: [Xen-users] Shutting down domU causes "hda: interrupt lost" on dom0 and freezes the box.

> When I try to reboot dom0, it switches to runlevel 6 and the xen init.d
> script attempts to stop a domU containing an Asterisk installation. It's
> at that point I get an "hda: interrupt lost" on the physical console. SSH
> becomes inaccessible and eventually the system pukes up a bunch of ext3 and
> RAID controller related errors and freezes. I have to physically power
> cycle the box to get it back up.

Ugh, that's nasty :-(

> I suspect that a PCIe telephony card that I'm passing to the domU using
> pciback is the source of the problem. The card is a Digium AEX800 (which
> is actually a PCIe version of Digium's PCI-based TDM800P). Based on some
> preliminary testing, the card seems to function just fine in the domU.
> lspci output is:

I was actually just thinking "I wonder if he's using PCI passthrough" ;-)

A few thoughts spring to mind:

1) Any idea if this is happening during a normal shutdown of the domU or if that shutdown is timing out, resulting in the domain being rudely destroyed?
2) Is there any chance that the domains are being suspended rather than shut down? That might do funny things...
3) Does this happen if you manually shut down the domain? If you manually destroy the domain?

It's also possible that this is some kind of bug in the PCI passthrough. I didn't actually know that it worked for PCIe, but it's nice to know that it (sort of) does :-)

I apologise for suggesting testing on your own system; I imagine that doing this repeatedly is not doing your filesystem consistency any good :-( It's possible that you'll get more suggestions than I've been able to provide from the xen-devel list. Still, if you feel like doing some testing it may help.

It's also possible that there have been other reports like this, although I don't remember hearing of them. Have you done a quick search of the mailing list archives and the bugzilla? (Or even just a google, in case someone grumbled about it on their blog.)

The Asterisk-in-a-domU configuration seems to be rather popular, which is cool :-)

Cheers,
Mark

> 0b:08.0 Ethernet controller: Digium, Inc. Unknown device 8002 (rev 11)
>     Subsystem: Digium, Inc. Unknown device 8002
>     Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>     Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>     Interrupt: pin A routed to IRQ 16
>     Region 0: I/O ports at dc00 [disabled] [size=256]
>     Region 1: Memory at fc7dfc00 (32-bit, non-prefetchable) [disabled] [size=1K]
>     Expansion ROM at fc7e0000 [disabled] [size=128K]
>     Capabilities: [c0] Power Management version 2
>         Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
>         Status: D0 PME-Enable- DSel=0 DScale=0 PME-
>
> Any suggestions? This is a new Dell PowerEdge 1950 with a PERC SATA RAID 1
> array, running CentOS 5.1 (2.6.18-53.1.14.el5xen) in both the dom0 and
> domU.
--
Push Me Pull You - Distributed SCM tool (http://www.cl.cam.ac.uk/~maw48/pmpu/)

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
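A footnote on the diagnosis in this thread: the shared interrupt was found by looking in the BIOS, but the same information is visible in the "Interrupt: pin A routed to IRQ 16" line of the lspci output quoted above. The following is a minimal sketch, not something anyone in the thread ran, that groups devices by the IRQ reported in `lspci -vv` style text and lists IRQs claimed by more than one device. The function names and the sample text are illustrative assumptions; only the Digium entry and its IRQ line come from the thread, and the other slots are made up. Sharing an IRQ line is normal for PCI devices and is not an error by itself, so a hit here is only a hint about where to look.

```python
import re
from collections import defaultdict

# Matches the IRQ line as it appears in the lspci output quoted in the thread.
IRQ_RE = re.compile(r"routed to IRQ (\d+)")

def devices_by_irq(lspci_text):
    """Map IRQ number -> list of PCI slots (e.g. '0b:08.0') routed to it."""
    by_irq = defaultdict(list)
    current_slot = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new device; its first token is the slot.
            current_slot = line.split(None, 1)[0]
            continue
        match = IRQ_RE.search(line)
        if match and current_slot is not None:
            by_irq[int(match.group(1))].append(current_slot)
    return dict(by_irq)

def shared_irqs(lspci_text):
    """Return only the IRQs that more than one device is routed to."""
    return {irq: slots
            for irq, slots in devices_by_irq(lspci_text).items()
            if len(slots) > 1}

# Hypothetical sample: only the first device is taken from the thread.
SAMPLE = "\n".join([
    "0b:08.0 Ethernet controller: Digium, Inc. Unknown device 8002 (rev 11)",
    "    Interrupt: pin A routed to IRQ 16",
    "02:0e.0 RAID bus controller: hypothetical RAID controller",
    "    Interrupt: pin A routed to IRQ 16",
    "00:1f.2 IDE interface: hypothetical IDE controller",
    "    Interrupt: pin B routed to IRQ 23",
])

if __name__ == "__main__":
    for irq, slots in sorted(shared_irqs(SAMPLE).items()):
        print("IRQ %d is shared by: %s" % (irq, ", ".join(slots)))
```

On a live dom0 the text would come from running `lspci -vv` (as root, so that all fields are readable) and passing its output to `shared_irqs` in place of `SAMPLE`.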