Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
Hi everyone,

did the information that Ingo provided (I cited his message to the list below) maybe help in narrowing down the possible issue? If you need additional information we can try getting it for you, as Ingo might be able to reproduce the kernel panic, although not reliably.

By the way, Ingo and I compared the output of "lspci" on both our servers, and they have no hardware in common other than a Xeon CPU - and mine is an earlier generation than his. Maybe this rules out driver-related problems.

Andreas.

On 10.07.2016 at 15:18, Ingo Jürgensmann wrote:
> On 10.07.2016 at 00:29, Andreas Ziegler <ml@xxxxxxxxxxxxxxxxxx> wrote:
>
>> In May, Ingo Jürgensmann also started experiencing this problem and
>> blogged about it:
>> https://blog.windfluechter.net/content/blog/2016/03/23/1721-xen-randomly-crashing-server
>> https://blog.windfluechter.net/content/blog/2016/05/12/1723-xen-randomly-crashing-server-part-2
>
> Actually I’ve been suffering from this problem since April 2013. Here’s my story… ;)
>
> Everything was working smoothly when I was still using a root server at
> Hetzner. The setup there was somewhat non-standard, as I needed eth0 as
> the outgoing interface without it being part of the Xen bridge. So I used
> a mixture of bridged and routed networking in xend-config.sxp. This setup
> worked for years without problems.
>
> However, as Hetzner started to bill for every single IPv4 address, I moved
> to my new provider, where I could get the same address space (/26) without
> being forced to pay for every IPv4 address. The server back then was a
> Cisco C200 M2.
>
> Since I got my own VLAN at the new location, I was able to drop the mixed
> setup of routing and bridging and use only bridging, with eth0 now being
> part of the Xen bridge. The whole setup consists of two bridges: one for
> the external IP addresses (xenbr0) and one for internal traffic (xenbr1).
> This was already the case at Hetzner.
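[For readers unfamiliar with this kind of layout: a two-bridge Xen setup like the one described above is typically expressed on Debian in /etc/network/interfaces roughly as follows. This is only an illustrative sketch - addresses, netmasks, and interface names are placeholders, not Ingo's actual configuration.]

```text
# /etc/network/interfaces -- illustrative sketch only; addresses are
# placeholders (192.0.2.0/26 is a documentation prefix)

# External bridge: eth0 is enslaved, so the bridge (not eth0) carries
# the host's external address; domU vifs for public IPs attach here.
auto xenbr0
iface xenbr0 inet static
    bridge_ports eth0
    address 192.0.2.2
    netmask 255.255.255.192   # /26
    gateway 192.0.2.1

# Internal-only bridge: no physical port attached, used purely for
# domU-to-domU traffic on private addresses.
auto xenbr1
iface xenbr1 inet static
    bridge_ports none
    address 10.0.0.1
    netmask 255.255.255.0
```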
> However, shortly after I moved to the new provider, the issues started:
> random crashes of the host. With the new provider, who was and still is
> very helpful, we exchanged the memory, for example. The provider also
> reported that other Cisco C200 servers with Ubuntu LTS didn’t show this
> issue.
>
> Over time a pattern showed up that might cause the frequent crashes
> (sometimes several times in a row, let’s say 2-10 times a day!):
>
> My setup is this:
>
> Debian stable with the packaged Xen hypervisor and these VMs:
> 1) Mail, database, nameserver, OpenVPN
> 2) Webserver, Squid3
> 3) Login server
> 4) … some more servers (10 in total), e.g. a Tor relay…
>
> IPv4 /26 network, IPv6 /48 network
>
> From my workplace I need to log in to 3) and have a tunnel to the Squid
> on 2) via the internal addresses on xenbr1. Of course Squid queries the
> nameserver on 1), so there is some internal traffic going back and forth
> on the internal bridge, as well as traffic originating from the external
> bridge (xenbr0). Using Squid I access my Roundcube on my small homebrew
> server that is connected to 1) via OpenVPN. Of course the webserver on 2)
> also queries the database on 1).
>
> So, most crashes happen while I’m using the SSH tunnel from my workplace.
> If a crash happens, it’s most likely that at least two in a row will
> happen in a short time frame (within 1-2 hours), sometimes even within
> 10 minutes after the server came back. From time to time my impression
> was that the server crashes a second time the instant I try to access my
> Roundcube at home.
>
> Furthermore, I switched from the Cisco C200 server to my own server with
> a Supermicro X9SRi-F mainboard and a Xeon E5-2630L V2, but with the same
> provider - and the same issue: the new server crashes the same way the
> Cisco server did. With the new server we replaced the memory as well:
> from 32G to 128G. So over time we have switched memory twice and
> hardware once.
> Since then I no longer assume that this might be hardware related.
>
> In the meantime I switched from using Squid on 2) to tinyproxy running
> on 2), as well as running tinyproxy on another third-party VPS. Still the
> crashes happen, regardless of whether Squid on 2) is used or not.
>
> In May the server crashed again several times a week and several times a
> day. Really, really annoying! So together with my provider we set up a
> netconsole to catch more information about the crash than just the few
> lines from the IPMI console.
>
> Trying linux-image 4.4 from backports didn’t help either. I also switched
> from PV to PVHVM some months ago.
>
>> He is pretty sure that the problem went away after disabling IPv6.
>
> Indeed. Since I disabled IPv6 for all of my VMs (it’s still active on
> dom0, but not routed to the domUs anymore) not a single crash has
> happened again.
>
>> But: we can't say for sure, because on our server it sometimes happened
>> often in a short period of time, but then it didn't for months.
>> And: disabling IPv6 is no option for me at all.
>
> I won’t claim that I have an exact way of reproducing the crashes, but
> they happen fairly often when doing as described above.
>
> What I can offer is:
> - activate IPv6 again
> - install a kernel with debugging symbols (*-dbg)
> - try to provoke another crash
> - send the netconsole output if a crash happens
>
> What I cannot do:
> - interpret the debug symbols
> - access the IPMI console from my workplace (firewalled)
>
> I’m with Andreas that disabling IPv6 cannot be an option.
>
> --
> Ciao...  //  http://blog.windfluechter.net
> Ingo \X/ XMPP: ij@xxxxxxxxxxxxxxxxxxxxxxxx
>
> gpg pubkey: http://www.juergensmann.de/ij_public_key.asc

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
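[Editor's note: the two diagnosis/workaround steps mentioned above - the netconsole and disabling IPv6 inside the domUs - can be sketched roughly as follows. All addresses, the MAC, and the device name are placeholders, not the actual setup; this assumes dom0 root access.]

```text
# In dom0: load netconsole so panic output is streamed over UDP to a
# separate log host (syntax: src-port@src-ip/dev,dst-port@dst-ip/dst-mac).
modprobe netconsole \
    netconsole=6665@192.0.2.2/eth0,6666@192.0.2.50/00:11:22:33:44:55

# On the log host: capture the incoming UDP stream.
nc -u -l 6666 | tee netconsole.log

# Inside each domU: disable IPv6 (Ingo's workaround; dom0 keeps IPv6).
sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1
```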