[Xen-users] Segfaults on a migrated guest
Hi *,

I run a few Xen hosts (Supermicro X8D*, i.e. Intel Tylersburg) with a few guests. From time to time, strange things happen to me... this one seems easy to describe:

I migrated a PV Linux (Slackware 14.1, kernel 4.12.2) guest from one host (Xen 4.8.1, kernel 4.11.0) to another (Xen 4.9.0, kernel 4.12.4). Shortly after migration some daemons on the guest machine crashed; this is what dmesg shows:

[18440203865.237646] Suspended for 2.605 seconds
[18440203865.267996] PM: noirq restore of devices complete after 0.212 msecs
[18440203865.268170] PM: early restore of devices complete after 0.115 msecs
[18440203865.282097] PM: restore of devices complete after 12.602 msecs
[18440203865.282167] OOM killer enabled.
[18440203865.282168] Restarting tasks ... done.
[18440203865.283838] xen:manage: Unable to read sysrq code in control/sysrq
[18440203865.379604] dbus-daemon[1233]: segfault at 0 ip (null) sp 00007ffce31aaf10 error 14 in dbus-daemon[400000+61000]
[18440203865.381385] ntpd[15191]: segfault at 8 ip 00007f81f24c3dc9 sp 00007ffdee851c90 error 4 in ld-2.17.so[7f81f24b5000+23000]
[18440204017.834883] bash[6056]: segfault at 0 ip 00007fe5bd185c2d sp 00007fff675a6b78 error 4 in libc-2.17.so[7fe5bd0f9000+1bf000]
[18440204017.865750] sshd[16597]: segfault at 7fa09372afa8 ip 00007fa093517429 sp 00007ffc4b605838 error 7 in ld-2.17.so[7fa093507000+23000]
[18440204228.000316] automount[1199]: segfault at 8 ip 00007f46a9a03153 sp 00007f46a975f990 error 4 in libc-2.17.so[7f46a9983000+1bf000]
[18440204729.291952] fail2ban-server[4209]: segfault at 0 ip 00007ff8339f7c2c sp 00007ff82f7861c0 error 4 in libpython2.7.so.1.0[7ff833955000+1bf000]

What seems suspicious to me are the timestamps: I'm quite sure that none of the machines has been up for more than 500 years.
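To put a number on the "500 years" remark, here is a small arithmetic sketch in Python. It converts the first guest's dmesg timestamp to years and, purely as an assumption, compares it with the point where an unsigned 64-bit nanosecond counter would wrap. Nothing in the logs confirms that the guest clock is such a counter or that it went negative during migration; this is one possible reading, not a diagnosis.

```python
from decimal import Decimal

# Timestamp (in seconds) copied from the first guest's dmesg output.
guest_ts = Decimal("18440203865.237646")

# Apparent uptime if the timestamp were taken at face value.
SECONDS_PER_YEAR = Decimal(365 * 24 * 3600)
years = guest_ts / SECONDS_PER_YEAR

# Assumption (not confirmed by the logs): the clock is an unsigned
# 64-bit nanosecond counter, which wraps at 2**64 ns.
wrap_seconds = Decimal(2**64) / Decimal(10**9)
gap_days = (wrap_seconds - guest_ts) / Decimal(86400)

print(f"apparent uptime: {years:.1f} years")
print(f"distance below 2**64 ns: {gap_days:.1f} days")
```

The timestamp works out to roughly 584.7 years, and it sits about 75.7 days below 2**64 ns. That would be consistent with a small negative nanosecond value being printed as unsigned, but other explanations are possible.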
The only other thing I found is that xl dmesg is full of

(XEN) tmem: operation requested on uncreated pool

A different guest (kernel 4.12.3) seems fine so far; dmesg says

[18439429870.442886] Suspended for 2.513 seconds
[18439429870.443112] PM: noirq restore of devices complete after 0.157 msecs
[18439429870.443249] PM: early restore of devices complete after 0.116 msecs
[18439429870.464423] PM: restore of devices complete after 19.453 msecs
[18439429870.464498] OOM killer enabled.
[18439429870.464498] Restarting tasks ... done.
[18439429870.466351] xen:manage: Unable to read sysrq code in control/sysrq

Respective configurations are at http://camelot.lf2.cuni.cz/vejvalka/temp/reports/20170803/ .

Where else should I look, what else should I provide so that the case is worth looking into?

Thanks,
Jan
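One more place to look is the "error" field in the first guest's segfault lines. The sketch below decodes it using the architectural x86 page-fault error-code bits. It assumes the kernel prints that field in hexadecimal (so "error 14" means 0x14), which is my understanding of the x86 segfault message format; the `describe` helper is hypothetical and only for illustration.

```python
# x86 page-fault error-code bits (low five bits only).
PF_PROT = 0x01   # set: protection violation; clear: page not present
PF_WRITE = 0x02  # set: write access; clear: read access
PF_USER = 0x04   # set: fault happened in user mode
PF_RSVD = 0x08   # set: reserved bit found in a page-table entry
PF_INSTR = 0x10  # set: fault was an instruction fetch


def describe(code_hex):
    """Hypothetical helper: turn the hex 'error' field into words."""
    code = int(code_hex, 16)  # assumption: the field is hexadecimal
    parts = [
        "protection violation" if code & PF_PROT else "page not present",
        "write" if code & PF_WRITE else "read",
        "user mode" if code & PF_USER else "kernel mode",
    ]
    if code & PF_RSVD:
        parts.append("reserved bit set")
    if code & PF_INSTR:
        parts.append("instruction fetch")
    return parts


# Error values that appear in the dmesg excerpt of the crashed guest.
for code in ("4", "7", "14"):
    print(f"error {code}: {', '.join(describe(code))}")
```

Under that assumption, "error 4" is a user-mode read of an unmapped page, "error 7" is a user-mode write that hit a protection violation, and "error 14" is a user-mode instruction fetch from an unmapped page, which matches the "ip (null)" in the dbus-daemon line.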