
Re: [Xen-devel] xenstored crashes with SIGSEGV



On Thu, 2014-11-13 at 08:45 +0100, Philipp Hahn wrote:
> To me this looks like some memory corruption by some unknown code
> writing into some random memory space, which happens to be the tdb here.

It might be useful to run xenstored under valgrind. You'd want to stop
xenstored from starting during normal boot and then launch it with:
        valgrind /usr/local/sbin/xenstored -N
-N keeps it in the foreground, so you might want to do this in a screen
session or similar. Alternatively you could investigate the --log-*
options in the valgrind manpage, together with the various
--trace-children* options, in order to follow the process across its
daemonization.
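
A rough sketch of the daemonized variant, in case it helps. The log path
and the wrapper function name are my own placeholders, not anything
xenstored or valgrind ships. --log-file and --trace-children=yes are
standard valgrind options, and %p in the log file name expands to the
PID, so each process ends up with its own log. Note that valgrind
follows fork()ed children anyway; --trace-children only matters for
children started via exec.

```shell
# Hypothetical defaults; adjust both to your installation.
XENSTORED=${XENSTORED:-/usr/local/sbin/xenstored}
VG_LOG=${VG_LOG:-/var/log/xenstored-valgrind.%p.log}   # %p = PID of each process

# Print the command line by default; pass "run" to actually execute it.
xenstored_under_valgrind() {
    mode=${1:-print}
    set -- valgrind --trace-children=yes --log-file="$VG_LOG" "$XENSTORED"
    if [ "$mode" = run ]; then
        "$@"
    else
        echo "$@"
    fi
}
```

Calling xenstored_under_valgrind with no argument only prints the
command, which is handy for checking the paths before committing to it.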

I'm not sure what impact this would have on the system, but I think it
is probably ok unless you have a massive xenstore load.

You'll need a version of valgrind with Xen support in it. Anything from
the last year or so should do, I think.

Other than that, we don't really have anyone who is an expert in that
aspect of the C xenstore/tdb whom we can lean on for pointers (no pun
intended), so without some way to trigger the crash on demand I'm not
sure what else to suggest.

> 1. Has someone observed a similar crash?

I think you are the only one I've seen reporting this.

> 2. We've now also enabled "xenstored -T /log --verbose" to log the
> messages in the hope to find the triggering transaction, but until then
> is there something more we can do to track down the problem?
> 
> 3. The crash happens rarely and the hosts run fine most of the time. The
> crash mostly happens around midnight and seems to be guest-triggered, as
> the logs on the host don't show any activity like starting new or
> destroying running VMs. So far the problem has only shown up on hosts
> running Linux VMs. Other hosts running Windows VMs have never shown that
> crash so far.

If it is really mostly happening around midnight then it might be worth
digging into the host and guest configs for cronjobs and the like, e.g.
log rotation or similar jobs which might be tweaking things somehow.
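
Something like the following could give a first list of candidates on
the host or in a guest. It is a crude filter under my own assumptions:
the function name is made up, it only understands system crontab format
(with a user field), and it only matches a literal hour of 0 or 23 plus
the @daily/@midnight shortcuts, so hour lists or ranges slip through.

```shell
# Print crontab entries that fire close to midnight.
# Arguments: one or more crontab-format files to scan.
midnight_jobs() {
    awk '
        /^[[:space:]]*#/ { next }                      # skip comments
        $1 == "@daily" || $1 == "@midnight" {          # cron shortcuts for 00:00
            print FILENAME ": " $0; next
        }
        NF >= 6 && ($2 == "0" || $2 == "00" || $2 == "23") {
            print FILENAME ": " $0                     # hour field is 0 or 23
        }
    ' "$@"
}
```

For example: midnight_jobs /etc/crontab /etc/cron.d/* (the exact paths
vary by distribution, and per-user crontabs need checking separately).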

Does this happen on multiple hosts, or just the one?

Do you rm the xenstore db on boot? It might have a persistent
corruption. As I understand it, most folks using C xenstored are doing
so, or even placing it on a tmpfs for performance reasons.

If you are running 4.1.x then I think oxenstored isn't an option, but it
might be something to consider when you upgrade.

Ian.

