
Re: [Wg-test-framework] colo outage



On Mon, 27 Apr 2015 05:24:51 -0400
Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:

> I noticed over the weekend that newcastle (the primary of the two vm
> servers in the test lab) was not accessible via ssh, ping, etc.
> Therefore neither was the main osstest vm, nor most of the service VMs.
> 
> Ian J is away with limited net access so I have rebooted it via the PDU
> and it has rebooted. Attempts were made to restore (rather than start) a
> bunch of vms, suggesting it may have been gracefully shut down by
> someone? syslog has:
> 
> Apr 24 19:06:26 newcastle shutdown[28059]: shutting down for system halt
> 
> Anyone want to fess up? 

Not I. The only sessions logged in at the time were iwj's and one of mine.
I don't see either of us doing it.

> Perhaps we should "apt-get install molly-guard"?

Wouldn't hurt. If we had room.

> During the restore several of the associated xl processes crashed with
> the logs below and the domains were left paused.
> 
> It is still restoring the final domain and I have to head out for
> dinner. When I get back I will destroy all of the domains and restart
> the set mentioned in /etc/xen/auto.
> 
> Also, the disk was completely full. Likely with all those saved VM
> images? /var/lib/xen/save is 17GB.

Those look like they occurred during shutdown, given the timestamps.
Shutdown started at 19:09 and the dates on the save files are 19:10 to
19:12.

> I'd not be surprised if a disk full error led to the crashes too.

Unless the saves were triggered before the shutdown, I'd say not. I'm
guessing they are the result of the shutdown.

In any case, I find those save files suspect, especially the ones with
the 19:12 timestamps, as possibly being incomplete (having run out of
space).
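
A rough way to pick out the suspect files would be to list everything in
the save directory that was last written at or after a cutoff time. This
is only a sketch: the helper name suspect_saves and the 19:12 cutoff are
my assumptions, and a late timestamp alone does not prove a file is
truncated.

```python
import os
from datetime import datetime


def suspect_saves(save_dir, cutoff):
    """Return (name, size_bytes, mtime) for regular files in save_dir
    whose last-modified time is at or after cutoff, oldest first."""
    hits = []
    for entry in os.scandir(save_dir):
        if not entry.is_file():
            continue
        st = entry.stat()
        mtime = datetime.fromtimestamp(st.st_mtime)  # local time, naive
        if mtime >= cutoff:
            hits.append((entry.name, st.st_size, mtime))
    return sorted(hits, key=lambda h: h[2])


if __name__ == "__main__":
    # Assumed values: the save directory from this thread and a cutoff
    # of 19:12 local time on the day of the shutdown.
    save_dir = "/var/lib/xen/save"
    cutoff = datetime(2015, 4, 24, 19, 12)
    if os.path.isdir(save_dir):
        for name, size, mtime in suspect_saves(save_dir, cutoff):
            print(f"{mtime:%H:%M:%S}  {size:>12}  {name}")
```

Anything it prints would be a candidate for deleting and starting the
domain fresh rather than restoring it.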

Maybe we should increase the size of newcastle's root partition; it's only
20 GB at the moment.
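
For scale, if /var/lib/xen/save lives on the root partition, 17 GB of
saved images is roughly 85% of a 20 GB filesystem. The sketch below
measures that share for any directory; the helper names dir_bytes and
share_of_filesystem are hypothetical, and it assumes the directory sits
on the filesystem being measured.

```python
import os
import shutil


def dir_bytes(path):
    """Sum the sizes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.islink(fp):
                continue
            try:
                total += os.path.getsize(fp)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def share_of_filesystem(path):
    """Return (dir_bytes, fs_total_bytes, fraction) for path, where
    fs_total_bytes is the size of the filesystem containing path."""
    used = dir_bytes(path)
    fs_total = shutil.disk_usage(path).total
    return used, fs_total, used / fs_total


if __name__ == "__main__":
    save_dir = "/var/lib/xen/save"  # directory named in this thread
    if os.path.isdir(save_dir):
        used, fs_total, frac = share_of_filesystem(save_dir)
        print(f"{used / 1e9:.1f} GB of {fs_total / 1e9:.1f} GB ({frac:.0%})")
```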

> 
> Ian.

-d

_______________________________________________
Wg-test-framework mailing list
Wg-test-framework@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/wg-test-framework


 

