Re: [Xen-users] Scary!!! Lost domU!!!
James Pifer wrote:
> On Fri, 2010-01-01 at 14:07 -0500, Jamon Camisso wrote:
>> Is there more than just the sles server using both volumes? If not, have you considered using another filesystem? Personally I've had nothing but trouble with ocfs2 in Debian and Centos -- clusters would just randomly fall apart. I've also found that unless filesystem throughput is very good, ocfs2 would end up losing writes by getting ahead of itself somehow. All depends on the storage backend I suppose.
>
> I think I know what happened in this case. After a lot of thought, I believe the blunder was mine. I remember working with this specific domU in early December. I was moving it from my dev machine with local storage to the cluster. I did not realize how much space it was actually using, so after copying I decided it would be best to leave it on local storage since it was not a super critical system.
>
> Here's where I'm speculating. Somewhere along the way I think I screwed up and did bring the domU up on the ocfs2 cluster, or I had already modified the config. I then started it back up before deleting the one I had just copied. I then tried to delete the copy on ocfs2 while it was running. Not sure why I may have stopped here when it did not delete, maybe I got sidetracked, don't know. In any case I'm thinking the files were marked for deletion.
>
> Then after Christmas I had to reboot the server for a different problem. When I stopped the domU, or during reboot, the file deletion actually took place. Thankfully I still had a copy of it. Wouldn't have been the end of the world except for the work of rebuilding it.
>
> I'm not sure if that is even possible but that's what I'm thinking. Other than that my ocfs2 cluster has been solid on sles. Been using it for quite some time, well over a year I think.

That sounds plausible. I could see doing the same thing pretty easily. I use xm migrate (live) to make sure that there's only ever one copy of a domU running anywhere. That way I can definitively check from the dom0 which filesystem is being used too -- it must get messy with different storage pools, lvm volumes, raw tap:aio files etc.

The one doubt I have is the timeline involved. I suppose it is possible that the domU continued merrily along with a filesystem that was losing writes for the rest of the month (a couple of weeks?). It's too bad there isn't a copy of the filesystem around where you could see the logs to confirm it!

Good to hear you've got a backup and that you haven't had problems since the reboot :)

Jamon

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
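On the "not sure if that is even possible" question above: on an ordinary local POSIX filesystem, a file that is deleted while a process still has it open loses its name right away, but its data stays around until the last open handle is closed. The sketch below only demonstrates that general behaviour with a temporary file. The file name prefix is a made-up stand-in for a domU disk image, and nothing here touches Xen or ocfs2; whether ocfs2 behaved the same way in this incident is an assumption the thread does not confirm.

```python
import os
import tempfile


def unlink_while_open():
    """Delete a file that is still open and report what remains visible."""
    # Hypothetical stand-in for a domU disk image; not a real Xen path.
    fd, path = tempfile.mkstemp(prefix="domU-disk-", suffix=".img")
    os.write(fd, b"disk contents")

    # Remove the directory entry while the descriptor is still open,
    # much as a running domU would keep its backing file open.
    os.unlink(path)
    name_gone = not os.path.exists(path)

    # The data is still reachable through the open descriptor.
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 64)

    # On POSIX systems the link count is now zero: the inode has no name
    # left, but it is not freed while the descriptor stays open.
    links = os.fstat(fd).st_nlink

    # Closing the last descriptor is the point at which space is released.
    os.close(fd)
    return name_gone, data, links


if __name__ == "__main__":
    print(unlink_while_open())
```

If the same thing happened here, the "rm" would have appeared to do nothing obvious to the running guest, and the image would only have vanished for good once the domU was shut down and its backing file closed, which would fit the timeline described.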