[Xen-devel] Domain saving and filesystem corruption
I have been using Xen for over a year now. For the most part I have had very good success with it, and we are now working on rolling it out throughout my company. But I just ran across something really annoying and dangerous.

When I first started playing with Xen I read all of the docs I could find, and at that time I am pretty sure Xen did not automatically save domains when the machine was shut down. Later on I noticed that it was trying to do so but was failing because the directory to save to did not exist on my machine for some reason (it was not created during the install). After that I completely forgot about this behavior. A month or two ago I upgraded to Xen 3.0 from mercurial (I don't have the sources around anymore and I don't see how to get Xen to tell me its exact version), and it seems that domain saving on shutdown is now working. Great.

I recently had some unrelated system problems which forced me to shut down, boot from a rescue disk, mount the logical volume normally used by my mail server, and do quite a bit of work on it. Once done I booted the system normally, Xen started the mail domain, and all kinds of weird filesystem-related things started happening. I shut down the domain, ran fsck on the mail server logical volume, and found thousands of errors.

Then I realized what had happened. Xen had saved the domain's state to disk, including internal buffers and who knows what else that had not been synced to the disk. So I mounted a very dirty filesystem and made a bunch of changes; then the mail server domain came back up expecting the fs to be in the same state it was left in and proceeded as if everything were normal, which ended up causing massive corruption and many lost emails. Fortunately this is on a dev machine which hosts a bunch of personal domains and other stuff, not business-critical things. But it is still highly annoying.

I recommend that whenever Xen saves a domain, the domain somehow sync the filesystem state to disk.
Ideally the fs would even be marked clean so that anyone who needs to mount it while the domain is not running, as I did, can do so. There really needs to be a way for a Xen domain, upon being started, to know that the fs is in a sane and consistent state, just as it was when the domain was saved. Ensuring that only filesystems marked clean are left after a save and mounted upon restart is one way to do that. Or is there some sort of time stamp, such as a last mount time in the fs, that the domain can look at and save with the domain state, so that it can make sure the last mount time has not changed when the domain is restarted?

I realize that most of these things are filesystem/OS specific. It would be really nice to have a general solution to this. I think something needs to be done because the current situation seems quite dangerous. For now I have disabled the saving/restarting of domains and will do so on all of our production systems as well. It's a risk I just can't take.

I mentioned this to someone on the IRC channel and they said "That is documented behavior." Unfortunately that doesn't bring back my data. It wasn't documented when I started using Xen, and I can't possibly keep up with everything written about Xen in the meantime.

--
Tracy R Reed
http://ultraviolet.org

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
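
The last-mount-time idea in the message could be prototyped roughly as follows. This is only a sketch under stated assumptions, not anything Xen itself does: it assumes the guest volume is ext2 or ext3, it runs from dom0 against the block device (or an image file), and the function names `read_fingerprint` and `safe_to_restore` are made up for illustration. It reads the last mount time, last write time, and mount count from the superblock so that they can be recorded when the domain is saved and compared before it is restored.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # the ext2/ext3 superblock starts 1024 bytes into the volume
FIELDS_OFFSET = 44        # s_mtime sits 44 bytes into the superblock
EXT_MAGIC = 0xEF53        # s_magic value for ext2/ext3


def read_fingerprint(path):
    """Return (last_mount_time, last_write_time, mount_count) for an ext2/ext3 volume.

    Hypothetical helper: 'path' is the block device or image backing the guest disk.
    """
    with open(path, "rb") as dev:
        dev.seek(SUPERBLOCK_OFFSET + FIELDS_OFFSET)
        raw = dev.read(14)
    if len(raw) < 14:
        raise ValueError(f"{path}: too small to hold a superblock")
    # Little-endian: s_mtime, s_wtime (u32), s_mnt_count (u16),
    # s_max_mnt_count (s16), s_magic (u16)
    mtime, wtime, mnt_count, _max_mnt, magic = struct.unpack("<IIHhH", raw)
    if magic != EXT_MAGIC:
        raise ValueError(f"{path}: no ext2/ext3 superblock found")
    return (mtime, wtime, mnt_count)


def safe_to_restore(path, saved_fingerprint):
    """True only if the volume still matches the fingerprint taken at save time."""
    return read_fingerprint(path) == tuple(saved_fingerprint)
```

A wrapper around the save step would call `read_fingerprint` once the domain has been saved and store the result next to the save file; before restoring, it would call `safe_to_restore` and refuse to resume the domain (booting it fresh instead) if the volume had been mounted read-write in the meantime. This only detects the problem described above. It does not make the filesystem clean, and other filesystems would need their own equivalent check.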