
Re: [Xen-users] A simple backup


On Friday, 9 May 2008 at 15:07, John Haxby wrote:
> Taking a snapshot of both memory and disk image does work to a large
> extent, but in the case of this message store simply bringing up the old
> memory and disk image suddenly leaves the guest OS wondering what to do
> with all these network connections it used to have -- it will generally
> recover and clean-up its aborted transactions, but, for example,
> messages that were being sent _out_ at the time of the snapshot will be
> resent because the outgoing connection didn't acknowledge the message.
> Depending on what the message is this will vary from the merely annoying
> all the way through to the downright weird.
> I would imagine that there are other application domains where
> restarting a transaction from the restored domU would have rather
> unpleasant side effects.

And I think this is the point: you have to consider what you want.

If you want fast recovery to a given point in time and can accept losing all
data computed since then, saved memory state plus an LVM snapshot is fine.
But normally you don't want to lose a single bit. Say the Xen host dies and
takes all domUs down: you wouldn't recover from a state that is hours old,
you would simply start the domUs again. And if you copy the snapshots
somewhere else, it takes a long time to put the disk data back in place. If
you leave the LVM snapshot in place, perhaps even keeping multiple snapshots,
disk I/O performance drops considerably.

My normal backup strategy is as follows:
* Let the domU make consistent backups of important data such as databases. A
  lot of (big) applications have commands that prepare them for backup (i.e.
  make their data consistent on disk). Otherwise, just write-lock the
  application and sync to disk. This state only has to last for a short
  period of time, until the snapshot is created.
* Make LVM snapshots of the disks from outside the domU while the domUs are
  running (to the snapshot, this looks like the computer/hard disks were
  switched off without shutting down the domUs).
* Inside the domUs, release the locks and tell the applications that they can
  continue.
* Mount these snapshots so that the fs can make itself consistent (the fs,
  not the data).
* Back up the files somewhere (I use rsync with hardlink copies onto a logical
  volume in the same volume group).
* Unmount and remove the snapshots.
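
The host-side part of the steps above could be scripted roughly as follows.
This is only a sketch, not my actual script: the function names, the
"-snap" suffix, the mount point and the dated directory layout are
assumptions made for illustration, and quiesce/resume stand for whatever
application-specific lock and unlock calls you trigger inside the domU.

```python
from datetime import date

def backup_commands(vg, lv, snap_size, mountpoint, backup_root, today, previous=None):
    """Return the dom0 commands for one snapshot backup run, in order."""
    snap = f"{lv}-snap"  # hypothetical naming scheme for the snapshot LV
    rsync = ["rsync", "-a"]
    if previous is not None:
        # Unchanged files become hardlinks into the previous backup.
        # Use an absolute backup_root so --link-dest resolves as expected.
        rsync.append(f"--link-dest={backup_root}/{previous.isoformat()}/")
    rsync += [f"{mountpoint}/", f"{backup_root}/{today.isoformat()}/"]
    return [
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap, f"/dev/{vg}/{lv}"],
        # Mounted read-write so the filesystem can replay its journal.
        ["mount", f"/dev/{vg}/{snap}", mountpoint],
        rsync,
        ["umount", mountpoint],
        ["lvremove", "-f", f"/dev/{vg}/{snap}"],
    ]

def run_backup(commands, quiesce, resume, run):
    """Hold the in-guest lock only while the snapshot is being created."""
    quiesce()                 # e.g. write-lock the application and sync
    try:
        run(commands[0])      # lvcreate --snapshot
    finally:
        resume()              # release the lock even if lvcreate failed
    for cmd in commands[1:]:  # mount, rsync, umount, lvremove
        run(cmd)              # note: no cleanup on failure in this sketch
```

To execute the commands for real, run could be something like
lambda cmd: subprocess.run(cmd, check=True); for a dry run, pass print.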

With this I have the following options:
* Recover single files from the backup without interrupting the domU.
* Recover databases from database dumps without interrupting the domU.
* If a domU dies unexpectedly, just start it; the fs should replay its journal
  and thus be consistent. If a database doesn't come up, take the dumps from
  the backup.
* If a domU gets badly damaged, e.g. by fs errors or multiple real hard disk
  failures, I only have to make a new fs for the domU, copy the files from
  the backup onto it and start the domU (this is a disaster recovery). This
  is very similar to the backup strategy discussed here, but in my experience
  it is a lot faster than handling big dd images or keeping a lot of
  snapshots active. The only thing I don't have is a running application.
  But as already mentioned here, there is little use in having a running CPU
  managing a connection that was discarded long ago.
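
The disaster recovery case from the last item could look roughly like this.
Again only a sketch under assumptions: the function name, the filesystem
type, the paths and the domU config file are placeholders, and xm is assumed
to be the toolstack in use.

```python
def restore_commands(device, fstype, mountpoint, backup_dir, domu_config):
    """Return the dom0 commands to rebuild a domU disk from a file backup."""
    return [
        ["mkfs", "-t", fstype, device],   # new, empty filesystem
        ["mount", device, mountpoint],
        # Copy the backed-up tree back. Depending on the data you may also
        # want -H (hardlinks) or --numeric-ids; that is not covered here.
        ["rsync", "-a", f"{backup_dir}/", f"{mountpoint}/"],
        ["umount", mountpoint],
        ["xm", "create", domu_config],    # start the domU again
    ]

# Dry run: print the commands instead of executing them.
for cmd in restore_commands("/dev/vg0/mail", "ext3", "/mnt/restore",
                            "/backup/2008-05-09", "/etc/xen/mail.cfg"):
    print(" ".join(cmd))
```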

I have been doing this for about two years and have made about five disaster 
recoveries (because of user errors); normally I'm asked to bring back single 
files or databases without interrupting the whole domU. Doing a full disaster 
recovery is only an option for me if nothing is left (like deleted/overwritten 

PS: And, brand new, as a bonus for my users I've managed to make
/backup/YYYY-MM-DD/fullfilesystem available over sshfs in all domUs, so that
users can easily get files out of /backup without consulting the backup
operators (well, this is Linux only for now).



