Re: [Xen-users] Disk i/o on Dom0 suddenly too slow

On 09/07/13 13:15, Micky wrote:
To add to my last email.
This happens to be related to LVM snapshot. Every time a snapshot is
created for the LVM where a particular domu is on, load on that domu
spikes up to 30 and things become sluggish.

Ionice and lvm parameter formalities didn't help!

To the real guys out there:
How do you use LVM snapshots with Xen dom0, if any? To me, it seems
like LVM snapshotting isn't a short-term backup strategy at all!

I've had similar issues. In fact, performance seems to degrade severely for the whole life of an LVM snapshot. A single snapshot is usually OK, but I wanted to keep three snapshots, deleting the oldest and creating a new one each day.
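As a rough illustration of why three snapshots hurt more than one: with classic (non-thin) LVM snapshots, the first write to a chunk of the origin has to preserve the old data in the copy-on-write area of every active snapshot. The sketch below is a deliberately simplified model of my own (the function name is made up, and it counts only chunk writes, ignoring the reads and metadata updates that come on top), not a measurement:

```python
def chunk_writes_for_first_write(active_snapshots: int) -> int:
    """Chunk writes caused by the first write to an origin chunk.

    Simplified model (an assumption, not a benchmark): the origin write
    itself, plus one copy of the old chunk into the COW area of each
    active classic LVM snapshot. Reads and metadata I/O are ignored.
    """
    if active_snapshots < 0:
        raise ValueError("active_snapshots must be >= 0")
    return 1 + active_snapshots


if __name__ == "__main__":
    for n in range(4):
        print(f"{n} snapshot(s): {chunk_writes_for_first_write(n)} chunk write(s)")
```

So under this model, keeping three snapshots means each first-time write turns into at least four chunk writes, which fits the pattern of one snapshot being tolerable and three being painful.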

I've found two "solutions":
1) Make your storage backend perform like a god, so that even after you take the snapshots performance is still a stroll down the road (ie, I've upgraded to SSD based storage which can get approx 1.5TB/s write and 2.5TB/s read).
2) Only keep a single snapshot and, if possible, remove it as soon as your backup is completed, and/or keep writes to a minimum while the snapshot is active.
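For the second option, something along these lines keeps the snapshot alive only for as long as the backup runs. It is a minimal sketch, not what I actually run: the volume group, LV name, snapshot name and COW size are placeholders, and the helper names are invented. The default `run` would really execute the commands (as root), so the example at the bottom passes `print` for a dry run.

```python
import subprocess
from contextlib import contextmanager


def snapshot_commands(vg, lv, snap_name, cow_size):
    """Build the lvcreate/lvremove command lines for one snapshot."""
    create = ["lvcreate", "--snapshot", "--size", cow_size,
              "--name", snap_name, f"/dev/{vg}/{lv}"]
    remove = ["lvremove", "--force", f"/dev/{vg}/{snap_name}"]
    return create, remove


@contextmanager
def short_lived_snapshot(vg, lv, snap_name="backup_snap", cow_size="5G",
                         run=subprocess.check_call):
    """Create a snapshot, yield its device path, always remove it afterwards."""
    create, remove = snapshot_commands(vg, lv, snap_name, cow_size)
    run(create)
    try:
        yield f"/dev/{vg}/{snap_name}"
    finally:
        # Remove the snapshot even if the backup fails, so it never lingers.
        run(remove)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # "vg0" and "domu1-disk" are hypothetical names.
    with short_lived_snapshot("vg0", "domu1-disk", run=print) as snap:
        print("run the backup against", snap)
```

The point of the `try`/`finally` is simply that a failed backup should not leave the snapshot behind to drag performance down until someone notices.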

My plan is to do something like this:
1) Have two storage backend machines
2) Use DRBD to sync the two of them (primary sits on RAID device, secondary sits on LVM on RAID device)
3) Use LVM on top of the DRBD to create LV's for each domU
5) Take a snapshot using the underlying LVM (below DRBD) on the secondary
6) Run your backup processes on the snapshot of the DRBD
7) Delete the snapshot

The problem I have is that steps 6 and 7 will probably involve disconnecting the backup server from the primary (breaking the DRBD link), promoting it to primary, and making various changes to it (ie, intentionally creating a split-brain scenario). After finishing the backup process, you may need to invalidate the entire DRBD device and re-sync, which could be too time consuming (and could itself cause a performance issue).

I haven't got that far in the process yet, so if you do something similar, it would be helpful to hear about it.

It would also be nice to hear from anyone else who can share what they do, and what works well or doesn't.

Finally, the other problem I have with LVM on Debian (stable) is that every week or two lvremove will freeze, and lvs and other LV-related commands will freeze too. The only solution seems to be a reboot. (Using kernel 3.2.0-4-686-pae #1 SMP Debian 3.2.41-2 i686). I haven't tracked this down or reported it yet, but it is frustrating to have to reboot the dom0 so often.





