
Re: [Xen-users] live migration iscsi and lvm



On Friday 29 June 2012 3:29:47 pm Javier Guerra Giraldez wrote:
> On Fri, Jun 29, 2012 at 10:32 AM, John McMonagle <johnm@xxxxxxxxxxx> wrote:
> > Should I be using clvm?
> 
> not necessarily; but it does have some advantages.
> 
> in most VM setups you don't use cluster filesystems (GFS, OCFS2, etc),
> so you should never mount the same volume on two machines.  that
> extends to virtual machines too, so you must not start the same VM
> image on two hosts.  most setups help you enforce this, and live
> migration makes sure the target VM isn't resumed until the original
> one has stopped running.
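> 
> with the classic xm toolstack that's a single command, e.g. (the
> domain and host names here are made up):
> 
>   node1# xm migrate --live myvm node2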
> 
> so far, no need for any 'cluster' setup.
> 
> LVM reads all the physical volume, volume group and logical volume
> layouts (what it calls metadata) from the shared storage to RAM at
> startup and then works from there.  It's only rewritten to disk when
> modified (creating/deleting volumes, resizing them, adding PVs, etc).
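> 
> you can even read that metadata back as plain text to see what's
> stored (vg0 is a made-up VG name; vgcfgbackup writes the text copy
> under /etc/lvm/backup by default):
> 
>   node1# vgcfgbackup vg0
>   node1# less /etc/lvm/backup/vg0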
> 
> That means that if you start more than one box connected to the same
> PVs they'll be able to reach the same VGs and LVs, and if you don't
> modify anything, it will run perfectly.  But if you want to make any
> metadata change, you have to:
> 
> 1) choose one machine to do the change
> 2) on every other machine, disconnect from the VG (vgchange -a n)
> 3) do any needed change on the only machine that's still connected to
> the VG
> 4) reread the metadata on all machines (lvscan)
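> 
> spelled out as commands, that manual dance looks something like this
> (vg0, the LV and the node names are made up; an untested sketch):
> 
>   # step 2, on every machine except the chosen one:
>   othernode# vgchange -a n vg0
> 
>   # step 3, on the one machine still connected:
>   chosennode# lvcreate -L 10G -n newvm vg0
> 
>   # step 4, back on every other machine: reread metadata, reactivate
>   othernode# lvscan
>   othernode# vgchange -a y vg0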
> 
> if you have periodic planned downtimes, you can schedule things and
> work like this; but if you can't afford the few minutes it takes, you
> need clvm
> 
> what clvm does is use the 'suspend' feature of the device mapper to
> make sure no process on any machine performs any access to the shared
> storage until the metadata changes have been propagated.  roughly:
> 
> 0) there's a clvmd daemon running on all machines that have access to
> the VG.  they use a distributed lock manager to keep in touch.
> 1) you do any LVM command that modifies metadata on any machine
> 2) the LVM command asks the clvmd process to acquire a distributed lock
> 3) to get that lock, all the clvmd daemons issue a dmsuspend.  this
> doesn't 'freeze' the machine, it only blocks any IO request on any LV
> member of the VG
> 4) when all other machines are suspended, the original clvmd has
> acquired the lock, and allows the LVM command to progress
> 5) when the LVM command is finished, it asks the clvmd to release the lock
> 6) to release the lock, the daemons on every other machine reread the
> LVM metadata (lvscan) and lift the dmsuspend status
> 7) when all the machines are unsuspended, the LVM command returns to
> the CLI prompt, and everything is running again.
> 
> as you can see, it's the same as the manual process, but since it all
> happens in a few milliseconds, the 'other' machines can just be
> suspended instead of having to be really brought down.
> 
> I guess it could also be done with a global script that spreads the
> 'dmsuspend / wait / lvscan / dmresume' commands; but by the time you
> get it to run reliably, you've replicated the shared lock
> functionality.
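> 
> for the curious, such a script might be fanned out over ssh roughly
> like this (vg0, the node names and freeze.sh are all made up, and
> this is exactly the kind of thing that's hard to make reliable):
> 
>   #!/bin/sh
>   # freeze.sh, installed on every node: suspend or resume all of
>   # vg0's LVs through the device mapper
>   case "$1" in
>     suspend)
>       for dev in /dev/mapper/vg0-*; do dmsetup suspend "$dev"; done ;;
>     resume)
>       lvscan   # reread the LVM metadata from the shared PVs
>       for dev in /dev/mapper/vg0-*; do dmsetup resume "$dev"; done ;;
>   esac
> 
> and the machine making the change would run something like:
> 
>   for node in node1 node2; do ssh $node freeze.sh suspend; done
>   lvresize -L +5G vg0/somevm    # the actual metadata change
>   for node in node1 node2; do ssh $node freeze.sh resume; done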

Thanks for the information.

I have been trying to get clvm working, but it depends on cman and I have 
not had much luck figuring out cluster.conf.
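
From what I have pieced together, a bare-bones two-node cluster.conf 
should look roughly like this (node names are placeholders, and fencing 
is left unconfigured, which I gather is unsafe for real use):

  <?xml version="1.0"?>
  <cluster name="xencluster" config_version="1">
    <cman two_node="1" expected_votes="1"/>
    <clusternodes>
      <clusternode name="node1" nodeid="1"/>
      <clusternode name="node2" nodeid="2"/>
    </clusternodes>
  </cluster>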

I see the new version in testing has no dependency on cman, so I think I'll 
try that one.
Looks like I will have to upgrade lvm2 as well.

I have seen references saying that you cannot do snapshots,
and others saying that the snapshot and the snapshotted volume have to be on 
one node.  That I can live with.  Is that the case?


John

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users


 

