Hi,
I am trying to build a rock-solid Xen high-availability cluster. The platform is SLES 11 SP1 running on two HP DL585 servers, both connected through Fibre Channel HBAs to the SAN (an HP EVA).
Xen is running smoothly, and I’m amazed by the live migration performance (this is the first time I’ve had the chance to try it in such a nice environment).
Xen aside, the SLES heartbeat cluster is running fine as well, and the two interact nicely.
Where I have doubts is the storage layout. I have tried several configurations, but each time I have to compromise. And here is the problem: I don’t like to compromise ;)
First I tried one SAN LUN per guest (using the multipath dm device directly as a phy disk). This works nicely: live migration works fine and the setup is easy, even if multipath.conf can get a bit fussy as the number of LUNs grows. But there is no fencing at all: I can start the same VM on both nodes at once, and this is BAD!
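To make the LUN-per-guest layout concrete, here is a minimal sketch of the disk entry such a guest would carry. It assumes the classic xm domain config syntax (which is itself Python) and a hypothetical multipath alias `vm01_root` defined in multipath.conf; the helper name and the `xvda` target are made up for illustration.

```python
# Hypothetical helper: build an xm-style disk entry for a guest whose
# disk is a multipath device exposed under /dev/mapper/<alias>.
def phy_disk(alias, target="xvda", mode="w"):
    """Return a 'phy:' disk entry pointing at /dev/mapper/<alias>."""
    return "phy:/dev/mapper/%s,%s,%s" % (alias, target, mode)


# 'vm01_root' is an assumed alias, not one taken from the actual setup.
disk = [phy_disk("vm01_root")]
print(disk)  # ['phy:/dev/mapper/vm01_root,xvda,w']
```

Nothing in such an entry ties the device to a single node, which matches the behaviour described above: either host can open the LUN and start the guest.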
Then I tried cLVM on top of multipath. I managed to get cLVM up and running pretty easily in the cluster environment.
From here there are two ways of thinking:
1. One big SR on the SAN, split into LVs that I can use for my VMs. A huge step forward in flexibility, with no need to reconfigure the SAN each time… Still, with this solution the SR VG is open in shared mode between the nodes and I don’t have a low-level lock on the storage. I can start a VM twice, and this is bad bad bad…
2. In order to provide fencing at the LVM level I can take another approach: one VG per volume, opened in exclusive mode. The volume will be active on one node at a time and I have no risk of data corruption. The cluster will be in charge of moving the volume when migrating a VM from one node to the other. But here live migration does not work, and this S…
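The trade-off between the two options can be sketched with a toy model. This is not the cLVM API, only an illustration under the assumption that a volume is either activated in shared mode (any node may open it) or in exclusive mode (one node at a time); the class and node names are hypothetical.

```python
class ToyVolume:
    """Toy model of a volume's activation state; NOT the cLVM API."""

    def __init__(self, name, exclusive):
        self.name = name
        self.exclusive = exclusive
        self.active_on = set()  # nodes that currently have the volume active

    def activate(self, node):
        # Exclusive mode: refuse if any other node already holds the volume.
        if self.exclusive and self.active_on - {node}:
            return False
        self.active_on.add(node)
        return True

    def deactivate(self, node):
        self.active_on.discard(node)


def can_open_on_both_nodes(vol):
    """True if the volume can be active on nodeA and nodeB at the same time."""
    ok_a = vol.activate("nodeA")
    ok_b = vol.activate("nodeB")
    vol.deactivate("nodeA")
    vol.deactivate("nodeB")
    return ok_a and ok_b


shared = ToyVolume("vm01", exclusive=False)     # option 1
exclusive = ToyVolume("vm01", exclusive=True)   # option 2
print(can_open_on_both_nodes(shared))     # True
print(can_open_on_both_nodes(exclusive))  # False
```

In this model a double start and a live migration look the same: both need the volume active on two nodes for a moment. That is one plausible reading of why the exclusive layout blocks live migration, though the real cause may differ.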
I was wondering what approach others have taken and whether there is something I’m missing.
I’ve looked into the Xen locking system, but from my point of view the risk of deadlock is not ideal either. I think a DLM-based Xen locking system would be a good solution; does anyone know whether any work has been done in this area?
Thanks in advance
Herve