[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [Xen-users] best practices in using shared storage for XEN VirtualMachines and auto-failover?

  • To: <rudi@xxxxxxxxxxx>, xen-users <xen-users@xxxxxxxxxxxxxxxxxxx>
  • From: Jeff Sturm <jeff.sturm@xxxxxxxxxx>
  • Date: Thu, 14 Oct 2010 09:42:46 -0400
  • Cc:
  • Delivery-date: Thu, 14 Oct 2010 06:44:48 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>
  • Thread-index: Actrko+UDZ3L8qcJSdy8A/xlrfJJbgAEBC1A
  • Thread-topic: [Xen-users] best practices in using shared storage for XEN VirtualMachines and auto-failover?

> -----Original Message-----
> From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-
> bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Rudi Ahlers
> Sent: Thursday, October 14, 2010 7:25 AM
> To: xen-users
> Subject: [Xen-users] best practices in using shared storage for XEN
> and auto-failover?
> Hi all,
> Can anyone please tell me what would be best practice to use shared
> storage with virtual machines, especially when it involves high
> availability / automated failover between 2 XEN servers?

With 2 servers, I hear good things about DRBD, if you don't want to go
the SAN route.  If you have a SAN, make sure it is sufficiently
redundant--e.g. two (or more) power supplies, redundant Ethernet, spare
controllers, etc.  And of course RAID 10 or a similar RAID level to
guard against single-drive failure.
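For a rough idea of what the DRBD side looks like (hostnames, devices
and addresses below are made up--adjust to your own layout), a two-node
resource definition is just a stanza in drbd.conf:

```
# /etc/drbd.conf -- one resource per replicated device (sketch only)
resource vm0 {
  protocol C;                  # synchronous replication; safest choice
                               # when you plan to fail over automatically
  on xen1 {
    device    /dev/drbd0;      # the device the domU actually uses
    disk      /dev/sdb1;       # local backing partition
    address   192.168.1.1:7788;  # dedicated replication link
    meta-disk internal;
  }
  on xen2 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.1.2:7788;
    meta-disk internal;
  }
}
```

The domU config then points at /dev/drbd0 instead of the raw partition,
so either node can run the guest from its own replica.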

> i.e. if I setup 2x identical XEN servers, each with say 16GB RAM, 4x
> 1GB NIC's, etc. Then I need the xen domU's to auto failover between
> the 2 servers if either goes down (hardware failure / overload /
> kernel updates / etc).

Pay close attention to power and networking.  With 4 NICs available per
host, I'd go for a bonded pair for general network traffic, and a
multipath pair for I/O.  Use at least two switches.  If you get it right
you should be able to lose one switch or one power circuit and maintain
connectivity to your critical hosts.
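On a RHEL/CentOS-style dom0 the bonded pair is just two ifcfg files
(interface names and addresses here are illustrative; the multipath
pair is handled separately by multipathd/dm-multipath):

```
# /etc/sysconfig/network-scripts/ifcfg-bond0 -- general traffic
DEVICE=bond0
IPADDR=192.168.1.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=active-backup miimon=100"   # survive one link/switch loss

# /etc/sysconfig/network-scripts/ifcfg-eth0 (repeat for eth1)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```

With active-backup mode you can cable eth0 and eth1 to different
switches and the bond keeps working when either switch dies.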

In my experience with high availability, the #1 mistake I see is
overthinking the esoteric failure modes and missing the simple stuff.
The #2 mistake is inadequate monitoring to detect single device
failures.  I've seen a lot of mistakes that are simple to correct:

- Plugging a bonded Ethernet pair into the same switch.
- Connecting dual power supplies to the same PDU.
- Oversubscribing a power circuit.  When a power supply fails, power
draw on the remaining supply will increase--make sure this increase
doesn't overload and trip a breaker.
- Ignoring a drive failure until the 2nd drive fails.
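All of those failures are visible from the dom0 if you look.  A few
checks worth wiring into cron or Nagios (bond0 and the /proc paths are
the Linux defaults; adapt to your setup):

```
# Are all bond slaves actually up?
grep -E "Slave Interface|MII Status" /proc/net/bonding/bond0

# Any degraded software RAID?  A "_" in the [UU] pattern means a dead disk.
cat /proc/mdstat

# DRBD connection and disk state, if you use it
cat /proc/drbd
```

The point isn't the specific commands--it's that a single failed
component should page somebody, not wait silently for its partner to
fail too.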

You can use any of a variety of clustering tools, like Heartbeat, to
automate domU failover.  Make sure you can't get into a split-brain
state, where a domU starts on two nodes at once--that would quickly
corrupt a shared filesystem.  With any shared storage configuration,
node fencing is essential.
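To make that concrete, a minimal Heartbeat setup is only a few lines
(node names, the heartbeat interface and the STONITH parameters below
are made up--check the exact parameter order for your fencing plugin
with "stonith -t <type> -h" before trusting it):

```
# /etc/ha.d/ha.cf -- two-node sketch
node xen1 xen2
bcast eth2                 # dedicated heartbeat link, not the data network
auto_failback off          # don't bounce the guest back automatically
stonith_host xen1 external/ipmi xen2 192.168.1.12 admin secret

# /etc/ha.d/haresources -- resource script that does xm create/shutdown
xen1 vm0
```

The stonith_host line is what prevents split-brain: the surviving node
power-fences its peer before taking over the guest, so the domU can
never run twice.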

> What is the best way to connect a NAS / SAN to these 2 servers for
> this kind of setup to work flawlessly? The NAS can export iSCSI, NFS,
> SMB, etc. I'm sure I could even use ATAOE if needed

For my money I'd go with iSCSI (or AoE), partition my block storage and
export whole block devices as disk images for the domU guests.  If your
SAN can't easily partition your storage, consider a clustered logical
volume manager like CLVM on RHCS.
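As a sketch of the iSCSI approach (target name, LV and IP are
illustrative), one LV per guest disk exported with scsi-target-utils:

```
# /etc/tgt/targets.conf on the storage box
<target iqn.2010-10.local.san:xen.vm0>
    backing-store /dev/vg0/vm0-disk    # one logical volume per guest disk
</target>
```

Each dom0 then logs in with open-iscsi ("iscsiadm -m discovery -t
sendtargets -p 192.168.1.20" followed by "iscsiadm -m node -l") and the
domU config hands the whole device through, e.g.:

```
disk = [ 'phy:/dev/disk/by-path/ip-192.168.1.20:3260-iscsi-iqn.2010-10.local.san:xen.vm0-lun-1,xvda,w' ]
```

Using the by-path name keeps the mapping stable no matter which order
the LUNs are discovered in after a reboot.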


Xen-users mailing list


