
Re: [Xen-users] Testing high availability


  • To: "Paras pradhan" <pradhanparas@xxxxxxxxx>
  • From: "Daniel Asplund" <danielsaori@xxxxxxxxx>
  • Date: Sun, 5 Oct 2008 10:54:35 +0200
  • Cc: xen-users@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Sun, 05 Oct 2008 01:55:21 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

>
> Hey all:
>
> It seems like my question is related to HA, DRBD and Xen, hence posting to
> all of them at once.
> I have two nodes set up with xen 3.0.3, drbd82, heartbeat 2 under centos 5.2.
> As I was testing this cluster for high availability, I noticed some issues:
>
> 1) domA is running on node1. When I manually shut down node1, sometimes
> it is migrated automatically to node2 and sometimes it is restarted on node2.
> Why is this happening?
> 2) domA is running on node1. When I pull out the network cable, domA is
> restarted on node2 with no problem. But when node1 comes back, domA is
> not migrated to node1, and if I run 'xm list' on node1, I see
> "migrating-domain". This is complicating everything.
>

1) Most likely the live migration fails for some reason, so domA is
restarted on node2 instead. It could be a timeout issue or a problem
with releasing resources. The logs written on node1 during shutdown
should show what went wrong.
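
If the logs are large, a small filter can help narrow them down. The
following is only a rough sketch: the keyword list is my own guess at
what is worth looking for, not something xend or heartbeat defines, and
you have to pass in whichever log files your setup actually writes.

```python
from typing import Iterable, List, Sequence

# Hypothetical keyword list (an assumption); adjust it to what your
# logs actually contain.
KEYWORDS = ("migrat", "error", "fail", "timeout")


def interesting_lines(lines: Iterable[str],
                      keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return the lines that contain any keyword, case-insensitively."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(line.rstrip("\n"))
    return hits


def scan_files(paths: Sequence[str]) -> None:
    """Print matching lines from each log file, prefixed with its path."""
    for path in paths:
        with open(path, errors="replace") as handle:
            for hit in interesting_lines(handle):
                print(f"{path}: {hit}")
```

Call scan_files() with the xend and heartbeat log files from node1 and
compare the timestamps of the hits with the moment you shut the node
down.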

2) When node1 comes back up, heartbeat on node1 senses an error and
tries to migrate domA to node2. But node2 has already started domA, so
you basically have domA running on both nodes. To avoid split-brain
situations like this you should really use a STONITH device that can
reboot the other node. A hardware device connected via serial cable is
the most secure option; a cheaper alternative is a software STONITH
device that reboots the other node via SSH or telnet. You probably need
to tweak heartbeat as well so that it does further checks, for example
testing connectivity to your gateway.
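
To make the ordering of those checks concrete, here is a minimal sketch
of the decision a node should make before starting domA. It is not how
heartbeat implements this internally; should_take_over and its three
parameters are names I made up for illustration, and fence_peer stands
in for whatever STONITH mechanism you end up using.

```python
from typing import Callable


def should_take_over(peer_alive: bool,
                     gateway_reachable: bool,
                     fence_peer: Callable[[], bool]) -> bool:
    """Decide whether this node may start domA.

    peer_alive:        the other node still answers heartbeats
    gateway_reachable: this node can still reach the default gateway
    fence_peer:        hypothetical callback that reboots the other
                       node and returns True once that has succeeded
    """
    if peer_alive:
        # The peer is healthy, so it keeps (or gets) the domain.
        return False
    if not gateway_reachable:
        # We are probably the isolated node; starting domA here would
        # risk a second running copy.
        return False
    # The peer is silent and we still have network: fence it first and
    # only start domA if the fencing is confirmed.
    return fence_peer()
```

The point of the ordering is that domA is never started on a node that
cannot confirm the other node is really down.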

Do you have two NICs in each node, or are you running DRBD, HA and
data traffic over the same NIC?

Regards, Daniel
http://www.asplund.nu/xencluster.html



 

