
Re: [Xen-users] bonding combined with network-bridge fails heartbeat cluster on dom0


I have a problem that I cannot resolve and am requesting your help with getting in touch with the appropriate people to assist.

I have successfully used the configuration instructions and customised XEN bonding script sourced from http://vandelande.com/guides/howto%20setup%20XEN%20using%20network%20bonding%20on%20SLES10.html (original and modified XEN scripts attached), and it works well on single servers.

Problem Summary:

I can get any two of the three components (XEN, Heartbeat and bonding) working together, but when all three are combined, Heartbeat fails to communicate and the servers kill each other via STONITH on a "split brain" condition.
I initially encountered this problem on HP blade servers but have since recreated the same issue using VMware VMs (VM configs attached), so I am fairly confident that it is not a hardware-related issue.
This configuration (without Heartbeat clustering) works on single servers without issue.
If I do not team the NICs, then both XEN and Heartbeat appear to work as expected, so the problem lies in the combination of all three.

Detailed description:

Two servers running SLES10SP1.
Each has two network cards (a physical constraint on HP blades, hence the desire to use bonding to increase availability).
The network cards are bonded to create a virtual bond0 interface. (NIC & teaming config files attached)
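For readers without the attachments, the bonding setup on SLES10 is along these lines; addresses and bonding options here are illustrative, not copies of the attached files:

```shell
# /etc/sysconfig/network/ifcfg-bond0  (hypothetical values)
BOOTPROTO='static'
IPADDR='192.168.1.10'
NETMASK='255.255.255.0'
STARTMODE='onboot'
BONDING_MASTER='yes'
# active-backup with MII link monitoring every 100 ms
BONDING_MODULE_OPTS='mode=active-backup miimon=100'
BONDING_SLAVE0='eth0'
BONDING_SLAVE1='eth1'
```

The two ifcfg-eth-id-* files then carry STARTMODE='off' so that only the bond brings the physical NICs up.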
The two servers are configured to run Heartbeat (config files attached).
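The Heartbeat side is a standard two-node ha.cf; this sketch is not a copy of the attached file, and the node names and timings are assumptions:

```shell
# /etc/ha.d/ha.cf  (hypothetical sketch, not the attached config)
udpport 694
bcast bond0          # heartbeat runs over the bonded interface
keepalive 2
deadtime 30
initdead 120
auto_failback off
node XEN-HB2-N1
node XEN-HB2-N2
```

The failure mode described below is that these broadcasts stop arriving once the XEN kernel rewires bond0 into its bridge.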
When booted to non XEN kernel both the NIC bonding and Heartbeat work without issue.
When booted to the XEN kernel, Heartbeat fails to communicate (broadcast, multicast or unicast; the protocol makes no difference), yet the servers can communicate successfully in all other regards.
I found this message thread on the xensource.com archives: http://lists.xensource.com/archives/html/xen-users/2006-12/msg00650.html. The work-around described there does not work in my case and is otherwise unsuitable, but it does indicate that this issue has been identified previously and never resolved.

Since I now have this problem configuration running under VMware, I can provide a wealth of scripts, error logs etc. (I have attached the VMware config files for the VM servers so that someone can recreate my exact configuration).

Nasty Work Around:

I have found that taking either one of the two network cards out of the team and configuring it to connect to the same network allows Heartbeat to work. This completely defeats the purpose of teaming, but it does indicate that the issue lies in the way XEN modifies the bonding driver at startup.

I can also get the servers to work if I do not attempt any bonding but configure the NICs separately: eth0 for XEN and eth1 for Heartbeat.
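That working unbonded layout looks roughly like this (addresses hypothetical; ha.cf would then carry "bcast eth1" instead of bond0):

```shell
# /etc/sysconfig/network/ifcfg-eth0  (used by the XEN bridge)
BOOTPROTO='static'
IPADDR='192.168.1.10'
NETMASK='255.255.255.0'
STARTMODE='onboot'

# /etc/sysconfig/network/ifcfg-eth1  (dedicated to Heartbeat)
BOOTPROTO='static'
IPADDR='10.0.0.10'
NETMASK='255.255.255.0'
STARTMODE='onboot'
```

This sacrifices the NIC redundancy that bonding was meant to provide, which is why it is only a fallback.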

Feel free to pass my contact details on to anyone you think can assist.


Darren Thompson
Professional Services Engineer

Level 24, Santos House
91 King William Street
Adelaide SA 5000

Tel: +61 8 8233 5873
Fax:  +61 8 8233 5911
Mobile: +61 0400 640 414
Mail: darrent@xxxxxxxxxxxxx

Attachment: ifcfg-bond0
Description: Text document

Attachment: ifcfg-eth-id-00:0c:29:e5:82:8a
Description: Text document

Attachment: ifcfg-eth-id-00:0c:29:e5:82:80
Description: Text document

Attachment: network-bridge
Description: application/shellscript

Attachment: network-bridge-bonded
Description: application/shellscript

Attachment: network-bridge-nobond
Description: application/shellscript

Attachment: XEN-HB2-N1.vmx
Description: application/vmware-vm

Attachment: XEN-HB2-N2.vmx
Description: application/vmware-vm

Attachment: ha.cf
Description: Text document

Xen-users mailing list