
Re: [Xen-devel] [PATCH v4 2/5] remus: resume immediately if libxl__xc_domain_save_done() completes

On Tue, 2016-01-19 at 09:01 +0800, Wen Congyang wrote:
> On 01/19/2016 12:51 AM, Ian Campbell wrote:
> > On Mon, 2016-01-18 at 13:40 +0800, Wen Congyang wrote:
> > > For example: if the secondary host is down, and we fail to send the data
> > > to the secondary host. xc_domain_save() returns 0. So in the function
> > > libxl__xc_domain_save_done(), rc is 0 (the helper program exits normally),
> > > and retval is 0 (it is xc_domain_save()'s return value). In such case, we
> > > just need to complete the stream.
> > 
> > What if the secondary host isn't actually down but just communication has
> > failed for some reason? Won't both primary and secondary start their
> > respective versions of the domain? What are the consequences of that?
> > (Corruption?)
> > 
> > I suppose this is a consequence of the lack of STONITH or splitbrain
> > handling within Remus. Are there any plans to address this?
> IIRC, Shriram Rajagopalan has some ideas about it (check the external
> heartbeat?).
> There is no way to avoid splitbrain unless we have more than two hosts (at
> least three hosts). If we want to avoid splitbrain, we may need to destroy
> both primary and secondary guests.

I think there are plenty of existing systems for taking care of this side of
fault-tolerance/HA (e.g. linux-ha, Pacemaker, Corosync, etc.), so we don't
need (or want) to reinvent that particular wheel here.
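
To illustrate why two hosts alone cannot settle this, here is a minimal
sketch of the strict-majority rule that quorum-based cluster managers
typically apply. has_quorum() is a hypothetical helper written for this
mail; it is not part of libxl or of any of the tools named above.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not a libxl API: a host may keep (or start)
 * running the guest only if it can reach a strict majority of the
 * configured hosts, counting itself.
 */
static bool has_quorum(unsigned int reachable_hosts, unsigned int total_hosts)
{
    return 2 * reachable_hosts > total_hosts;
}
```

With two hosts and a broken link each side reaches only itself, so
has_quorum(1, 2) is false on both and neither may run the guest. With three
hosts, the side that still reaches two of them has quorum and the isolated
host does not.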

I think we just need a story on how one would integrate with such a system
in order to say that Remus is properly usable in real-world scenarios (i.e.
before we can remove the "proof-of-concept" wording from the man page).

That might just be a documentation exercise, or it might require some hooks
etc. to be added to (lib)xl in order to allow such integrations; I'm not
sure what's needed.

IIRC Ian expressed a similar sentiment when Remus support was first added
to libxl.

