Re: [Xen-devel] [PATCH RFC] remus: implement remus replicated checkpointing disk

On Tue, 2014-03-11 at 11:10 -0700, Shriram Rajagopalan wrote:
> On Tue, Feb 25, 2014 at 6:53 PM, Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> wrote:
>         This patch implements remus replicated checkpointing disk.
>         It includes two parts:
>           generic remus replicated checkpointing disks framework
>           drbd replicated checkpointing disks
>         They will be split into different files in next round.
>         The patch is still simple because the
>         disk-setup-teardown-script is not implemented yet. I need to
>         use libxl_ao to implement it, but libxl_ao is hard to use:
>         the work sequence has to be split, awkwardly, into several
>         callbacks, as in device_hotplug().
>         And because the remus disk script is unimplemented, the
>         drbd_setup() code can't check the disk yet, so it just
>         assumes the user has configured the disk correctly.
>         This patch is *UNTESTED*.
>         (There is a problem with xl and drbd (without remus) on my
>         boxes.)
>         I would like as many *comments* as possible.
>         Thanks,
>         Lai
>         Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> Hi
>  sorry for the delayed response. And thanks a lot for this initiative.
> Apart from the inline feedback,
> there are a few things to consider first before going down this
> route. 
> 1. The drbd kernel module required for Remus is still out of tree,
> currently hosted on a wiki page.
> The drbd folks didn't want to include the changes into their code
> unfortunately, as they were offering the
> same functionality to one of their paid customers. This is what they
> told me back in 2011 or so.

That's rather sad.

Is there a project more friendly to community contributions which
provides similar functionality? A community drbd fork perhaps?

I don't know how invasive the changes are, but one approach might be to
ask various distro package maintainers if they would be willing to carry
a patch which you maintain out of the main drbd tree. You'd only need a
few of the big ones to say yes for this to be worthwhile.

> To streamline the storage replication module installation, is there a
> chance of hosting the code in 
> xen.org's repos? That way, we could script the download and
> installation process. Like the qemu
> stuff.

I'm very reluctant to add more downloading to the Xen build system, but
that doesn't rule out hosting something on xenbits. There are also
things like gitorious and other hosting services.

> 2. The tapdisk based replication unfortunately is outdated. Please
> correct me if I have got this wrong.
> Haven't we decided to get rid of blktap2 and go with the qemu disk
> models?

"Decided" in as much as no one is interested in maintaining blktap2.
qemu is where people are willing to invest the effort so that is where
things are heading.

>  In which case, the tapdisk
> remus code has to be ported into some qemu disk variant.

Right. I think qemu has some amount of snapshot stuff, but how close it
is to what remus wants I don't know.

> Without getting a resolution to the above two, my stance is that we
> shouldn't pollute xl with functionality
> that requires out-of-band modules that may prove pretty painful to
> install for the majority of folks out there.

This sounds reasonable.

> Based on the experience from the last 3 years, most average users of
> Remus tend to skip disk replication 
> altogether.  They install the distro's default drbd, use the disk
> replication provided with it and then complain
> that Remus crashes or fails.

Remus should probably complain more stridently about the lack of disk
replication and require a --i-know-my-data-is-at-risk type flag.

