
Re: [MirageOS-devel] Irmin - dedup on sync

/cc mirageos-devel as it may be of interest to others

Hi Gregory,

> I have a question about Irmin synching architecture. Suppose I have the email 
> structure where each message is broken into MIME parts so headers, 
> attachments, body, etc. stored separately. I have two Irmin servers - primary 
> and backup. Both primary and backup are in sync and have a message M with 
> attachment A. A new message M' with the same attachment A arrives to the 
> primary. At some point the primary server attempts to sync its content with 
> the backup server. Will Irmin figure out that the attachment A doesn't have 
> to be sent to the backup because it already exists? Clearly not sending A is 
> preferred and in case of mobile client translates into the energy savings.

In Git, the puller first sends the list of its references and the corresponding 
keys. The other side (the one receiving that request) tries to compute the exact 
diff to send back to the puller: it uses a heuristic on the graph of keys to do 
so, and then compresses the corresponding contents (i.e., for each missing key, 
it gets the contents, and then compresses all the new contents together). It 
then sends the pack of contents to the puller. Note that the keys here are the 
contents' digests, so identical contents are sent only once and only if 
necessary, although the heuristic can sometimes go wrong. So in your example, 
attachment A should not be sent again. In Irmin, you can use that mode of 
synchronisation using `Irmin.remote_uri`[1].
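
To make the deduplication argument concrete, here is a minimal sketch of a 
content-addressed store and a naive sync between two of them. It is not Irmin's 
or Git's actual code: the `Store` module, `missing` and `sync` are hypothetical 
names, the standard library's `Digest` (MD5) stands in for the SHA1 keys Git 
uses, and the "heuristic on the graph of keys" is replaced by a plain 
membership test on the receiver's keys.

```ocaml
(* Toy content-addressed store: the key of a value is the digest of its
   contents. All names here are hypothetical; this is not the Irmin API. *)
module Store = struct
  type t = (string, string) Hashtbl.t (* hex digest -> contents *)

  let create () : t = Hashtbl.create 16

  (* MD5 from the standard library, standing in for Git's SHA1. *)
  let key contents = Digest.to_hex (Digest.string contents)

  (* Adding the same contents twice is harmless: same digest, same key. *)
  let add (t : t) contents =
    let k = key contents in
    Hashtbl.replace t k contents;
    k

  let mem (t : t) k = Hashtbl.mem t k
  let keys (t : t) = Hashtbl.fold (fun k _ acc -> k :: acc) t []
end

(* Keys present in [src] but absent from [dst]. *)
let missing ~src ~dst =
  List.filter (fun k -> not (Store.mem dst k)) (Store.keys src)

(* Copy only the missing objects; return how many were transferred. *)
let sync ~src ~dst =
  let ks = missing ~src ~dst in
  List.iter (fun k -> Hashtbl.replace dst k (Hashtbl.find src k)) ks;
  List.length ks

let () =
  let primary = Store.create () and backup = Store.create () in
  (* Message M, with attachment A, is already on both servers. *)
  List.iter
    (fun c -> ignore (Store.add primary c); ignore (Store.add backup c))
    [ "headers of M"; "body of M"; "attachment A" ];
  (* Message M' arrives on the primary with the same attachment A. *)
  List.iter
    (fun c -> ignore (Store.add primary c))
    [ "headers of M'"; "body of M'"; "attachment A" ];
  (* Only the two new parts of M' are transferred, not attachment A. *)
  Printf.printf "objects sent: %d\n" (sync ~src:primary ~dst:backup)
```

In this sketch the attachment is skipped simply because its digest is already a 
key on the backup; the real protocol reaches the same result by walking the 
graph of keys rather than comparing flat key sets.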

Irmin also has a custom synchronisation protocol which is very similar to the 
Git one but doesn't compress contents (so it is less efficient but more 
portable across various backends). This is `Irmin.remote_store`[2].

> Is there a tutorial by any chance that describes how to set up two servers 
> that sync to each other?

I've pushed a simple example to the repo[3]. Unfortunately, while doing so I 
discovered two new bugs in my implementation of the Git protocol[4][5] ...

Hope it helps,

[1] http://samoht.github.io/irmin/Irmin.html#VALremote_uri
[2] http://samoht.github.io/irmin/Irmin.html#VALremote_store
[3] https://github.com/samoht/irmin/blob/master/examples/sync.ml
[4] https://github.com/mirage/ocaml-git/issues/38
[5] https://github.com/mirage/ocaml-git/issues/39
