Re: [MirageOS-devel] Irmin - dedup on sync

Hi Thomas,

Thanks. This helps indeed!


> On Dec 19, 2014, at 1:49 PM, Thomas Gazagnaire <thomas@xxxxxxxxxxxxxx> wrote:
> /cc mirageos-devel as it can be of interest to others
> Hi Gregory,
>> I have a question about Irmin synching architecture. Suppose I have the 
>> email structure where each message is broken into MIME parts so headers, 
>> attachments, body, etc. stored separately. I have two Irmin servers - 
>> primary and backup. Both primary and backup are in sync and have a message M 
>> with attachment A. A new message M' with the same attachment A arrives to 
>> the primary. At some point the primary server attempts to sync its content 
>> with the backup server. Will Irmin figure out that the attachment A doesn't 
>> have to be sent to the backup because it already exists? Clearly, not 
>> sending A is preferred, and in the case of a mobile client it translates 
>> into energy savings.
> In Git, the puller first sends the list of its references and the 
> corresponding keys. The receiver tries to compute the exact diff to send back 
> to the puller: it uses some heuristic on the graph of keys to do so, and then 
> compresses the corresponding contents (i.e., for each missing key, it gets 
> the contents, and then compresses all the new contents together). It then 
> sends the pack of contents to the puller. Note that the keys here are the 
> contents' digests, so similar contents are sent only once and only if necessary, 
> although the heuristic can sometimes go wrong. In Irmin, you can use that 
> mode of synchronisation using `Irmin.remote_uri`[1].
> Irmin also has a custom synchronisation protocol which is very similar to the 
> Git one but doesn't compress contents (so it is less efficient but more 
> portable across various backends). This is `Irmin.remote_store`[2].
>> Is there a tutorial by any chance that describes how to set up two servers 
>> that sync to each other?
> I've pushed a simple example to the repo[3]. Unfortunately, while doing so I 
> discovered two new bugs in my implementation of the Git protocol[4][5] ...
> Hope it helps,
> Thomas
> [1] http://samoht.github.io/irmin/Irmin.html#VALremote_uri
> [2] http://samoht.github.io/irmin/Irmin.html#VALremote_store
> [3] https://github.com/samoht/irmin/blob/master/examples/sync.ml
> [4] https://github.com/mirage/ocaml-git/issues/38
> [5] https://github.com/mirage/ocaml-git/issues/39
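
For the archive, here is a minimal sketch of the key exchange described in the quoted reply, applied to the primary/backup scenario from my question. It is not Irmin's or Git's actual code: `Store`, `missing_keys` and `sync` are hypothetical names, the key is a plain SHA-1 of the raw bytes (Git additionally hashes an object header), and the receiver's full key set is compared directly instead of walking a graph of keys with a heuristic.

```python
import hashlib


def key(content: bytes) -> str:
    """Content-addressed key: the digest of the contents (simplified)."""
    return hashlib.sha1(content).hexdigest()


class Store:
    """Hypothetical in-memory store mapping key -> contents."""

    def __init__(self):
        self.objects = {}

    def add(self, content: bytes) -> str:
        k = key(content)
        # Identical contents hash to the same key, so they are stored once.
        self.objects[k] = content
        return k


def missing_keys(sender: Store, receiver_keys: set) -> set:
    """Keys the sender holds that the receiver did not advertise."""
    return {k for k in sender.objects if k not in receiver_keys}


def sync(sender: Store, receiver: Store) -> set:
    """Copy only the missing objects to the receiver; return the keys sent."""
    to_send = missing_keys(sender, set(receiver.objects))
    for k in to_send:
        receiver.objects[k] = sender.objects[k]
    return to_send


# Primary and backup both hold message M with attachment A.
primary, backup = Store(), Store()
attachment = b"attachment A"
for store in (primary, backup):
    store.add(b"headers of M")
    store.add(b"body of M")
    store.add(attachment)

# A new message M' with the same attachment arrives at the primary.
primary.add(b"headers of M'")
primary.add(b"body of M'")
attachment_key = primary.add(attachment)

sent = sync(primary, backup)
print(len(sent), attachment_key in sent)
```

Under these assumptions only the new headers and body cross the wire. The attachment's key is already known to the backup, so its contents are never sent.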

