Xen project Mailing List

Re: [MirageOS-devel] Irmin API newbie questions

To: "mirageos-devel@xxxxxxxxxxxxxxxxxxxx" <mirageos-devel@xxxxxxxxxxxxxxxxxxxx>

From: Thomas Leonard <talex5@xxxxxxxxx>

Date: Wed, 4 Mar 2015 15:12:29 +0000

Delivery-date: Wed, 04 Mar 2015 15:12:43 +0000

List-id: Developer list for MirageOS <mirageos-devel.lists.xenproject.org>

On 24 February 2015 at 13:37, Thomas Leonard <talex5@xxxxxxxxx> wrote: > On 24 February 2015 at 13:34, Nick Betteridge <buzzheavyyear@xxxxxxxxxxx> > wrote: >> >>> >>> Note: it's probably worth removing "Irmin inside the browser" from the >>> Pioneer Projects list now. It's working fine (with HTML5 storage). >>> Getting it to work is just a case of modifying Irmin to use Daniel >>> BÃnzli's pure-OCaml implementation of SHA-1 and implementing a few JS >>> helpers for blitting with bigarrays. >>> >> >> Is this viewable anywhere? It would be great to have this functionality >> built into Irmin + extension. > > Yes, although I'm planning on doing more testing before upstreaming. > However, for the curious the current versions can be found here: > > HTML 5 storage backend for Irmin: > > https://github.com/talex5/cuekeeper/blob/master/html_storage.ml I've now extended this to support events, so if the user has the page open in two tabs they stay in sync. However, there's still the problem that you can't remove a watch, so once you start watching you have to keep on doing so. Not a problem for my app (I just watch the master branch) but might be a problem in general. I've also now made my app do merges: every item displayed on the screen records the Irmin commit from which it came. Usually this is the latest version, but if you're e.g. editing a form then it's the version when you started the edit. When you submit a change, it commits against that revision and then merges the result back to master. This means that if you make two different edits to something in two tabs, the conflict will be detected instead of silently discarding the first change. (Ideally I want to resolve the conflicts, but for now it just discards the second change and reports the error to the user). I'm not sure I'm using the API correctly though. Here's my code for monitoring the master branch and updating the display when it changes: https://github.com/talex5/cuekeeper/blob/83a5e3c146f2406b8491b1c9915cb7fef6753869/ck_update.ml#L30 let make ~on_update branch = let watch_tags = I.watch_tags (branch "Watch master") in get_head_commit branch >>= fun initial_head -> let mutex = Lwt_mutex.create () in match I.branch (branch "Get branch name") with | `Head _ -> failwith "Not a tag!" | `Tag branch_name -> let t = { branch; head = initial_head; updated = Lwt_condition.create (); mutex; } in async (fun () -> on_update >>= fun on_update -> watch_tags |> Lwt_stream.iter_s (function | (n, Some _commit) when n = branch_name -> (* (ignore the commit ID in the update message; we want the latest) *) Lwt_mutex.with_lock t.mutex (fun () -> I.head (branch "Get latest commit") >>= function | None -> Ck_utils.error "Branch '%s' has disappeared!" branch_name | Some head -> if head <> t.head then ( t.head <- head; on_update head >>= fun () -> Lwt_condition.broadcast t.updated (); return () ) else ( return () ) ) | _ -> return () ) ); return t Some points you have to worry about here: - Irmin makes it easy to read data directly from a branch (tag), but this is usually unsafe, since a tag can update while you're reading it. Therefore, I think I always have to first get the head commit, then create a new store for that revision, then read from that. It might be safer if a branch just provided a way to get a commit, and reads had to be done from a fixed commit. - I only need to be notified once after a series of changes. I discard the commit in the watch notification and always read the latest. Otherwise, I might update the GUI to show stale data. Since a merge can add many commits at once, the stream API doesn't seem useful (it still doesn't ensure you see each commit individually, if that's what you wanted). - The functions that update the store shouldn't return until the update has been see by the model ("on_update" has been called). I therefore create a mutex and a Lwt_condition so that the commit functions can be sure their change has been seen. - I was using React to track things, but its use of weakrefs caused a lot of problems, so I've gone back to imperative code and callbacks for now. All the code for changing the model goes via "merge_to_master". This takes a "base" argument (the commit which the user made the change from), creates a view (temporary branch), runs "fn" to make the requested change (which should always succeed), merges the temporary branch back to master, and waits for the on_update to notice it: let merge_to_master t ~base ~msg fn = let path = I.Key.empty in R.make_view base >>= fun view -> fn view >>= fun result -> Lwt_mutex.with_lock t.mutex (fun () -> (* At this point, head cannot contain our commit because we haven't made it yet, * and no updates can happen while we hold the lock. *) let old_head = t.head in let updated = Lwt_condition.wait t.updated in R.V.merge_path (t.branch msg) path view >>= function | `Conflict msg -> Ck_utils.error "Conflict during merge: %s (discarding change)" msg | `Ok () -> I.head (t.branch "Get latest commit") >|= function | Some new_head when new_head <> old_head -> (* [updated] cannot have fired yet because we still hold the lock. When it does * fire next, it must contain our update. It must fire soon, as head has changed. *) updated | _ -> (* Our change had no effect, so don't wait for head to move. * Or, the branch was deleted - no point waiting then either. *) return () ) >>= fun updated -> (* Changes have been committed. *) updated >>= fun () -> (* [on_update] has been called. *) return result Does this approach seem sensible? I feel like the API perhaps wasn't intended to be used this way... I'm still worried about losing data due to races. JavaScript (I think) ensures that access to local storage from different threads doesn't get interleaved, but since Irmin uses Lwt we don't benefit from this. I suspect that if two tabs try to merge at once then one's commit will be silently discarded. Probably the back-end update function should take the expected old value as an argument so it can abort if the state has changed since then. It would also be useful to get the commit object from the merge before updating the branch. That way, I could try loading it as a sanity check (and thus reject any update that would result in an unreadable state). I guess I need to create another temporary branch, merge to that, then force-update master to it. Seems like a lot of work, though. Anyway, despite these minor issues, Irmin is working great so far! -- Dr Thomas Leonard http://0install.net/ GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1 GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA _______________________________________________ MirageOS-devel mailing list MirageOS-devel@xxxxxxxxxxxxxxxxxxxx http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.