Xen project Mailing List

Re: [MirageOS-devel] Irmin API newbie questions

To: Thomas Gazagnaire <thomas@xxxxxxxxxxxxxx>

From: Thomas Leonard <talex5@xxxxxxxxx>

Date: Mon, 9 Mar 2015 10:36:57 +0000

Cc: "mirageos-devel@xxxxxxxxxxxxxxxxxxxx" <mirageos-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Mon, 09 Mar 2015 10:37:07 +0000

List-id: Developer list for MirageOS <mirageos-devel.lists.xenproject.org>

On 7 March 2015 at 10:49, Thomas Leonard <talex5@xxxxxxxxx> wrote:
> On 5 March 2015 at 14:15, Thomas Gazagnaire <thomas@xxxxxxxxxxxxxx> wrote:
>>> Should I make a view here instead? It looks like View.of_path does a
>>> load of copying. What I want is something that I can be sure will
>>> never change.
>>
>> When you build a view, initially it just reads the head commit. Later, when
>> you read elements from it, it lazily fetch and cache the sub-node needed,
>> relatively to this commit. If you never write on the view, all the reads
>> will be relative to the head commit at the time you created the view. Views
>> are designed for this kind of short-lived, non-persistent and isolated
>> sequence of computations.
>
> How do I find out what commit the view was based on? It looks like it
> just reads whatever is current for the store, which could have changed
> since I checked it.
>
>>> Neither views nor stores can ensure this I think (both
>>> can be written to). Possibly I should do the "I.of_head" inside
>>> R.make, but then I'd have to give it the repo (config) and task_maker
>>> objects too, which is also ugly.
>>
>> would be very easy to add a RO view if you think that's useful.
>>
>>> It seems strange, since the underlying Git model already provides
>>> immutable data structures (blobs, trees, commits), but Irmin seems to
>>> force me to treat everything as mutable.
>>
>> well, if you want to persist things you need to mutate something anyway.
>> Exposing an immutable API to Git would be a bit cheating as pure function
>> would need to have side-effects to create new objects in the backends (when
>> you add an element to a tree, you need to create the intermediate sub-node
>> in the storage substrates). But others have already expressed some interest
>> on having immutable views (see [1]), so I'll see what I can do.
>>
>> Thomas
>> [1] https://github.com/mirage/irmin/issues/109
>
> That would be very useful!
>
> Even pure functions are allowed to allocate data (which can later be
> GC'd), so I don't see a problem with allocating objects in the store
> in the same way.

I've added a somewhat functional Irmin abstraction to CueKeeper now to
try this out. Here's the interface I made:

https://github.com/talex5/cuekeeper/blob/master/git_storage_s.mli

This doesn't provide access to all of Irmin's features, of course, but
it shows everything I'm using and it's probably easier for beginners.
The main differences are:

- It uses Git terminology (repository, branch, staging area, etc.) so
  it should be easier to understand for people used to Git.

- It provides more type safety. For example, you can't try to read from
  a branch (which could change at any time) - you have to dereference
  it first to get a commit.

- Branch X will always be branch X. A branch can't be switched or
  detached - you must create a new one instead.

- The only way to update a branch is with Branch.fast_forward_to.

- You make a commit by writing to a staging area, committing to get a
  Commit.t, then updating the branch to point to the commit.

- The current branch head is provided as a signal.

Hopefully some of this can go upstream. The main limitations I found
(marked with XXX in the implementation) are:

- The LCA calculation is slow
  (https://github.com/mirage/irmin/issues/160), so I used Thomas's
  suggestion of limiting the number of common ancestors to 1. However,
  this could give the wrong answer in some cases. I think Irmin just
  needs to be smarter about pruning the search when it finds a common
  node.

- I couldn't find a way to make a commit without a parent without
  using a named branch. The special-case initialisation code could go
  away if Irmin allowed that.

- fast_forward_to checks that the update is a fast-forward to avoid
  data loss, but it's not quite atomic. Ideally, Irmin would provide
  this function itself and lock the backend.

(Also, my history API isn't very good. I want someone with a commit to
be able to look at the parents, but I didn't see a way to map a
graph's nodes from commit IDs to commits. Therefore, I currently turn
the history into a flat list of log entries.)

-- 
Dr Thomas Leonard        http://0install.net/
GPG: 9242 9807 C985 3C07 44A6  8B9A AE07 8280 59A5 3CC1
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA

_______________________________________________
MirageOS-devel mailing list
MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel
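
The commit / staging area / branch workflow described in the message
can be illustrated with a toy in-memory model. This is only a sketch
under stated assumptions: it does not use Irmin, it is not the code
behind git_storage_s.mli, and every name other than Commit.t and
Branch.fast_forward_to (for example Staging.update, Commit.has_ancestor
and the `Ok / `Not_fast_forward results) is hypothetical. It also lets
a staging area start from nothing, which the message says was not
possible with Irmin at the time, and it makes no attempt at atomicity.

```ocaml
(* Toy in-memory model; NOT Irmin and NOT CueKeeper's real interface. *)
module StringMap = Map.Make (String)

(* A commit is an immutable snapshot: reads from it never change. *)
module Commit = struct
  type t = { id : int; parents : t list; files : string StringMap.t }

  let next_id = ref 0

  let make ~parents files =
    incr next_id;
    { id = !next_id; parents; files }

  let read t key = StringMap.find_opt key t.files
  let parents t = t.parents

  (* True if [ancestor] is [commit] itself or reachable via parents. *)
  let rec has_ancestor commit ancestor =
    commit.id = ancestor.id
    || List.exists (fun p -> has_ancestor p ancestor) commit.parents
end

(* A staging area is the only mutable place where content is edited. *)
module Staging = struct
  type t = { base : Commit.t option; mutable files : string StringMap.t }

  (* Hypothetical: start with no parent commit at all. *)
  let empty () = { base = None; files = StringMap.empty }
  let of_commit c = { base = Some c; files = c.Commit.files }
  let update t key value = t.files <- StringMap.add key value t.files

  (* Freeze the staged content into a new immutable commit. *)
  let commit t =
    let parents = match t.base with None -> [] | Some c -> [ c ] in
    Commit.make ~parents t.files
end

(* A branch cannot be read directly: dereference it with [head] first. *)
module Branch = struct
  type t = { name : string; mutable head : Commit.t option }

  let create name = { name; head = None }
  let name t = t.name
  let head t = t.head

  (* The only way to move a branch; refuses updates that would drop
     history. Not atomic: the check and the update are separate steps. *)
  let fast_forward_to t commit =
    match t.head with
    | Some old when not (Commit.has_ancestor commit old) -> `Not_fast_forward
    | _ ->
      t.head <- Some commit;
      `Ok
end

let () =
  let master = Branch.create "master" in
  let s = Staging.empty () in
  Staging.update s "todo" "write docs";
  let c1 = Staging.commit s in
  assert (Branch.fast_forward_to master c1 = `Ok);
  let s1 = Staging.of_commit c1 in
  Staging.update s1 "todo" "ship it";
  let c2 = Staging.commit s1 in
  assert (Branch.fast_forward_to master c2 = `Ok);
  (* The older commit still reads the same after the branch moved. *)
  assert (Commit.read c1 "todo" = Some "write docs");
  (* Moving back to c1 would lose c2, so it is rejected. *)
  assert (Branch.fast_forward_to master c1 = `Not_fast_forward)
```

Because reads go through Commit.read rather than through the branch,
code holding a commit cannot observe later updates, which is the
type-safety point made in the list of differences.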
