
Re: [MirageOS-devel] Irmin API newbie questions



On 7 March 2015 at 10:49, Thomas Leonard <talex5@xxxxxxxxx> wrote:
> On 5 March 2015 at 14:15, Thomas Gazagnaire <thomas@xxxxxxxxxxxxxx> wrote:
>>> Should I make a view here instead? It looks like View.of_path does a
>>> load of copying. What I want is something that I can be sure will
>>> never change.
>>
>> When you build a view, initially it just reads the head commit. Later, when
>> you read elements from it, it lazily fetches and caches the sub-nodes needed,
>> relative to this commit. If you never write to the view, all the reads
>> will be relative to the head commit at the time you created the view. Views
>> are designed for this kind of short-lived, non-persistent and isolated
>> sequence of computations.
>
> How do I find out what commit the view was based on? It looks like it
> just reads whatever is current for the store, which could have changed
> since I checked it.
>
>>> Neither views nor stores can ensure this I think (both
>>> can be written to). Possibly I should do the "I.of_head" inside
>>> R.make, but then I'd have to give it the repo (config) and task_maker
>>> objects too, which is also ugly.
>>
>> It would be very easy to add a read-only (RO) view if you think that's useful.
>>
>>> It seems strange, since the underlying Git model already provides
>>> immutable data structures (blobs, trees, commits), but Irmin seems to
>>> force me to treat everything as mutable.
>>
>> Well, if you want to persist things you need to mutate something anyway.
>> Exposing an immutable API to Git would be cheating a bit, as pure functions
>> would need to have side effects to create new objects in the backends (when
>> you add an element to a tree, you need to create the intermediate sub-nodes
>> in the storage substrates). But others have already expressed some interest
>> in having immutable views (see [1]), so I'll see what I can do.
>>
>> Thomas
>> [1] https://github.com/mirage/irmin/issues/109
>
> That would be very useful!
>
> Even pure functions are allowed to allocate data (which can later be
> GC'd), so I don't see a problem with allocating objects in the store
> in the same way.

I've added a somewhat functional Irmin abstraction to CueKeeper now to
try this out. Here's the interface I made:

https://github.com/talex5/cuekeeper/blob/master/git_storage_s.mli

This doesn't provide access to all of Irmin's features, of course, but it
shows everything I'm using and it's probably easier for beginners. The
main differences are:

- It uses Git terminology (repository, branch, staging area, etc) so
it should be easier to understand for people used to Git.

- It provides more type safety. For example, you can't try to read from
a branch (which could change at any time) - you have to dereference it
first to get a commit.

- Branch X will always be branch X. A branch can't be switched or
detached - you must create a new one instead.

- The only way to update a branch is with Branch.fast_forward_to.

- You make a commit by writing to a staging area, committing to get a
Commit.t, then updating the branch to point to the commit.

- The current branch head is provided as a signal.
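As a rough illustration of the model in the list above, here is a toy in-memory sketch. It is not the actual git_storage_s.mli interface and does not call Irmin: the module layout, field names and the `Ok / `Not_fast_forward result are assumptions made for the example, and the head signal is left out. The point is only that commits are immutable values, a branch cannot be read directly, and a branch moves only by fast-forward.

```ocaml
(* Toy in-memory model. All names are illustrative; this is not the
   real git_storage_s.mli signature and it does not use Irmin. *)
module StringMap = Map.Make (String)

module Commit = struct
  (* Immutable: once created, a commit's contents never change. *)
  type t = {
    id : int;
    parents : t list;
    files : string StringMap.t;
    msg : string;
  }

  let read t path = StringMap.find_opt path t.files
end

module Staging = struct
  (* A mutable scratch area, initialised from a commit's files. *)
  type t = { base : Commit.t option; mutable files : string StringMap.t }

  let empty () = { base = None; files = StringMap.empty }
  let of_commit (c : Commit.t) = { base = Some c; files = c.Commit.files }
  let update t path data = t.files <- StringMap.add path data t.files

  let next_id = ref 0

  (* Committing freezes the staged files into a new immutable commit. *)
  let commit t ~msg : Commit.t =
    incr next_id;
    let parents = match t.base with None -> [] | Some c -> [ c ] in
    { Commit.id = !next_id; parents; files = t.files; msg }
end

module Branch = struct
  (* Deliberately no [read] here: dereference with [head] first. *)
  type t = { name : string; mutable head : Commit.t option }

  let create name = { name; head = None }
  let head t = t.head

  let rec is_ancestor ~(ancestor : Commit.t) (c : Commit.t) =
    c.Commit.id = ancestor.Commit.id
    || List.exists (is_ancestor ~ancestor) c.Commit.parents

  (* Move the branch only if the new commit descends from the current head. *)
  let fast_forward_to t (c : Commit.t) =
    match t.head with
    | Some old when not (is_ancestor ~ancestor:old c) -> `Not_fast_forward
    | _ ->
      t.head <- Some c;
      `Ok
end

(* Usage: stage a change, commit it, then point the branch at the commit. *)
let () =
  let master = Branch.create "master" in
  let s = Staging.empty () in
  Staging.update s "todo" "buy milk";
  let c1 = Staging.commit s ~msg:"Initial commit" in
  ignore (Branch.fast_forward_to master c1);
  match Branch.head master with
  | Some c -> assert (Commit.read c "todo" = Some "buy milk")
  | None -> assert false
```

Because a Commit.t here is a plain immutable value, code holding one can keep reading from it without caring what happens to the branch afterwards.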

Hopefully some of this can go upstream. The main limitations I found
(marked with XXX in the implementation) are:

- The LCA calculation is slow
(https://github.com/mirage/irmin/issues/160) so I used Thomas's
suggestion of limiting the number of common ancestors to 1. However,
this could give the wrong answer in some cases. I think Irmin just
needs to be smarter about pruning the search when it finds a common
node.

- I couldn't find a way to make a commit without a parent without
using a named branch. The special-case initialisation code could go
away if Irmin allowed that.

- fast_forward_to checks that the update is a fast-forward to avoid
data-loss, but it's not quite atomic. Ideally, Irmin would provide
this function itself and lock the backend.
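To make the atomicity gap in the last point concrete, here is a minimal sketch using a plain in-memory head rather than an Irmin backend. The commit type and both function names are hypothetical. It swaps in a compare-and-set on OCaml's Atomic module (available from OCaml 4.12) in place of a backend lock, purely to show the shape of the problem: the check and the write must act on the same observed head.

```ocaml
(* Hypothetical in-memory stand-ins; not Irmin's API. *)
type commit = { id : int; parents : commit list }

let rec is_ancestor ~ancestor c =
  c.id = ancestor.id || List.exists (is_ancestor ~ancestor) c.parents

(* Check-then-write: another writer could move [head] between the
   ancestry check and the assignment, silently dropping its commit. *)
let racy_fast_forward (head : commit ref) c =
  if is_ancestor ~ancestor:!head c then (head := c; true) else false

(* Compare-and-set: the write succeeds only if [head] still holds the
   exact value that was checked; otherwise re-check against the new head. *)
let rec cas_fast_forward (head : commit Atomic.t) c =
  let old = Atomic.get head in
  if not (is_ancestor ~ancestor:old c) then false
  else if Atomic.compare_and_set head old c then true
  else cas_fast_forward head c
```

A real backend would need its own equivalent (a lock or a test-and-set on the branch reference); the sketch only covers a single process.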

(Also, my history API isn't very good. I want someone with a commit to
be able to look at the parents, but I didn't see a way to map a
graph's nodes from commit IDs to commits. Therefore, I currently turn
the history into a flat list of log entries.)
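
On the LCA limitation listed earlier, the following small sketch shows why a history can have more than one lowest common ancestor, so that keeping only one may not be enough. The commit graph is a made-up criss-cross merge with integer IDs, and the algorithm is a deliberately naive one (no pruning), not what Irmin does.

```ocaml
module IntSet = Set.Make (Int)

(* Hypothetical history, commit id -> parent ids (a criss-cross merge):
   1 is the root; 2 and 3 both descend from 1; 4 and 5 each merge 2 and 3. *)
let parents = function
  | 2 | 3 -> [ 1 ]
  | 4 | 5 -> [ 2; 3 ]
  | _ -> []

(* All ancestors of [c], including [c] itself. *)
let rec ancestors c =
  List.fold_left
    (fun acc p -> IntSet.union acc (ancestors p))
    (IntSet.singleton c) (parents c)

(* Lowest common ancestors: the common ancestors that are not a strict
   ancestor of some other common ancestor. *)
let lcas a b =
  let common = IntSet.inter (ancestors a) (ancestors b) in
  IntSet.filter
    (fun c ->
      not (IntSet.exists (fun d -> d <> c && IntSet.mem c (ancestors d)) common))
    common

(* For this graph, [lcas 4 5] contains both 2 and 3. *)
```

With two equally valid merge bases, which one a limit of 1 returns depends on traversal order.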


-- 
Dr Thomas Leonard        http://0install.net/
GPG: 9242 9807 C985 3C07 44A6  8B9A AE07 8280 59A5 3CC1
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA
