[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: big data structure problem in ocaml



This is the built-in Hashtbl in OCaml, which is described at:
http://caml.inria.fr/pub/docs/manual-ocaml/libref/Hashtbl.html

It's used throughout Mirage, so just grep for 'Hashtbl' in (for example)
the DNS server, and ask any questions you might have about its use here.
Do remember that you want to test out its performance characteristics on
UNIX and not Xen first, to take advantage of local profiling tools.

-anil

On 29 Jan 2013, at 23:53, Yiming Zhang <sdiris@xxxxxxxxx> wrote:

> Thanks Anil, I will try it.
> 
> About "mutable Hashtbl", is there any example ocaml code for it (or I can
> just reference ocaml-openflow suggested by Thomas)? I have done similar
> things by C, but I have no idea how to implement it by ocaml :-(
> 
> Cheers,
> Yimig
> 
> 
> -----邮件原件-----
> 发件人: Anil Madhavapeddy [mailto:anil@xxxxxxxxxx] 
> 发送时间: 2013年1月29日 17:23
> 收件人: Thomas Gazagnaire
> 抄送: Yiming Zhang; cl-mirage@xxxxxxxxxxxxxxx
> 主题: Re: big data structure problem in ocaml
> 
> On 28 Jan 2013, at 19:54, Thomas Gazagnaire <thomas.gazagnaire@xxxxxxxxx>
> wrote:
> 
>>> ********************************
>>> module OrdKey = struct
>>> type t = string
>>> let compare = Pervasives.compare
>>> end
>> 
>> If you keys are always be strings, I would suggest using the optimized
> String.compare.
>> 
>>> module KVMap = Map.Make(OrdKey)
>>> let kv = ref KVMap.empty
>>> ……
>>> (* each time when a new kv pair is set, the following is invoked *) 
>>> kv := KVMap.add !key_in_process line !kv
>>> **********************************
>> 
>> You are not copying kv here, you are just setting the box where you store
> the kv tree with an updated tree, so you can't do much.
>> 
>>> Does anyone have some hint? Or there are example code for similar 
>>> problems in the ocaml world?
>> 
>> What are you testing exactly ? Do you care about fast reads or fast writes
> ? Do you clean-up your cache regularly or is it an ever-growing one (which
> could explain the performance issues).
>> 
>> Some hints: usually a cheap way to build fixed-size cache is to use
> hash-indexed arrays: if you have few hash collision and a rough idea of the
> number of elements you want to store, that's pretty efficient. An example
> here:
>> 
>> https://github.com/samoht/ocaml-openflow/blob/master/controller/learni
>> ng_switch.ml#L41
> 
> Note also that this sort of data structure work is better done in UNIX and
> userspace, where you can use conventional profiling tools to measure the
> size complexity.
> 
> You could also consider trying a normal mutable Hashtbl instead of a
> functional Map, which may operate better depending on your specific
> workload...
> 
> -anil=
> 
> 




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.