
Re: [MirageOS-devel] Questions from potential new MirageOS Ocaml user



On 04/05/2018 15:05, Hannes Mehnert wrote:
...> On 14/04/2018 05:20, Luther Flippen wrote:
1) THIS IS A WELL-KNOWN/WELL-DISCUSSED CRITICISM. OCaml has a GIL (no 
shared-memory multicore) and no type classes. Apparently multicore development 
has been promised and in progress for a long time without coming to fruition, 
which has bred skepticism in at least some. Does multicore (versus 
parallel-distributed) become less relevant as the number of cores per node 
grows? My understanding is, and I have heard it argued, that multicore behavior 
approaches distributed behavior as the core count increases. Is this true? 
Modular implicits are supposed to be on the way too (and are supposed to be 
better than type classes?), but how far off are they from mature 
production-quality use?


The current state is: there is no support for multicore in OCaml.  There
is no support for multicore in the mirage-xen layer, nor in Solo5
(https://github.com/solo5/solo5 - basically a lightweight monitor /
execution environment targeting KVM) -- but there is a fork of Solo5 with
multicore support embedded in https://github.com/RWTH-OS/HermitCore

Several people in Cambridge are working hard on the OCaml side of
multicore; I expect this to be merged into mainline within the next 5 years!

Having spoken earlier in the week with two of them -- I think they were hoping it might be viable to build Mirage on multicore OCaml by the end of the calendar year, perhaps. Multicore is now tracking OCaml 4.06.1, and they're starting to think about how to make IO efficient (AIUI).
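To make the discussion concrete, here is a minimal sketch of what shared-memory parallelism looks like under the Domain API proposed by the multicore branch (`Domain.spawn`/`Domain.join`; the names follow the multicore design and are not available in mainline OCaml at the time of writing):

```ocaml
(* Sketch of the multicore OCaml Domain API: each domain can run on its
   own core; [Domain.join] waits for the spawned computation's result. *)
let parallel_sums xs ys =
  let d = Domain.spawn (fun () -> List.fold_left (+) 0 xs) in
  let sum_ys = List.fold_left (+) 0 ys in  (* runs on the current domain *)
  Domain.join d + sum_ys

let () =
  Printf.printf "%d\n" (parallel_sums [1; 2; 3] [4; 5; 6])
```

This is the shared-memory model the questions above are asking about, as opposed to the message-passing/distributed model that communicating unikernels would give you.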

2) This, I think, is a big missing piece for MirageOS itself (especially in light of 
OCaml's GIL and the "no-forking" nature of unikernels): if I use MirageOS as my 
development platform for OCaml, which is what I would prefer, what parallel 
distributed computing implementation will MirageOS use? (Note that this is different 
from Jitsu producing a swarm of application copies for IO demand-response, from what I 
understand of it.) I read that they will base it on the join calculus, but there seems 
to be some question as to what lies beyond that specifically. I have seen JoCaml, CIEL, 
and Opis mentioned as possibilities. I assume this will involve communicating unikernels, 
spread over multiple cores on a single machine, and possibly scaling up to multiple 
machines beyond that, for a given application running in parallel. As a 
scientific programmer I might often want parallel computing capability for any given 
application I deploy. Obviously I will mostly be limited to coarse-grained parallelism, 
this environment being distributed.


Each MirageOS unikernel (as described above) is limited to a single
core.  In order to process data on multiple CPUs at the same time, what
is needed is a task scheduler / load balancer / broker / coordinator
which orchestrates N instances of the same unikernel, each running on
its own CPU.
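The coordinator's core job -- splitting incoming work across N identical instances -- can be sketched in a few lines. This is purely illustrative: in a real deployment each "worker" would be a separate unikernel reached over the network, not an in-process list, and the function name here is hypothetical:

```ocaml
(* Hypothetical round-robin coordinator: distribute work items across
   n workers. Worker i's share stands in for what would be sent to the
   i-th unikernel instance over the network. *)
let round_robin n items =
  let buckets = Array.make n [] in
  List.iteri (fun i item -> buckets.(i mod n) <- item :: buckets.(i mod n)) items;
  Array.map List.rev buckets

let () =
  let b = round_robin 2 [1; 2; 3; 4; 5] in
  assert (b.(0) = [1; 3; 5] && b.(1) = [2; 4])
```

The hard parts in practice are not the dispatch policy but instance lifecycle (scale-out/scale-back) and re-routing traffic, which is what the self-scaling work mentioned below addressed.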

...

FWIW we (myself and folk at Nottingham) submitted a paper a year or two ago -- it was rejected -- in which we had some notion of "self-scaling" unikernels working: the scale-out/scale-back logic was embedded in the application unikernel, and the Open vSwitch in dom0 was used to divert traffic between unikernel instances on and off host. I could dig out a copy if it were of interest to anyone. (I meant to put it on arXiv, but it turns out you now need to submit LaTeX source to them, and it has to build in their LaTeX environment, and life felt too short at the time, basically.)

...
4) To what extent can the above two capabilities, if present -- GPGPU and 
parallel-distributed programming -- mitigate the lack of multicore capability, 
especially regarding MirageOS? In this context, what are my options for fast 
unboxed linear algebra computations, especially running on MirageOS? Would 
LAPACK routines be viable (they are usually Fortran/C/C++)? Obviously shared 
in-place memory manipulation of unboxed arrays is very efficient in this 
context, as in multicore, but can new GPGPU capability compensate? For example, 
the clBLAS and clMAGMA libraries for OpenCL, and the cuBLAS and NVBLAS 
libraries for CUDA, come to mind, or something similarly able to do linear 
algebra on the GPU. More generally, has the Xen/MirageOS community looked into 
support for the scientific computing community? (I do not mean that they would 
necessarily need/want to compete with the more niche sub-community of 
professional HPC on speed.) I am not just talking about Big Data input and 
then visualizing/exploring/manipulating the data, by the way. Some might want 
to run large simulations (physical, biological, etc.) of some sort, for example.


Mort mentioned owl.  I suspect "cross-compiling" BLAS and LAPACK (a
MirageOS unikernel (a) runs in ring 0, and (b) only has a minimal libc
available) shouldn't be too hard to get up and running, since they are
unlikely to depend on many external symbols.  We do the same for crypto
primitives, libgmp, and openlibm (the julialang port of FreeBSD's libm).
I'm not sure how much CUDA etc. owl supports; I know there were at least
some experiments in that direction.

...we do also have (lower-performance, but perhaps not as bad as you might think) native OCaml implementations of the required functions in Owl now, for compiling Owl programs to JavaScript (the `owl-base` package, on which `owl` now depends). I believe the student doing that project has successfully built unikernels using `owl-base`...
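For a sense of what such a native fallback looks like, here is a naive pure-OCaml matrix multiply -- this is illustrative only, written for this email, and not Owl's actual code (Owl uses Bigarray-backed ndarrays and far more careful kernels):

```ocaml
(* Illustrative pure-OCaml kernel: naive O(n^3) matrix multiply over
   float array arrays, the kind of BLAS-free fallback a package like
   owl-base can provide when no C library is linked. *)
let matmul a b =
  let n = Array.length a
  and m = Array.length b.(0)
  and k = Array.length b in
  Array.init n (fun i ->
    Array.init m (fun j ->
      let s = ref 0.0 in
      for l = 0 to k - 1 do
        s := !s +. a.(i).(l) *. b.(l).(j)
      done;
      !s))

let () =
  let a = [| [| 1.; 2. |]; [| 3.; 4. |] |] in
  let id = [| [| 1.; 0. |]; [| 0.; 1. |] |] in
  assert (matmul a id = a)
```

Such code carries no external symbol dependencies at all, which is exactly why it links into a unikernel without the cross-compilation effort that BLAS/LAPACK would need.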

--
Richard Mortier
richard.mortier@xxxxxxxxxxxx
_______________________________________________
MirageOS-devel mailing list
MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/mirageos-devel

 

