Re: Updates on Tarides Plans with MirageOS - Request for Feedback



Hi,

Thanks for your answers. I’m trying to give more details below.

I am a bit worried, since the mirage3 -> mirage4 change (change of compilation strategy) had quite some impact on our reproducible build system (https://builds.robur.coop, https://github.com/robur-coop/orb) - now when opam-monorepo will be deprecated in favour of "dune pkg", my worry is that there's again quite some work needed on our end. It turns out, only mirage 4.2 was in a shape where we could use it for reproducible builds.

But I'm sure you're aware of the issues that we encountered and pushed upstream to e.g. mirage, and how all the bits (mirage/opam/orb/opam-monorepo) currently fit together, and how "dune pkg" will fit in.

This is a legitimate worry. To avoid falling into the same traps as with opam-monorepo:
- the feature is designed by the Dune maintainers, with support and contributions from the opam-monorepo and opam maintainers. So the maintenance story is much clearer and upstream is willing to make the changes needed for this to work properly.
- the feature is expected to be used by all OCaml users (not just Mirage and a few others). So there is a better incentive to make it succeed than there was for opam-monorepo.
- the goal is to teach Dune how to compile any opam package, even if it doesn’t use Dune (so no need for an overlay and vendoring anymore)

I expect things to be a little broken in the first alpha, as it’s a major change, but to improve quickly as users start to pick it up, and to be maintained consistently over time as it’s a first-class feature in the OCaml Platform UX.

You can browse the “package management” tag on GH to see progress: https://github.com/ocaml/dune/issues?q=label%3A%22package+management%22. There is a lot going on, but here are a few highlights:
- how Dune is planning to build opam packages (with support for x-compilation via Dune workspace, like opam-monorepo does): https://github.com/ocaml/dune/issues/7096
- how all the opam features are integrated into Dune build plans: https://github.com/ocaml/dune/issues/8096
- Keep the tag/commit of opam-repository remote in the lock files for reproducibility: https://github.com/ocaml/dune/issues/8463
- How to make the Dune rules for building opam packages reproducible: https://github.com/ocaml/dune/issues/8240

I don’t think the integration with Orb has been discussed in detail at this stage - however `dune pkg` uses the opam library and already needs precise dependency specifications to make caching of build rules work reliably. So I suspect that won’t be too difficult; but yes, I realize this means planning work to update orb. I’ve opened https://github.com/ocaml/dune/issues/9548 to discuss the scope of the work.

Regarding mirage/functoria, my general feeling is that while the CLI tool was initially valuable for gathering an ecosystem of libraries, nowadays, it is less clear if this is still required. Right now, most of the tool's complexity is handling the installation of packages needed for a specific target/combination of devices. This will no longer be needed if the build system can do this instead. Ideally, any OCaml application (following a few design principles) could be compiled to a unikernel simply using Dune, as envisioned by the [Workflow W12](https://ocaml.org/docs/platform-roadmap#w12-compile-to-mirageos) of the OCaml Roadmap. However, there is no existing design on how this should work at this stage. So, before starting this, is that the right direction for the mirage tool?
My experience with Mirage - the tool - is that it does various things at once:
- figuring out dependencies and requirements of MirageOS devices, putting them into a boot order (what is generated as main.ml)
- selecting target-specific implementations (still, I thought someone wanted to revise this to use "dune variants", but haven't seen any demonstration thereof)
- command line arguments at configure and boot time (here, there have been various recent discussions, https://github.com/mirage/mirage/issues/1422, and a huge amount of reshuffling and implementation work by yourself -- which unfortunately doesn't seem to be ready yet (merged onto the main branch, but take a look at the regressions https://github.com/mirage/mirage/issues/1479 https://github.com/mirage/mirage/issues/1483), and looks slightly abandoned)
- cross-compiling/linking (using ocaml-solo5 with solo5 cross-compilation shell-script, and opam-monorepo to construct a monorepo)
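
To make the first item concrete, here is a minimal sketch of what "putting devices into a boot order" amounts to: a topological sort over a dependency graph. The device names and the `deps` table are made up for illustration, and this is not how functoria is actually implemented; it only shows the shape of the problem.

```ocaml
(* Hypothetical device graph: each device lists the devices that must be
   started before it. Names are illustrative, not Mirage's real ones. *)
let deps = [
  "http",    ["tcpip"; "clock"];
  "tcpip",   ["network"; "random"];
  "network", [];
  "random",  [];
  "clock",   [];
]

(* Depth-first topological sort: a device is emitted only after all of
   its dependencies. Raises Failure on a dependency cycle. *)
let boot_order deps root =
  let state = Hashtbl.create 16 and order = ref [] in
  let rec visit name =
    match Hashtbl.find_opt state name with
    | Some `Done -> ()
    | Some `In_progress -> failwith ("dependency cycle at " ^ name)
    | None ->
      Hashtbl.replace state name `In_progress;
      (* Devices missing from the table are treated as having no deps. *)
      List.iter visit (try List.assoc name deps with Not_found -> []);
      Hashtbl.replace state name `Done;
      order := name :: !order
  in
  visit root;
  List.rev !order

let () = print_endline (String.concat " -> " (boot_order deps "http"))
```

The resulting list is the order in which a generated main.ml would have to connect the devices; everything else on the list above (target selection, arguments, linking) is separate from this step.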

Out of these 4 items, I'm not sure what "dune -x mirage" will attempt to solve. My goal is to make mirage - the tool - less smart about what it attempts to achieve, but I don't think that moving these bits into dune would be beneficial.

Good list :-)
1. I agree we need some kind of metadata here. But I’m not sure having a complex eDSL is the right approach anymore. We might as well extract the metadata (what packages exist, what devices they define, what parameters they take) and find a simpler way to combine them. I’m not convinced we need to expose this to end-users in the way it is today.
2. Dune variant was indeed supposed to fix this - it has a few limitations (the main one being the removal of x-modules inlining) but my hope is that it is the way to do multi-platform development in OCaml in the longer term.
3. I am not convinced that exposing a complex CLI is the way to go. I’d be in favor of letting people write configuration files (so you can store them in your repo), either using a standard format (did I hear yaml :p) or just put this in your dune file. But this is long-term. In the short term I plan to unblock my patches by finding some time over Christmas to work on this. Longer-term, we probably need a combination of macros/MetaOCaml instead of re-implementing our own magic. Would be nice to explore the design space a bit more here.
4. This is exactly what `dune -x mirage` will replace initially (with maybe some integration for target/device parameters in 3)

### Targets
The principal target backend for MirageOS nowadays is Solo5. This is a solid backend, which has been audited and optimised for security. It is also relatively simple to add new devices given the by-design low-complexity approach of its device model. However, while solo5 is today the most secure unikernel "runtime", I also feel it has issues hindering potential changes. For one, it is slow -- the device model is not meant for high-speed I/O,
There have been contributions and attempts to implement that - see https://github.com/solo5-netmap/solo5/tree/netmap as example. I don't quite get the "has issues hindering potential changes".

I’ve listed a few of these issues later (slow I/O, lack of maintenance). I’m not saying this can’t be fixed, I’m just saying that I feel we don’t have the maintenance momentum to do this right now. But I’m happy to be proven wrong :-)

Regarding Netmap and other Solo5 extensions, I’m interested to hear what went ok and what went wrong? Why wasn’t this merged? Takayuki - do you think the Netmap approach was the right way to go there? How does Netmap perform in a virtualized environment? I think people also discussed using a DPDK-eBPF based IO at one point. How does it fit? (I haven’t followed closely what was happening in this space). A good topic to discuss during our next Mirage call :-)

and there is no support for SMP; for most use cases, it is not an issue, but for others we are looking at, it can be.
There has been in the early days some fork from solo5 called hermitcore that added SMP. I'm curious why you didn't pick that up for your mail.

I didn’t know HermitCore was based on solo5? Nowadays they compile Rust to Unikraft.

The other one is that the device model is very simple (for good reasons) and challenging to extend to new devices (see below for more detail). In an ideal world, this could be fixable, but there are also very few courageous active maintainers, so any changes - like moving to OCaml5 - are complex to make.
Hmm, one thing clearly is solo5 lacking maintainers and contributors. The other thing is that ocaml-solo5 with its minimal (no)libc surely needs adaption for the OCaml5 runtime rewrite (see the PR that has been around for ages). Now, when you switch to unikraft, I'm not entirely sure what your tradeoff is? Does unikraft support SMP? Did you evaluate in detail the trusted code base differences between solo5 and unikraft?

There are two timelines here. In the short term: I fully agree we should start by moving solo5 to single-core OCaml5 - that PR has been lingering for too long. I’ll try to see if some people familiar with the OCaml 5 runtime could review this early next year.

And on the medium/longer term, I would like to explore alternative options to solo5 (and maybe come back to solo5 if the options are not great).

So far Unikraft has demonstrated nice momentum, with lots of maintainers (and lots of quality contributions). We also have funding to explore the integration with the Unikraft team (the grant we got accepted was in collaboration with UPB, where Unikraft is developed). And yes, there is support for SMP (this is pretty recent, so unclear how stable this is) - for instance there is work happening here: https://github.com/unikraft/lib-pthread-embedded

I haven’t looked at their codebase directly yet, but I’ve heard lots of good comments regarding the general quality and robustness of their C code. However, I’ve also heard that their current focus is on portability and performance. The security roadmap is progressing (https://unikraft.org/docs/concepts/security#unikraft-security-features / https://github.com/orgs/unikraft/projects/32/views/1) but again it is unclear what the ETA and quality will be. I would expect this to be part of the evaluation of whether Unikraft is a good fit for Mirage.

### Devices and Libraries
There are three areas that we would like to focus on (or continue to focus on) in the next couple of years.
First, we still believe there are better abstractions for storing data than POSIX. Hence, we are continuing to develop and improve Irmin. We are currently porting `irmin-pack` to MirageOS (the backend of Irmin used by the Tezos blockchain to store its ledger history) via the [Notafs](https://github.com/tarides/notafs) project. Notafs is a pseudo filesystem for Mirage block devices. It can handle a small number of large files. While the limited number of filenames is unsatisfying for general usage, it can be used to run the irmin-pack backend of Irmin, which only requires a dozen huge files. By running Irmin, one gets for free a filesystem-on-steroid for MirageOS: it supports an arbitrarily large number of filenames; is optimised for small and large file contents; performs file deduplication; includes a git-like history with branching and merging, ... and it even provides a garbage collector to avoid running out of disk space (by deleting older commits). Since the Irmin filesystem is versioned by Merkle hashes, one can imagine deploying reproducible unikernels on reproducible filesystem states!
Makes me curious what you try to achieve with it. A "reproducible filesystem" means what exactly? What is the difference to a git repository of your "irmin-pack" (so why use irmin/notafs instead of a git repository)? How do you get a robust file system without "fsync"? What is the performance of notafs compared to a git repository?

Irmin and Git have the same general data model - they both use a Merkle graph of objects. There are a few differences though:

- Git supports SHA1 only (for now - although ocaml-git is functorised over the hash implementation, if you want to interface with actual Git repositories - like storing your data on GitHub - you don’t have much choice). Irmin-pack can use any hash function - Tezos uses BLAKE2b for instance.
- Git has limited support for large directories, as the space and time cost of traversing a directory is linear in the number of files/sub-directories. This is problematic when you start updating files in these large subdirectories, as every write will duplicate the node and cause excessive space usage. Irmin-pack uses something that looks like (deterministic and well distributed) inodes to represent directories, so the space and time complexity is logarithmic.
- Both solutions have limited support for large files. But if you are storing a large file in Git you are screwed, while with Irmin you can switch to an alternate way to represent large blobs (for instance using a rope-like data structure).
- The storage strategy is also a bit different:
    - Git has an interesting storage model: it has a “minor heap” (with recent objects that are stored uncompressed) and a “major heap” (with optimized pack files that store compressed objects). Running a GC will compact the minor heap into a new pack file. This is a “stop-the-world” operation and so you can’t read or write new objects concurrently. The GC can also trigger a repack, which will compact the major heap (and unpack / repack the existing pack files by removing unreachable objects). This is again “stop-the-world” and a very costly I/O operation.
    - Irmin-pack has just one heap with a concurrent GC - you can continue to read and write efficiently in the store while the GC is running in the background, and the GC is efficient enough that it is actually not noticeable. This works very well if your history is mostly linear and if you want to keep the last X commits and discard the rest. If you want a different GC strategy, this won’t work so well.
- The in-memory caching story is also different:
   - Git (and ocaml-git) has no notion of read and write cache - every operation directly goes through the store. You can decide to have an in-memory or on-disk Git store, but doing a bit of both is complicated.
   - Irmin has an in-memory cache (Irmin.Tree) that lazily reads objects and caches written objects until the next commit. This is a great way to batch write operations to avoid writing garbage on disk.
- Tangentially related: we have ongoing experiments to parallelize irmin-pack and move it to direct style using eio. They are quite promising, as that data model scales very well (I suspect that’s also the case for Git, but concurrent/parallel writes on the file system for the “minor heap” might not scale as well). We’re planning to talk more about this early next year, once https://github.com/mirage/irmin/tree/eio gets merged.
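
To put rough numbers on the large-directory point above, here is a back-of-the-envelope model of how many directory entries get rewritten when a single file changes. It is a sketch under simplifying assumptions: the branching factor `b` of 32 is an arbitrary illustrative value (not irmin-pack's actual parameter), the tree is assumed perfectly balanced, and both function names are made up for this example.

```ocaml
(* Entries rewritten when one file changes in a directory of [n] entries.
   Flat node (like a Git tree object): the whole listing is rewritten. *)
let flat_cost n = n

(* Inode-like tree with branching factor [b] (an assumed parameter): one
   node of at most [b] entries is rewritten per level, leaf to root. *)
let inode_cost ~b n =
  (* Number of levels needed so that b^levels >= n. *)
  let rec depth levels capacity =
    if capacity >= n then levels else depth (levels + 1) (capacity * b)
  in
  b * depth 1 b

let () =
  List.iter
    (fun n ->
      Printf.printf "n=%d flat=%d inode(b=32)=%d\n" n (flat_cost n)
        (inode_cost ~b:32 n))
    [ 100; 10_000; 1_000_000 ]
```

In this model a directory with a million entries rewrites a million entries per update when flat, against 128 with the inode-like layout, which is the linear versus logarithmic difference described above.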

And I’ll let the notafs authors answer about the performance when they are back from holidays :-)

Regarding fsync:
- Irmin-pack always stores consistent data on disk - so if your computer crashes in the middle of some operation, you are supposed to be able to restart, maybe with an outdated version of your file but at least a consistent one. So at least we should have consistency (if not, that’s a bug).
- Durability is harder, as it’s pretty unclear what the actual semantics of block device synchronization are - and I don’t think Mirage (and virtio and the Linux kernel(s)) implement these semantics consistently. Maybe that’s a good time to resurrect the write barrier/durable patches in https://github.com/mirage/mirage-block/compare/main...g2p:mirage-block:barrier-and-discard but we need to have an idea of how all the existing (virtual) block device implementations are supposed to behave. Happy to hear people’s opinions here.

Best,
Thomas
