Re: Updates on Tarides Plans with MirageOS - Request for Feedback
Hi,

thanks for your elaborate mail. I have some comments inline below.

On 15/12/2023 19:18, Thomas Gazagnaire wrote:
> ### Tooling
>
> Over the past five years, our efforts have focused on integrating Mirage-specific tooling into the OCaml Platform. We plan to continue in this direction. This integration is intended to benefit both Mirage developers (by reducing the maintenance burden on the core MirageOS team) and the broader OCaml user base (as they could benefit from MirageOS workflow -- especially cross-compilation -- in other situations).
>
> A significant part of this effort was transitioning from custom x-compilation runes to using Dune workspace via `opam-monorepo`. This migration was not always painless (to say the least), but we learned a few things that are now being applied to the new "package management" feature of Dune 4. Thus, we plan to continue to work on migrating from `opam-monorepo` to the `dune pkg` command to ensure it works for MirageOS users. This new command addresses the limitations identified in opam-monorepo, especially for packages not built with Dune. An alpha version is currently available (try `dune pkg` with Dune 3.12), and we anticipate a complete release in Q1 2024. We really want to ensure this is a drop-in replacement for Mirage's use of `opam-monorepo`, so we will work with upstream to ensure that is the case (and so we can deprecate opam-monorepo in Q2 2024).

I'm looking forward to removing opam-monorepo from the chain - but at the same time I'm worried that "dune pkg" will just have different bugs. I hope you managed to use the issues reported at opam-monorepo to construct a test suite on how "dune pkg" should not fail. Note that in opam-monorepo there were quite some issues with specific versions of the opam-repository (surely all of it is retrievable and reconstructible).

I am a bit worried, since the mirage3 -> mirage4 change (change of compilation strategy) had quite some impact on our reproducible build system (https://builds.robur.coop, https://github.com/robur-coop/orb) - now that opam-monorepo will be deprecated in favour of "dune pkg", my worry is that there's again quite some work needed on our end. It turned out that only mirage 4.2 was in a shape where we could use it for reproducible builds. But I'm sure you're aware of the issues that we encountered and pushed upstream to e.g. mirage, and how all the bits (mirage/opam/orb/opam-monorepo) currently fit together, and how "dune pkg" will fit in.

> Regarding mirage/functoria, my general feeling is that while the CLI tool was initially valuable for gathering an ecosystem of libraries, nowadays, it is less clear if this is still required. Right now, most of the tool's complexity is handling the installation of packages needed for a specific target/combination of devices. This will no longer be needed if the build system can do this instead. Ideally, any OCaml application (following a few design principles) could be compiled to a unikernel simply using Dune, as envisioned by the [Workflow W12](https://ocaml.org/docs/platform-roadmap#w12-compile-to-mirageos) of the OCaml Roadmap. However, there is no existing design on how this should work at this stage. So, before starting this, is that the right direction for the mirage tool?

My experience with Mirage - the tool - is that it does various things at once:
- figuring out dependencies and requirements of MirageOS devices, putting them into a boot order (what is generated as main.ml)
- selecting target-specific implementations (still, I thought someone wanted to revise this to use "dune variants", but haven't seen any demonstration thereof)
- command line arguments at configure and boot time (here, there have been various recent discussions, https://github.com/mirage/mirage/issues/1422, and a huge amount of reshuffling and implementation work by yourself -- which unfortunately doesn't seem to be ready yet (merged onto the main branch, but take a look at the regressions https://github.com/mirage/mirage/issues/1479 https://github.com/mirage/mirage/issues/1483), and looks slightly abandoned)
- cross-compiling/linking (using ocaml-solo5 with solo5 cross-compilation shell-script, and opam-monorepo to construct a monorepo)

Out of these 4 items, I'm not sure what "dune -x mirage" will attempt to solve. My goal is to make mirage - the tool - less smart about what it attempts to achieve, but I don't think that moving these bits into dune would be beneficial.

> ### Targets
>
> The principal target backend for MirageOS nowadays is Solo5. This is a solid backend, which has been audited and optimised for security. It is also relatively simple to add new devices given the by-design low-complexity approach of its device model.
>
> However, while solo5 is today the most secure unikernel "runtime", I also feel it has issues hindering potential changes. For one, it is slow -- the device model is not meant for high-speed I/O,

There have been contributions and attempts to implement that - see https://github.com/solo5-netmap/solo5/tree/netmap as an example. I don't quite get the "has issues hindering potential changes".

> and there is no support for SMP; for most use cases, it is not an issue, but for others we are looking at, it can be.

In the early days there was a fork of solo5 called hermitcore that added SMP. I'm curious why you didn't pick that up for your mail.

> The other one is that the device model is very simple (for good reasons) and challenging to extend to new devices (see below for more detail). In an ideal world, this could be fixable, but there are also very few courageous active maintainers, so any changes - like moving to OCaml5 - are complex to make.

Hmm, one thing clearly is that solo5 is lacking maintainers and contributors. The other thing is that ocaml-solo5 with its minimal (no)libc surely needs adaptation for the OCaml5 runtime rewrite (see the PR that has been around for ages).

Now, when you switch to unikraft, I'm not entirely sure what your tradeoff is. Does unikraft support SMP? Did you evaluate in detail the trusted code base differences between solo5 and unikraft?

> ### Devices and Libraries
>
> There are three areas that we would like to focus on (or continue to focus on) in the next couple of years.
>
> First, we still believe there are better abstractions for storing data than POSIX. Hence, we are continuing to develop and improve Irmin. We are currently porting `irmin-pack` to MirageOS (the backend of Irmin used by the Tezos blockchain to store its ledger history) via the [Notafs](https://github.com/tarides/notafs) project. Notafs is a pseudo filesystem for Mirage block devices. It can handle a small number of large files. While the limited number of filenames is unsatisfying for general usage, it can be used to run the irmin-pack backend of Irmin, which only requires a dozen huge files.
>
> By running Irmin, one gets for free a filesystem-on-steroids for MirageOS: it supports an arbitrarily large number of filenames; is optimised for small and large file contents; performs file deduplication; includes a git-like history with branching and merging, ... and it even provides a garbage collector to avoid running out of disk space (by deleting older commits).
>
> Since the Irmin filesystem is versioned by Merkle hashes, one can imagine deploying reproducible unikernels on reproducible filesystem states!

This makes me curious what you are trying to achieve with it. What exactly does a "reproducible filesystem" mean? What is the difference to a git repository of your "irmin-pack" (so why use irmin/notafs instead of a git repository)? How do you get a robust file system without "fsync"? What is the performance of notafs compared to a git repository?

Best,

Hannes