Re: Thoughts on cloud control APIs for Mirage
Basically you say we need Xenstore :-) Putting the Plan 9 paper on my to-read list.

-- Thomas

On Oct 17, 2011, at 1:45 PM, Anil Madhavapeddy wrote:

> Mirage now has a number of protocols implemented as libraries, as well
> as device drivers. What's missing is an effective control stack to glue
> all this together into a proper OS. So far, we are just wiring together
> applications manually from the libraries, which is fine for development
> but not for any real deployment.
>
> I've been re-reading the Plan 9 papers [1] for inspiration, and many of
> the ideas there are highly applicable to us. To realise the Mirage goal
> of synthesising microkernels that are 'minimal for purpose', we need to:
>
> - have multiple intercommunicating components, separated by process
>   boundaries (on UNIX), by VM isolation (on Xen), or simply by a
>   function call when compiled as part of the same kernel.
>
> - minimise information flow between components, so they can be
>   dynamically split up ('self-scaling') or combined more easily.
>
> - deal with the full lifecycle of all these VMs and processes, not just
>   spawning them.
>
> Plan 9 was built on very similar principles: instead of a big
> monolithic kernel, the system is built from many processes that
> communicate via a well-defined wire protocol (9P), with per-process
> namespaces and filesystem abstractions for almost every service. For
> example, instead of 'ifconfig', the network is simply exposed as a /net
> filesystem and configured through filesystem calls rather than a
> separate command-line tool. Crucially, the 9P protocol can be called
> remotely, or invoked directly via a simple function call (for in-kernel
> operations).
>
> In contrast, modern cloud stacks are just terribly designed: they
> consist of a huge amount of static specification of VM and network
> state, with little attention paid to the simple UNIX/Plan 9 principles
> that could be used to build the more complicated abstractions.
>
> So, this leaves us with an interesting opportunity: to implement the
> Mirage control interface using similar principles:
>
> - a per-deployment global hierarchical tree (i.e. a filesystem), with
>   ways to synchronise on entries (i.e. blocking I/O, or a select/poll
>   equivalent). Our consistency model may vary somewhat: we could be
>   strongly consistent between VMs running on the same physical host,
>   and more loosely consistent cluster-wide.
>
> - every library exposes a set of keys and values, as well as a
>   mechanism for session setup, authentication and teardown (the
>   lifecycle of the process). Plan 9 used ASCII for everything, whereas
>   Mirage would layer a well-typed API on top of it (e.g. just write a
>   record to a file rather than manually serialising it).
>
> - extend the Xen Cloud Platform to support delegation, so that microVMs
>   can be monitored or killed by supervisors. Unlike Plan 9, this also
>   includes operations across physical hosts (e.g. live relocation), or
>   across cloud providers.
>
> There are some nice implications of this work that go beyond Mirage:
>
> - it applies generally to all of the exokernel libraries out there,
>   including HalVM (Haskell) and GuestVM (Java), as they all have this
>   control problem that makes manipulating raw kernels such a pain.
>
> - it can easily be extended to support existing applications on a
>   monolithic guest kernel, and make it easier to manage them too.
>
> - application synthesis becomes much more viable: this approach could
>   let me build an HTTP microkernel without a TCP stack, and simply
>   receive a typed RPC from an HTTP proxy (which has already done all
>   the work of parsing the TCP and HTTP bits, so why repeat it?). If my
>   HTTP server microkernel later live-migrates away, it could swap back
>   to a network connection.
>
>   Modern cloudy applications (especially Hadoop or CIEL) use HTTP very
>   heavily to talk between components, so optimising this part of the
>   stack is worthwhile (numbers needed!).
>
> - even if components are compiled into the same binary and use function
>   calls, they still have to establish and authenticate connections to
>   each other. This makes monitoring and scaling hugely easier, since
>   the control filesystem operations provide a natural logging and
>   introspection point, even for large clusters. If we had a
>   hardware-capability-aware CPU in the future, it could use this
>   information too :-)
>
> I highly recommend that anyone interested in this area read the Plan 9
> paper, as it's a really good read anyway [1]. The Scout OS and x-kernel
> work are also good. Our main difference from this work is the heavy
> emphasis on type-safe components, as well as realistic deployment due
> to the use of Xen cloud providers as a stable hardware interface.
>
> In the very short term, Mort and I have an OpenFlow tutorial coming up
> in mid-November, so I'll lash up the network stack to have a manual
> version of this as soon as possible, so that you can configure all the
> tap interfaces and such much more quickly. Meanwhile, any and all
> thoughts are most welcome!
>
> [1] Plan 9 papers: http://cm.bell-labs.com/sys/doc/
>
> --
> Anil Madhavapeddy                              http://anil.recoil.org
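
To make the "well-typed API layered over a key/value tree" idea above a bit more
concrete, here is a minimal OCaml sketch. Everything in it is hypothetical: the
CONTROL signature, the paths and the ipv4_config record are invented for
illustration and are not an existing Mirage or Xenstore API; the only real
library used is Lwt (Lwt.t, Lwt.bind, Lwt.return).

    (* Hypothetical control-tree interface: a hierarchical key/value store
       with blocking watches, in the spirit of Xenstore / Plan 9 files.
       None of these names exist in Mirage today. *)

    type path = string list   (* e.g. [ "net"; "eth0"; "ipv4" ] *)

    module type CONTROL = sig
      (* Raw string layer, analogous to Plan 9's ASCII files. *)
      val read  : path -> string option Lwt.t
      val write : path -> string -> unit Lwt.t

      (* Block until the entry at [path] changes: the select/poll equivalent. *)
      val watch : path -> string Lwt.t

      (* Session lifecycle: setup, authentication and teardown. *)
      val connect    : name:string -> credentials:string -> unit Lwt.t
      val disconnect : unit -> unit Lwt.t
    end

    (* The well-typed layer on top: write a record to a 'file' rather than
       serialising it by hand. *)
    type ipv4_config = {
      address : int32;
      netmask : int32;
      gateway : int32 option;
    }

    module Typed (C : CONTROL) = struct
      (* Placeholder (de)serialisers; in practice these would be derived
         from the type definition. *)
      let ipv4_of_string (_ : string) : ipv4_config option = None
      let string_of_ipv4 (_ : ipv4_config) : string = ""

      let read_ipv4 intf =
        Lwt.bind (C.read [ "net"; intf; "ipv4" ]) (fun v ->
          Lwt.return (match v with
            | None   -> None
            | Some s -> ipv4_of_string s))

      let write_ipv4 intf cfg =
        C.write [ "net"; intf; "ipv4" ] (string_of_ipv4 cfg)

      (* Wait for someone else to reconfigure the interface. *)
      let wait_for_ipv4 intf =
        Lwt.bind (C.watch [ "net"; intf; "ipv4" ]) (fun s ->
          Lwt.return (ipv4_of_string s))
    end

A UNIX backend could satisfy CONTROL over a real filesystem or 9P, a Xen backend
over Xenstore-style shared memory, and a single-kernel build with plain function
calls, which is exactly the split-or-combine flexibility described in the mail.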
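The "HTTP microkernel without a TCP stack" point can be sketched the same way.
Again, every name below is invented for illustration; the point is only that
the application is written against a typed request/response interface, so the
transport behind it (a typed hand-off from a front-end proxy, or a full network
stack after live migration) can be swapped without touching the handler.

    (* Hypothetical typed RPC handed over by an HTTP front-end proxy. *)
    type http_request = {
      meth    : [ `GET | `POST ];
      uri     : string;
      headers : (string * string) list;
      body    : string;
    }

    type http_response = {
      status    : int;
      r_headers : (string * string) list;
      r_body    : string;
    }

    (* The application is just a function from requests to responses; it
       neither knows nor cares whether requests arrive as typed RPCs from
       a proxy VM or via its own TCP/HTTP stack. *)
    type handler = http_request -> http_response Lwt.t

    (* Interchangeable transports, chosen via the control tree at boot. *)
    module type TRANSPORT = sig
      val listen : handler -> unit Lwt.t
    end

    let hello : handler = fun _req ->
      Lwt.return { status    = 200;
                   r_headers = [ "content-type", "text/plain" ];
                   r_body    = "hello from a TCP-less microkernel\n" }

The proxy parses TCP and HTTP once; the microkernel links only the typed
interface, which keeps it 'minimal for purpose'.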