Xen project Mailing List

RE: Thoughts on cloud control APIs for Mirage

To: Anil Madhavapeddy <anil@xxxxxxxxxx>, Thomas Gazagnaire <thomas.gazagnaire@xxxxxxxxx>

From: Steven Hand <Steven.Hand@xxxxxxxxxxxx>

Date: Mon, 17 Oct 2011 13:41:49 +0100

Accept-language: en-US, en-GB

Cc: "cl-mirage@xxxxxxxxxxxxxxx" <cl-mirage@xxxxxxxxxxxxxxx>

List-id: MirageOS development <cl-mirage.lists.cam.ac.uk>

Thread-index: AcyMxN3AtSyTOI6WR825OKKq5GEW+AABRmAg

Thread-topic: Thoughts on cloud control APIs for Mirage

Funnily enough, the *original* model for XenStore was distributed, and
was inspired by Plan 9 name spaces. Nowadays I'd recommend looking into
Zookeeper for a more interesting kind of coordination space...

Cheers,

S.

-----Original Message-----
From: cl-mirage-bounces@xxxxxxxxxxxxxxx [mailto:cl-mirage-bounces@xxxxxxxxxxxxxxx] On Behalf Of Anil Madhavapeddy
Sent: Monday, October 17, 2011 1:04 PM
To: Thomas Gazagnaire
Cc: cl-mirage@xxxxxxxxxxxxxxx
Subject: Re: Thoughts on cloud control APIs for Mirage

Right... Xenstore is 'almost there' but not quite. For example, it has
transactions and a globally shared namespace, whereas the Plan 9 model
is to give processes their own namespace and mount other services into
that.

So if you have an HTTP server domain, it might export a
/http/server/recoil.org directory, and clients wanting to read a URL can
import that filesystem somewhere into their system and read files under
it. A HTTP proxy could then serialise that file into actual HTTP and
write it to a /net/tcp/555/data to respond to an external request. This
could all happen within the same kernel, or across multiple domains.

Xenstore will always require a globally privileged Xenstored to manage
the namespace, whereas the Plan 9 model is far better suited to multiple
intercommunicating processes (or stub domains).

I'm just thinking through the implications on consistency models across
a cluster of physical hosts at the moment though...

Anil

On 17 Oct 2011, at 07:57, Thomas Gazagnaire wrote:

> Basically you say we need Xenstore :-)
>
> putting the plan9 paper on my to-read list.
>
> --
> Thomas
>
> On Oct 17, 2011, at 1:45 PM, Anil Madhavapeddy wrote:
>
>> Mirage now has a number of protocols implemented as libraries, as
>> well as device drivers. What's missing is an effective control stack to
>> glue all this together into a proper OS.
>> So far, we are just wiring
>> together applications manually from the libraries, which is fine for
>> development but not for any real deployment.
>>
>> I've been re-reading the Plan 9 papers [1] for inspiration, and many of
>> the ideas there are highly applicable to us. To realise the Mirage goal of
>> synthesising microkernels that are 'minimal for purpose', we need to:
>>
>> - have multiple intercommunicating components, separated by process
>>   boundaries (on UNIX) or VM isolation (on Xen), or simply a function
>>   call compiled as part of the same kernel.
>>
>> - minimise information flow between components, so they can be
>>   dynamically split up ('self scaling') or combined more easily.
>>
>> - deal with the full lifecycle of all these VMs and processes, and not
>>   just spawning them.
>>
>> Plan 9 was built on very similar principles: instead of a big monolithic
>> kernel, the system is built on many processes that communicate via a
>> well-defined wire protocol (9P), and per-process namespaces and filesystem
>> abstractions for almost every service. For example, instead of 'ifconfig',
>> the network is simply exposed as a /net filesystem and configured through
>> filesystem calls rather than an alternative command line. Crucially, the
>> 9P protocol can be remotely called, or directly via a simple function call
>> (for direct in-kernel operations).
>>
>> In contrast, modern cloud stacks are just terribly designed: they consist
>> of a huge amount of static specification of VM and network state, with
>> little attention paid to simple UNIX/Plan9 principles that can be used to
>> build the more complicated abstractions.
>>
>> So, this leaves us with an interesting opportunity: to implement the
>> Mirage control interface using similar principles:
>>
>> - a per-deployment global hierarchical tree (i.e. a filesystem), with ways
>>   to synchronise on entries (i.e. blocking I/O, or a select/poll
>>   equivalent).
>>   Our consistency model may vary somewhat, as we could be
>>   strongly consistent between VMs when running on the same physical host,
>>   and more loose cluster-wide.
>>
>> - every library exposes a set of keys and values, as well as a mechanism
>>   for session setup, authentication and teardown (the lifecycle of the
>>   process). Plan 9 used ASCII for everything, whereas Mirage would layer
>>   a well-typed API on top of it (e.g. just write a record to a file rather
>>   than manually serialising it).
>>
>> - extend the Xen Cloud Platform to support delegation, so that microVMs
>>   can be monitored or killed by supervisors. Unlike Plan9, this also
>>   includes operations across physical hosts (e.g. live relocation), or
>>   across cloud providers.
>>
>> There are some nice implications of this work that go beyond Mirage:
>>
>> - it generally applies to all of the exokernel libraries out there,
>>   including HalVM (Haskell) or GuestVM (Java), as they all have this
>>   control problem that makes manipulating raw kernels such a pain to do.
>>
>> - it can easily be extended to support existing applications on a
>>   monolithic guest kernel, and make it easier to manage them too.
>>
>> - application synthesis becomes much more viable: this approach could let
>>   me build a HTTP microkernel without a TCP stack, and simply receive a
>>   typed RPC from a HTTP proxy (which has done all the work of parsing the
>>   TCP and HTTP bits, so why repeat it?). If my HTTP server microkernel
>>   later live migrates away, then it could swap back to a network connection.
>>
>>   Modern cloudy applications (especially Hadoop or CIEL) use HTTP very
>>   heavily to talk between components, so optimising this part of the stack
>>   is worthwhile (numbers needed!)
>>
>> - Even if components are compiled up in the same binary and use function
>>   calls, they still have to establish and authenticate connections to each
>>   other.
>>   This makes monitoring and scaling hugely easier, since the
>>   control filesystem operations provide a natural logging and introspection
>>   point, even for large clusters. If we had a hardware-capability-aware
>>   CPU in the future, it could use this information too :-)
>>
>> I highly recommend that anyone interested in this area read the Plan 9
>> paper, as it's a really good read anyway [1]. Also the Scout OS and
>> x-kernel stack are good. Our main difference from this work is the
>> heavy emphasis on type-safe components, as well as realistic deployment
>> due to the use of Xen cloud providers as a stable hardware interface.
>>
>> In the very short-term, Mort and I have an OpenFlow tutorial coming up in
>> mid-November, so I'll lash up the network stack to have a manual version
>> of this as soon as possible, so that you can configure all the tap
>> interfaces and such much more quickly. Meanwhile, all and any thoughts
>> are most welcome!
>>
>> [1] Plan 9 papers: http://cm.bell-labs.com/sys/doc/
>>
>> --
>> Anil Madhavapeddy http://anil.recoil.org
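
For readers who want something concrete, the control tree proposed in the original message (a hierarchical tree, a way to block on entries, and a typed layer over plain-text values) might be sketched roughly as below. This is only an illustration under stated assumptions, not anything from Mirage, Xenstore or Plan 9: the names ControlTree, wait_for, write_record, read_record and HttpServerInfo are hypothetical, JSON is an arbitrary stand-in for the serialisation format, and Python is used for brevity rather than OCaml. It covers only the single-host, strongly consistent case, and says nothing about per-process namespaces or cluster-wide consistency.

```python
import json
import threading
from dataclasses import asdict, dataclass


class ControlTree:
    """Toy in-memory hierarchical key/value tree with blocking reads.

    Hypothetical sketch: single process, strongly consistent, no
    authentication and no per-process namespaces.
    """

    def __init__(self):
        self._entries = {}                     # "/a/b/c" -> string value
        self._changed = threading.Condition()  # notified on every write

    @staticmethod
    def _norm(path):
        # Canonical form: one leading slash, no empty components.
        return "/" + "/".join(p for p in path.split("/") if p)

    def write(self, path, value):
        with self._changed:
            self._entries[self._norm(path)] = value
            self._changed.notify_all()

    def read(self, path):
        with self._changed:
            return self._entries[self._norm(path)]

    def ls(self, path):
        """Sorted names of the immediate children of `path`."""
        prefix = self._norm(path).rstrip("/") + "/"
        with self._changed:
            return sorted({key[len(prefix):].split("/")[0]
                           for key in self._entries
                           if key.startswith(prefix)})

    def wait_for(self, path, timeout=None):
        """Block until `path` exists; return its value, or None on timeout."""
        key = self._norm(path)
        with self._changed:
            if self._changed.wait_for(lambda: key in self._entries, timeout):
                return self._entries[key]
            return None


@dataclass
class HttpServerInfo:
    """Hypothetical record a library might publish under its subtree."""
    host: str
    port: int


def write_record(tree, path, record):
    # The typed layer serialises on the caller's behalf (JSON is an assumption).
    tree.write(path, json.dumps(asdict(record)))


def read_record(tree, path, cls):
    return cls(**json.loads(tree.read(path)))


if __name__ == "__main__":
    tree = ControlTree()
    # A "server" publishes itself shortly after a "supervisor" starts waiting.
    threading.Timer(0.1, write_record,
                    args=(tree, "/http/server/recoil.org",
                          HttpServerInfo("recoil.org", 80))).start()
    tree.wait_for("/http/server/recoil.org", timeout=5)
    print(read_record(tree, "/http/server/recoil.org", HttpServerInfo))
```

Because every read, write and wait goes through the same small interface, that interface is also where logging or introspection hooks would naturally sit, which is the monitoring point made in the original message.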
