Re: Thoughts on cloud control APIs for Mirage
I've got the 'Life and times of a Zookeeper' paper on my iPad for reading
next, in fact! :-)

In Plan 9, the data for streams was handled through files as well as the
coordination and control, whereas Zookeeper is very much a centralised name
service that runs distributed. I wonder if there may be a useful middle
ground between the name service (distributed, with a central controller and
leadership) and the local filesystem views themselves (which seem more
convenient for a library to manipulate, and could be hardwired easily if the
system is not distributed).

-anil

On 17 Oct 2011, at 08:41, Steven Hand wrote:

> Funnily enough, the *original* model for XenStore was distributed, and was
> inspired by Plan 9 name spaces.
>
> Nowadays I'd recommend looking into Zookeeper for a more interesting kind
> of coordination space...
>
> Cheers,
>
> S.
>
> -----Original Message-----
> From: cl-mirage-bounces@xxxxxxxxxxxxxxx
> [mailto:cl-mirage-bounces@xxxxxxxxxxxxxxx] On Behalf Of Anil Madhavapeddy
> Sent: Monday, October 17, 2011 1:04 PM
> To: Thomas Gazagnaire
> Cc: cl-mirage@xxxxxxxxxxxxxxx
> Subject: Re: Thoughts on cloud control APIs for Mirage
>
> Right... Xenstore is 'almost there' but not quite. For example, it has
> transactions and a globally shared namespace, whereas the Plan 9 model is
> to give each process its own namespace and let it mount other services
> into that.
>
> So if you have an HTTP server domain, it might export a
> /http/server/recoil.org directory, and clients wanting to read a URL can
> import that filesystem somewhere into their own namespace and read files
> under it. An HTTP proxy could then serialise such a file into actual HTTP
> and write it to /net/tcp/555/data to respond to an external request. This
> could all happen within the same kernel, or across multiple domains.
>
> Xenstore will always require a globally privileged Xenstored to manage the
> namespace, whereas the Plan 9 model is far better suited to multiple
> intercommunicating processes (or stub domains). I'm just thinking through
> the implications for consistency models across a cluster of physical hosts
> at the moment, though...
>
> Anil
>
> On 17 Oct 2011, at 07:57, Thomas Gazagnaire wrote:
>
>> Basically you say we need Xenstore :-)
>>
>> Putting the plan9 paper on my to-read list.
>>
>> --
>> Thomas
>>
>> On Oct 17, 2011, at 1:45 PM, Anil Madhavapeddy wrote:
>>
>>> Mirage now has a number of protocols implemented as libraries, as well
>>> as device drivers. What's missing is an effective control stack to glue
>>> all this together into a proper OS. So far, we are just wiring together
>>> applications manually from the libraries, which is fine for development
>>> but not for any real deployment.
>>>
>>> I've been re-reading the Plan 9 papers [1] for inspiration, and many of
>>> the ideas there are highly applicable to us. To realise the Mirage goal
>>> of synthesising microkernels that are 'minimal for purpose', we need to:
>>>
>>> - have multiple intercommunicating components, separated by process
>>>   boundaries (on UNIX), by VM isolation (on Xen), or simply by a
>>>   function call when compiled as part of the same kernel (a sketch of
>>>   what that interface might look like follows this list).
>>>
>>> - minimise information flow between components, so they can be
>>>   dynamically split up ('self-scaling') or combined more easily.
>>>
>>> - deal with the full lifecycle of all these VMs and processes, and not
>>>   just spawning them.
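>>>
>>> As a very rough illustration of the first point, here is what that
>>> control channel might look like. This is purely hypothetical OCaml --
>>> none of these modules exist, and Lwt is assumed only because we already
>>> use it everywhere:
>>>
>>>   (* Hypothetical sketch: one signature for the control channel between
>>>      components, with the transport chosen when the kernel is
>>>      synthesised. *)
>>>   module type CONTROL_TRANSPORT = sig
>>>     type t
>>>     val connect : name:string -> t Lwt.t       (* session setup/auth *)
>>>     val call    : t -> string -> string Lwt.t  (* one request/response *)
>>>     val close   : t -> unit Lwt.t              (* session teardown *)
>>>   end
>>>
>>>   (* The degenerate instance: both components are compiled into the
>>>      same kernel, so 'connect' is a table lookup and 'call' is just a
>>>      function call. Other instances would sit on a UNIX socket or a Xen
>>>      shared-memory ring. *)
>>>   module In_kernel : sig
>>>     include CONTROL_TRANSPORT
>>>     val register : string -> (string -> string Lwt.t) -> unit
>>>   end = struct
>>>     type t = string -> string Lwt.t
>>>     let registry : (string, t) Hashtbl.t = Hashtbl.create 7
>>>     let register name fn = Hashtbl.replace registry name fn
>>>     let connect ~name =
>>>       try Lwt.return (Hashtbl.find registry name)
>>>       with Not_found -> Lwt.fail (Failure ("no such component: " ^ name))
>>>     let call fn request = fn request
>>>     let close _ = Lwt.return ()
>>>   end
>>>
>>> A Xen ring or UNIX socket instance would keep exactly the same
>>> signature, so a component never needs to know whether its peer is a
>>> function, a process or another domain.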
>>>
>>> Plan 9 was built on very similar principles: instead of a big monolithic
>>> kernel, the system is built from many processes that communicate via a
>>> well-defined wire protocol (9P), with per-process namespaces and
>>> filesystem abstractions for almost every service. For example, instead
>>> of 'ifconfig', the network is simply exposed as a /net filesystem and
>>> configured through filesystem calls rather than a separate command-line
>>> tool. Crucially, the 9P protocol can be called remotely, or directly via
>>> a simple function call (for in-kernel operations).
>>>
>>> In contrast, modern cloud stacks are just terribly designed: they
>>> consist of a huge amount of static specification of VM and network
>>> state, with little attention paid to the simple UNIX/Plan 9 principles
>>> that could be used to build the more complicated abstractions.
>>>
>>> So, this leaves us with an interesting opportunity: to implement the
>>> Mirage control interface using similar principles:
>>>
>>> - a per-deployment global hierarchical tree (i.e. a filesystem), with
>>>   ways to synchronise on entries (i.e. blocking I/O, or a select/poll
>>>   equivalent). Our consistency model may vary somewhat, as we could be
>>>   strongly consistent between VMs running on the same physical host, and
>>>   looser cluster-wide.
>>>
>>> - every library exposes a set of keys and values, as well as a mechanism
>>>   for session setup, authentication and teardown (the lifecycle of the
>>>   process). Plan 9 used ASCII for everything, whereas Mirage would layer
>>>   a well-typed API on top of it (e.g. just write a record to a file
>>>   rather than manually serialising it -- see the sketch further down).
>>>
>>> - extend the Xen Cloud Platform to support delegation, so that microVMs
>>>   can be monitored or killed by supervisors. Unlike Plan 9, this also
>>>   includes operations across physical hosts (e.g. live relocation), or
>>>   across cloud providers.
>>>
>>> There are some nice implications of this work that go beyond Mirage:
>>>
>>> - it generally applies to all of the exokernel libraries out there,
>>>   including HalVM (Haskell) and GuestVM (Java), as they all share this
>>>   control problem that makes manipulating raw kernels such a pain.
>>>
>>> - it can easily be extended to support existing applications on a
>>>   monolithic guest kernel, and make them easier to manage too.
>>>
>>> - application synthesis becomes much more viable: this approach could
>>>   let me build an HTTP microkernel without a TCP stack, and simply
>>>   receive a typed RPC from an HTTP proxy (which has done all the work of
>>>   parsing the TCP and HTTP bits, so why repeat it?). If my HTTP server
>>>   microkernel later live-migrates away, it could swap back to a network
>>>   connection.
>>>
>>>   Modern cloudy applications (especially Hadoop or CIEL) use HTTP very
>>>   heavily to talk between components, so optimising this part of the
>>>   stack is worthwhile (numbers needed!).
>>>
>>> - even if components are compiled into the same binary and use function
>>>   calls, they still have to establish and authenticate connections to
>>>   each other. This makes monitoring and scaling hugely easier, since the
>>>   control filesystem operations provide a natural logging and
>>>   introspection point, even for large clusters. If we had a
>>>   hardware-capability-aware CPU in the future, it could use this
>>>   information too :-)
>>>
>>> I highly recommend that anyone interested in this area read the Plan 9
>>> paper, as it's a really good read anyway [1].
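>>>
>>> To make the 'well-typed API on top of the tree' point above a little
>>> more concrete, here is another purely hypothetical sketch (again
>>> assuming Lwt; none of these names exist anywhere yet) of a typed node
>>> in the control tree, with a blocking read standing in for the
>>> select/poll equivalent:
>>>
>>>   (* A made-up record that the network library might want to publish. *)
>>>   type vif_config = {
>>>     bridge    : string;
>>>     mac       : string;
>>>     rate_kbps : int option;
>>>   }
>>>
>>>   module Ctree = struct
>>>     type 'a node = {
>>>       path : string list;           (* e.g. ["vm"; "www1"; "vif"; "0"] *)
>>>       mutable value : 'a option;
>>>       cond : unit Lwt_condition.t;  (* woken on every write *)
>>>     }
>>>
>>>     let make path =
>>>       { path; value = None; cond = Lwt_condition.create () }
>>>
>>>     (* typed write: the library hands over a record, not a string *)
>>>     let write node v =
>>>       node.value <- Some v;
>>>       Lwt_condition.broadcast node.cond ()
>>>
>>>     (* blocking read: waits until a value has been written to the node *)
>>>     let rec read node =
>>>       match node.value with
>>>       | Some v -> Lwt.return v
>>>       | None   -> Lwt.bind (Lwt_condition.wait node.cond)
>>>                     (fun () -> read node)
>>>   end
>>>
>>>   (* The network library publishes a typed record; a supervisor blocks
>>>      on the node until it appears. *)
>>>   let vif0 = Ctree.make ["vm"; "www1"; "vif"; "0"]
>>>
>>>   let supervisor () =
>>>     Lwt.bind (Ctree.read vif0) (fun c ->
>>>       Printf.printf "bringing up vif on bridge %s\n" c.bridge;
>>>       Lwt.return ())
>>>
>>>   let publish () =
>>>     Ctree.write vif0
>>>       { bridge = "xenbr0"; mac = "00:16:3e:00:00:01"; rate_kbps = None }
>>>
>>> No manual serialisation appears on either side; whatever the wire ends
>>> up being (9P, XenStore or something else entirely) stays hidden inside
>>> the tree implementation.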
>>>
>>> The Scout OS and x-kernel stack are also good reading. Our main
>>> difference from that work is the heavy emphasis on type-safe components,
>>> as well as realistic deployment, due to the use of Xen cloud providers
>>> as a stable hardware interface.
>>>
>>> In the very short term, Mort and I have an OpenFlow tutorial coming up
>>> in mid-November, so I'll lash up the network stack to have a manual
>>> version of this as soon as possible, so that you can configure all the
>>> tap interfaces and such much more quickly. Meanwhile, any and all
>>> thoughts are most welcome!
>>>
>>> [1] Plan 9 papers: http://cm.bell-labs.com/sys/doc/
>>>
>>> --
>>> Anil Madhavapeddy  http://anil.recoil.org