
Re: [MirageOS-devel] Some thoughts on operating unikernel environments



On 22 August 2015 at 18:37, Gareth Rushgrove <gareth@xxxxxxxxxxxxxxxxx> wrote:
> On 22 August 2015 at 12:34, Thomas Leonard <talex5@xxxxxxxxx> wrote:
>> On 21 August 2015 at 17:07, Gareth Rushgrove <gareth@xxxxxxxxxxxxxxxxx> wrote:
>>> I'd managed to get a bunch of thoughts out of my head and into blog
>>> post form, on the theme of operating unikernels.
>>>
>>> The general gist is, assuming unikernels are awesome, how do we build
>>> and run production systems based on them?
>>>
>>> http://www.morethanseven.net/2015/08/21/operating-unikernel-challenges/
>>>
>>> This is mainly a list of problems; I'd love to hear from anyone who
>>> has done any hard thinking on any of them or cut any tools in this
>>> space.
>>
>> Hi Gareth,
>>
>> A few thoughts:
>>
>
> Thanks for replying.
>
>>
>> "How do I compose several unikernels together to build an application?"
>>
>> I think you answer this later, in the Orchestration section: the same
>> way we do with other VMs/containers - using Docker Compose, Ubuntu
>> Juju, etc. I haven't built anything big enough to need this yet
>> though.
>>
>
> That's my view as well (CloudFoundry or Kubernetes model would appear
> to work?) but I've not seen anyone doing this yet. Which probably
> means gaps exist when you actually try :) If anyone takes a run at
> this I'd certainly be interested, I'm guessing Lattice
> [http://lattice.cf/] might be a nice place to start?

Me too. Since Mirage services can also be compiled as Unix binaries it
should be possible to test deployment configuration using existing
systems right now. Then, it's "just" a matter of teaching the
deployment system to deploy unikernel VMs directly, rather than
deploying Linux VMs containing the service.

>> What does a Continuous integration or deployment pipeline look like?
>>
>> Amir gives an example in "Towards Heroku for Unikernels: Part 1 -
>> Automated deployment":
>>
>> http://amirchaudhry.com/heroku-for-unikernels-pt1/
>
> While an example of what's possible I don't think this is the highly
> opinionated high-level interface that would be required to make it
> easy to get started. Githooks, Makefiles and shell scripts are great
> for prototypes but don't tend to make for a great experience in my
> view. The skeleton is great, but only covers running unit tests and
> only on Travis. Test Kitchen [http://kitchen.ci/] is maybe a nice
> model to look at - as a thought experiment "what would Test Kitchen
> for Mirage look like?"

Amir, any thoughts on this? I don't see any reason why the deployment
scripts can't be made generic and packaged up.

>> "By removing the operating system we remove things like host firewalls ..."
>>
>> I see two main uses for firewalls. One is to avoid accidentally
>> exposing a host-only service (e.g. a database used by a web app in the
>> same VM) and the other is to provide basic access control between VMs
>> (only the web VM can access the DB VM).
>>
>> For the first, two services in the same Mirage unikernel will
>> communicate directly using OCaml datatypes. When everything is a
>> library, using a network for internal communication would be crazy.
>
> At any degree of scale though you're going to be running many
> unikernels across many hosts - so some degree of network communication
> is going to be required (even if you minimise it with locality). Also,
> in most environments some of that integration is going to be with
> non-mirage/ocaml based systems and/or not running on the same
> hosts/datacenters.
>
>> Also, while Linux allows any process to listen on the network, Mirage
>> uses dependency injection so that only components that need network
>> access will be given it.
>>
>
> Yup, which is great. My thoughts were mainly about the second issue...
>
>> For the second, whatever is composing the services should configure
>> the network, in my opinion. In other words, if I say I want my web
>> server VM connected to a database VM, then nothing else should have
>> access to the DB VM.
>>
>> I would certainly like to see a higher-level API for networking, that
>> doesn't allow unexpected connections. e.g. we currently offer services
>> a low-level network API like:
>>
>>   val connect : network -> ipaddr -> port -> flow
>>   val listen : network -> port -> callback -> unit
>>
>> With this API, a library with network access can connect anywhere in
>> the world by supplying any IP address and port number, and must handle
>> its own encryption. A higher-level capability-style API could offer
>> something more abstract, e.g.
>>
>>   module type SturdyRef = sig
>>     type t
>>     val connect : t -> flow
>>   end
>>
>> Here, our web server would simply get a SturdyRef.t for the database,
>> and all it could do would be to connect to it.
>>
>
> Agreed. I just want something like this to exist :)
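
A minimal sketch of how such a capability-style API might be layered
over the low-level one, in case it helps make the idea concrete. The
Network and Ref modules, the flow record and the make function are all
hypothetical stand-ins invented for this example; none of them is an
existing Mirage interface.

```ocaml
(* Toy stand-ins for the low-level API quoted above. Every name in this
   sketch is hypothetical and not part of any Mirage library. *)
type ipaddr = string
type port = int
type flow = { peer : ipaddr * port }

module Network : sig
  type t
  val create : unit -> t
  val connect : t -> ipaddr -> port -> flow
end = struct
  type t = unit
  (* A real implementation would open a TCP connection here. *)
  let create () = ()
  let connect () ip port = { peer = (ip, port) }
end

module type SturdyRef = sig
  type t
  val connect : t -> flow
end

(* Only code that already holds a Network.t (the composer) can mint a
   reference, because make requires one. *)
module Ref : sig
  include SturdyRef
  val make : Network.t -> ipaddr -> port -> t
end = struct
  (* The destination is fixed inside a closure when the ref is made. *)
  type t = unit -> flow
  let make net ip port = fun () -> Network.connect net ip port
  let connect r = r ()
end

(* The web server is handed a reference only. It never sees the network
   or an address, so it cannot choose a different destination. *)
let web_server (db : Ref.t) : flow = Ref.connect db

let () =
  let net = Network.create () in
  let db = Ref.make net "10.0.0.2" 5432 in
  let flow = web_server db in
  assert (flow.peer = ("10.0.0.2", 5432))
```

Because Ref.t is abstract, the only thing a holder can do with it is
call Ref.connect. Encryption could be added inside make without the web
server having to change.
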
>
> I also think unikernels could make for really nice network devices
> (firewalls, security controls, proxies, etc.)

Yes. Here's a simple unikernel for a NAT device, for example:

  https://github.com/yomimono/simple-nat

> In my experience, lots of people find the network to be the limiting
> factor when they start down a microservices rabbit hole. It might be
> interesting to consider how unikernels would work with some of the
> newer players in this space, like Weave [http://weave.works/] or
> Calico [http://www.projectcalico.org/].
>
>>
>> What does debugging a system based on unikernels look like?
>>
>> There's an example here: https://mirage.io/wiki/profiling
>>
>> "As a motivating example, we'll track down a (real, but now fixed) bug
>> in MirageOS's TCP stack."
>>
>
> From an operator's point of view that's not really the same thing. The
> issues I see:
>
> * enabling it requires recompilation and redeployment (although you
> could probably put this behind some sort of feature flag?)

It can be enabled and disabled at run-time, but there's still a
performance cost to having this kind of very detailed tracing
available. I'd certainly like to see more support for general logging
and metrics (the kind of thing you keep on all the time).

On the other hand, I think you should be prepared to recompile and
redeploy your unikernels when needed, and that shouldn't be a big
deal. Trying to modify and redeploy a Linux kernel to get extra debug output
is a nightmare, but with a unikernel it can be very easy.

> * it's not interactive

You can refresh the view while it's running, so if you have something
you can tweak dynamically, you can see what effect it's having.

> I think the first is interesting, as the unikernel you're running
> might be provided by a third party vendor and you might not have the
> source code/right to modify/recompile. Or changes might require a
> lengthy change approval process.

Yes, for binary-only releases you have to compile any needed debug
code into it at all times (or provide a separate debug build).

> The second might be a matter of debugging at the hypervisor/xen layer
> but I've limited experience there. That also raises isolation issues -
> I probably want to limit access to the hypervisor more than to an
> individual application instance.
>
> I'm obviously mainly in critique mode with the post and points above.
> My main interest is in getting anyone thinking about operational
> problems early; in my view it's a pretty interesting set of issues for
> which good solutions undoubtedly exist.

We need more experience reports here. In my case, all problems have been one of:

- Why did this take so long? (the disk driver didn't support large
requests and had to split them; the TCP stack set the retransmission
timeout too long)

- Why did this fail? (some exception details got ignored and replaced
by a generic error; I want to see the original)

- Why didn't this ever finish? (the ARP reply arrived before we
started waiting for it)

These questions can all be answered with the existing tracing. What
kind of interactive debugging would be helpful for you?

When anyone has a hard-to-diagnose problem, I'm interested to see how
Mirage's tracing or error reporting could be improved to make the
problem obvious.


-- 
Dr Thomas Leonard        http://roscidus.com/blog/
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA
