[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [MirageOS-devel] RFC: disconnect / DEVICE lifetime



On 10 October 2016 at 12:41, Hannes Mehnert <hannes@xxxxxxxxxxx> wrote:
> Hello,
>
> On 30/09/2016 18:13, Hannes Mehnert wrote:
>> So far, connect returns [ `Ok of 'a | `Error of 'b ] Lwt.t, and in the
>> error case functoria emits a "fail (Failure <devicename>)".  This is
>> clearly not very informative to potential users (an example being
>> "Failure net11").
>
> Thanks for everybody involved getting this change through.
>
> I thought a bit more about the `DEVICE` module type.  I'm curious what
> `disconnect : t -> unit io` is supposed to do, and who is supposed to
> call it?  Its counterpart `connect` is called by auto-generated code
> (from mirage) in `main.ml` during startup.  Should mirage as well
> generate a `shutdown` function and invoke it after `start` finished?
> This could ensure that disconnect is called in the reverse order of
> connect, and once only.
>
> My intuition about the contract of disconnect is: cleaning up state,
> freeing resources (e.g. sending consumers messages about terminating
> (TCP? TLS?) sessions, stopping Lwt listener threads, etc.)!?  Initially
> [0] it was not intended to fully clean up resources!?
>
> This leads to the lifetime of a DEVICE instance:  they can be demanded
> by a unikernel (during configuration/build time), and must be present
> (and in working operation) during runtime.  (In some future OCaml it'd
> be nice to have linear types and be able to express that disconnect
> consumes the given t.  But we're not there yet.)
> There's at no time any need to dynamically allocate a DEVICE and use it
> temporarily within the lifetime of a unikernel (please tell me if I'm
> wrong and you've scenarios in mind where you want to spawn DEVICEs
> dynamically [*]).

mirage-firewall starts new network devices as new client VMs turn up.

> This also means that a user should never call disconnect on any DEVICE
> where main.ml called connect (and there are no other DEVICEs).

Here's one possibility:

Add a `~switch` argument to the `connect` functions and have devices
disconnect when the switch is turned off. main.ml can pass a single
switch to everything it starts, and turn it off at the end. Remove the
`disconnect` function from DEVICE.

The mirage-net-xen netback driver already works this way internally,
as a reliable way to clean up resources if the connect step fails at
any point:

https://github.com/mirage/mirage-net-xen/blob/c2f4f73be0613222468d19b852d69226cde78cbf/lib/backend.ml#L108

and also to clean up everything when the client disconnects:

https://github.com/mirage/mirage-net-xen/blob/c2f4f73be0613222468d19b852d69226cde78cbf/lib/backend.ml#L105

It also makes it clear that it's the caller of connect who is
responsible for disconnecting, not the user of the device.

> This leads to the question what a disconnected DEVICE should do?  I'd be
> in favour of undefined behaviour here, because an alternative (are there
> better ones?) would be to check in every DEVICE operation that the
> DEVICE state is not disconnected (as done currently in
> mirage-block-unix, see [1] for what I have in mind), which imho
> convolutes the control flow immensly.

In general, you only need to check at the lowest level (e.g. writing
to a ring buffer or notifying an event channel), where failing to do
this would lead to memory corruption or similar. Higher levels (e.g.
FAT) shouldn't need to check, as an exception will be raised
automatically when they try to access their underlying device.

In any case, this is a programming error and so should be reported as
an exception.

mirage-net-xen had to check manually because currently you can't
disconnect a shared ring, I think. If shared-memory-ring provided a
reliable way to disconnect, mirage-net-xen wouldn't have to do its own
checks.

> TL;DR: emit disconnect chain in functoria/mirage, remove checks from
> mirage-block-unix [tentative 1], remove more code ;)

You can't remove the check from mirage-block-unix because OCaml's file
descriptors are unsafe (e.g. closing the same FD twice may close a
different FD).

> Hannes
>
> 0:
> https://github.com/mirage/mirage-types/commit/62b18ac0cee40895f45173f196fa6de58a56c076
> 1:
> https://github.com/hannesm/mirage-block-unix/commit/f3de7fa7d3224d3c47a0e808e5e7cf650cfbf050
> *: I thought about a DEVICE providing secrets only needed at startup
> time, and then should be disconnected and destroyed.  But let's be fair,
> in OCaml we cannot (atm) safely erase data from memory (due to GC it
> might have moved, zeroing out is also not possible afaics).  Also, OCaml
> already has this strong and safe type system, thus we shouldn't need
> further defense mechanism in this scenario.
>
> _______________________________________________
> MirageOS-devel mailing list
> MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
> https://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel



-- 
talex5 (GitHub/Twitter)        http://roscidus.com/blog/
GPG: 5DD5 8D70 899C 454A 966D  6A51 7513 3C8F 94F6 E0CC
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA

_______________________________________________
MirageOS-devel mailing list
MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.