Xen project Mailing List

Re: [MirageOS-devel] Error handling in Mirage - request for comments!

From: Thomas Leonard <talex5@xxxxxxxxx>

Date: Wed, 4 Feb 2015 14:09:29 +0000

Cc: "mirageos-devel@xxxxxxxxxxxxxxxxxxxx" <mirageos-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Wed, 04 Feb 2015 14:09:33 +0000

List-id: Developer list for MirageOS <mirageos-devel.lists.xenproject.org>

On 4 February 2015 at 13:01, Leo White <lpw25@xxxxxxxxx> wrote:
> Hi all,
>
> As this discussion keeps on running, I thought I would add a couple of
> thoughts:
>
> - The problem with using exceptions for returning errors is that it
> becomes very difficult to distinguish a dynamic error that is
> part of normal use (e.g. this network connection failed) from a
> programming error (e.g. Not_found accidentally leaked from a use of
> List.find, an assert false triggering). This makes it hard to produce
> the correct behaviour in all cases: there are applications for which
> it would be better to kill everything on detecting a programmer error
> rather than risk continuing in an unstable state.

I don't think exception vs error code is a reliable way to divide these
cases up. Having your FS return `Block_error indicates a serious problem
that might require terminating the unikernel, whereas getting a
Division_by_zero exception from your HTTP handler is likely fairly
harmless. Whether an exception/error is serious depends more on the
importance of the thing that raised it.

Consider the case of a filesystem that reads a corrupted disk and throws
an exception (e.g. an assert fails). This is probably the most extreme
case where you'd want to abort. Should it terminate the unikernel? It
depends what the disk is being used for. If it's the main hard-disk then
possibly. If it's some removable media the user has just inserted then
certainly not.

A good principle here is that a broken component should only be able to
harm itself. If a filesystem fails to handle a corrupted disk correctly
then it may further corrupt that disk, but it should not abort the
unikernel (and thus possibly corrupt other disks in the middle of being
written).

In this case, we can imagine a fail-safe FS functor that wraps all the
calls in the FS API so that if any one of them throws an unexpected
exception then it unplugs the underlying block device. No need to kill
everything.

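A rough sketch of what such a fail-safe functor might look like. The
BLOCK and FS signatures below are simplified, hypothetical stand-ins
(not the real Mirage module types), everything is synchronous rather
than Lwt-based, and `Unplugged, Mem_block and Buggy_fs are names made up
for the illustration:

```ocaml
(* Hypothetical, simplified signatures: NOT the real Mirage BLOCK/FS APIs. *)
module type BLOCK = sig
  type t
  val disconnect : t -> unit
end

module type FS = sig
  type t
  type block
  val device : t -> block
  val read : t -> string -> (string, [ `No_such_file ]) result
end

(* Wrap every FS call. An unexpected exception unplugs the underlying
   device; later calls fail with `Unplugged instead of touching it. *)
module Fail_safe (B : BLOCK) (F : FS with type block = B.t) : sig
  type t
  val wrap : F.t -> t
  val read : t -> string -> (string, [ `No_such_file | `Unplugged ]) result
end = struct
  type t = { fs : F.t; mutable unplugged : bool }

  let wrap fs = { fs; unplugged = false }

  let guard t f =
    if t.unplugged then Error `Unplugged
    else
      match f t.fs with
      | Ok v -> Ok v
      | Error `No_such_file -> Error `No_such_file  (* expected error: pass on *)
      | exception _ ->
        (* Unexpected exception: treat the FS as broken and isolate it.
           A real version would probably log the exception and let
           Out_of_memory and friends through. *)
        t.unplugged <- true;
        B.disconnect (F.device t.fs);
        Error `Unplugged

  let read t path = guard t (fun fs -> F.read fs path)
end

(* Toy device and a deliberately buggy FS, for demonstration only. *)
module Mem_block = struct
  type t = { mutable connected : bool }
  let disconnect t = t.connected <- false
end

module Buggy_fs = struct
  type t = Mem_block.t
  type block = Mem_block.t
  let device t = t
  let read _ = function
    | "hello.txt" -> Ok "hello"
    | "corrupt" -> assert false  (* simulates a bug triggered by a bad disk *)
    | _ -> Error `No_such_file
end

module Safe = Fail_safe (Mem_block) (Buggy_fs)
```

With this, reading "corrupt" through Safe returns an error and
disconnects the device rather than letting Assert_failure propagate, and
expected errors such as `No_such_file pass through unchanged. Only the
component that failed is taken out of service.
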
> - The problem with using Lwt.t as your error monad is that it becomes
> difficult to distinguish between synchronous things that may return
> errors, asynchronous things that may return errors, and asynchronous
> things that should not return errors. It also seems tied up with the
> exception mechanism, which leads to the same problem as my previous
> point.
>
> Personally, I would probably suggest that all Mirage modules/module
> types include in their signatures:
>
>   type error
>
>   val pp_error : formatter -> error -> string
>
> For cases where an error can reasonably be matched on and handled
> specially, this should be exposed in the signature:
>
>   type error = private [> `Foo of foo | `Bar of bar]

Aha! I knew you'd know a trick to make this work!

However, different functions return different sets of errors. For
example, BLOCK.read shouldn't return `Is_read_only. Can we handle that?

>   val pp_error : formatter -> error -> string

I've added some error_message (error -> string) functions, but this
might be better. I haven't used formatters much, so don't have an
opinion here.

But I think we still need an additional "exn_of_error" here because
whether something is a "dynamic" (expected) error or a bug changes as
the error is propagated. For example, an XML parser probably regards a
malformed document as a dynamic error (`Malformed of malformed), which
its caller may want to handle. But if the caller is trying to load its
configuration file from a crunch FS, then malformed XML is a programming
bug and should be thrown as an exception.

> Some nice combinators should be provided for using ('a, 'b) Result.t and
> ('a, 'b) Result.t Lwt.t and for lifting an ('a, Foo.error) Result.t into
> an ('a, Bar.error) Result.t.
>
> Exceptions that escape their intended scope should always be treated as
> a programming error.
> The various "finally" functions for resources
> should catch and reraise them, so that they can reach the outermost
> scope which knows how best to deal with a programming error in the
> particular application. The same goes for binding on an Lwt thread which
> raised an exception or failed: it should cause an exception to be raised
> to reach the outermost level.

Yes, I think it does this.

> The aim here is still to take the Erlang-style kill the component that
> failed and try again approach, but to ensure that there are two distinct
> return paths for regular errors and programming errors. The secondary
> aim is to have module signatures which give a clear indication of
> intended use.
>
> Regards,
>
> Leo

-- 
Dr Thomas Leonard        http://0install.net/
GPG: 9242 9807 C985 3C07 44A6  8B9A AE07 8280 59A5 3CC1
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA
