Xen project Mailing List

Re: [MirageOS-devel] Error handling in Mirage - request for comments!

To: Daniel BÃnzli <daniel.buenzli@xxxxxxxxxxxx>

From: Thomas Leonard <talex5@xxxxxxxxx>

Date: Fri, 30 Jan 2015 15:53:17 +0000

Cc: "mirageos-devel@xxxxxxxxxxxxxxxxxxxx" <mirageos-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Fri, 30 Jan 2015 15:53:25 +0000

List-id: Developer list for MirageOS <mirageos-devel.lists.xenproject.org>

On 30 January 2015 at 14:36, Daniel BÃnzli <daniel.buenzli@xxxxxxxxxxxx> wrote: > "In theory, explicitly listing all possible errors and forcing callers to > consider them should result in more reliable code. In practice, it results in > callers ignoring all errors." > > This seems rather due to a lack of good, standarized, error handling > combinators in mirage itself. Can you propose a solution? Perhaps add it to the list of options? Every time I try, I find I end up duplicating Lwt's existing handling. > I'm really not sure that moving to more exceptions is a good thing especially > since last time I had a look at it lwt's error handling and cancelation > model/algebra it felt quite broken to me. What problems do you see in Lwt's error handling? I agree that Lwt cancellation seems broken*, but I'm not sure that's relevant here. > I would rather say that the system should not use exceptions at all except > for things that are not meant to be handled where the only action you take is > to basically to abort the unikernel. I'd agree if that was "abort the current transaction (which may cover the whole unikernel)". Realistically, there are always going to be error conditions that result in exceptions that should not terminate it (e.g. running out of memory serving requests should only abort some requests, etc). > In practice things like: > "Assume all code can raise exceptions, and that this must not break > invariants or leak resources" > > end up being a huge pain to code with. I probably shouldn't have listed this under "Proposal". I wasn't intending to suggest any change here, but just to state the fact that OCaml is a language with exceptions. It's interesting to see that two people disagree with this already. If I understand your position: - Every exception raised MUST terminate the unikernel. This includes out-of-memory, division-by-zero, int_of_string on an out-of-range int, etc, in any code path. Aborting the operation (e.g. HTTP request) that caused the problem, logging the exception and continuing is not an option. - It is therefore acceptable for a module to leak resources and/or leave the system in an invalid state if it it receives an exception from any code it calls. While some people might want that, it seems unfair to force everyone into this model. I don't think exception safety is much work in most cases (I always try to do this in my own code, anyway). > While it may make you feel the code is less involved in practice it is much > more harder to reason about and to code against it correctly compared to > using a well principled error monad. The latter will also be future proof to > any concurrency mechanism you may want to use in the future. Isn't Lwt already an error monad? Can you define "well principled" here? > Of course there may be a few cases were you want to be able to raise a > globally defined exception that is not meant to be handled like Out_of_memory > or in the block example you give. But I'm sure that you can count them on the > finger of one hand and predefine them somewhere in mirage. Let's make this concrete. The ones currently in V1 that I'd use with Lwt.return (rather than fail) would be: NETWORK, ETHIF: - `Disconnected TCP: - `Timeout, `Refused FS: - `Not_a_directory, `Is_a_directory, `Directory_not_empty, `No_directory_entry, `File_already_exists, `No_space KV_RO: - Unknown_key I would however replace all the network codes with a generic (`Network_error of exn), where the exn might be e.g. a Refused exception with more information about why it was refused. This makes it easy for callers who don't care to handle them all at once (with Lwt.fail), allows extra network errors to be added by implementations, and allows attaching more details about the causes. Likewise for the FS errors (`FS_user_error of exn). The ones I'd use with Lwt.fail (i.e. turn into exceptions) would be: FLOW: (error is abstract anyway) BLOCK: - `Unknown, `Unimplemented, `Is_read_only, `Disconnected NETWORK, ETHIF, IP: - `Unknown, `Unimplemented UDP, TCP, STACKV4: - `Unknown FS: - `Unknown_error, `Block_device Since 'connect' is going away, those errors wouldn't be included anyway, but I'd probably turn them into exceptions (Lwt.fail) too: ENTROPY: `No_entropy_device CONSOLE: `Invalid_console FS: `Format_not_recognised Do you disagree with these choices? In general, I want to use Lwt.fail for anything the caller won't want to handle specially. I'm undecided about BLOCK's `Is_read_only and `Disconnected. They would make sense for removable media. I just realised my text "Convert all modules to raise exceptions rather than return error codes" sounds like I want to remove all error codes. What I meant was, convert the all error codes we decide to convert, as above. Also, there are some exceptions I think should be error codes under this scheme, such as ring's Shutdown (which always needs to be handled because you need to resubmit to the new ring). Regards, * In particular, Lwt cancellation is simply ignored if the current operation happens to be uncancellable. Also, allowing consumers to cancel threads that other consumers may be waiting on seems wrong. -- Dr Thomas Leonard http://0install.net/ GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1 GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA _______________________________________________ MirageOS-devel mailing list MirageOS-devel@xxxxxxxxxxxxxxxxxxxx http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.