Xen project Mailing List

Re: [MirageOS-devel] Error handling in Mirage

From: Anil Madhavapeddy <anil@xxxxxxxxxx>

Date: Thu, 10 Jul 2014 11:47:04 +0100

Cc: David Scott <Dave.Scott@xxxxxxxxxx>, "mirageos-devel@xxxxxxxxxxxxxxxxxxxx" <mirageos-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Thu, 10 Jul 2014 10:47:16 +0000

List-id: Developer list for MirageOS <mirageos-devel.lists.xenproject.org>

On 10 Jul 2014, at 10:01, Thomas Leonard <talex5@xxxxxxxxx> wrote: > > I found it awkward too. What's the rationale behind Mirage's current > error handling? The basic problem with using exceptions is if they are constructed in a functor, they have a different representation across applications of that functor. Consider: exception Catch_me of string module A(S:sig val name : string end) = struct exception Catch_me of string let boom () = raise (Catch_me S.name) end module X = A(struct let name = "X" end) module Y = A(struct let name = "Y" end) let _ = try Printexc.print X.boom () with _ -> (); try Printexc.print Y.boom () with _ -> (); try X.boom () with |X.Catch_me x -> print_endline "X.Catch_me"; try X.boom () with |Catch_me x -> print_endline "Catch_me" Running this gives us: $ ocaml boom.ml Uncaught exception: A(S).Catch_me("X") Uncaught exception: A(S).Catch_me("Y") X.Catch_me Exception: A(S).Catch_me "X". The first two statements show the exception printer representing the exception as A(S), which is unhelpful when examining backtraces. This can be fixed with by registering a custom exception printer. The next two statements illustrate how you could go wrong if there are many similarly named exceptions, since there is no exhaustivity checking in the exception type. To write a generic function that catches any Catch_me, I believe you need a `try ... with _ -> ()` > Using polymorphic variants can be useful as it forces the user to > handle each possible error case. But that only works if you keep false > positives low. The reasoning behind explicit polymorphic variants here was to permit catching one class of error across all the device types: module A(S:sig val name : string end) = struct let boom () = `Catch_me S.name end module X = A(struct let name = "X" end) module Y = A(struct let name = "Y" end) let catcher rfn fn a = fn a |> function `Catch_me r -> rfn r let _ = X.boom () |> function `Catch_me r -> print_endline r; Y.boom () |> function `Catch_me r -> print_endline r; catcher print_endline X.boom (); catcher print_endline Y.boom () Here, the catcher function can be used across both functor applications (or, in Mirage terms, device drivers). However, writing more generic catcher functions that want to eliminate just one possibility from the variants and pass the others through is difficult, and leads to gems such as this from the OCaml-TLS Mirage code: type ('a, 'e, 'c) m = ([< `Ok of 'a | `Error of 'e | `Eof ] as 'c) Lwt.t let (>>==) (a : ('a, 'e, _) m) (f : 'a -> ('b, 'e, _) m) : ('b, 'e, _) m = a >>= function | `Ok x -> f x | `Error _ | `Eof as e -> return e > An `Unknown can only be logged or displayed, so it would be best to > handle it once in a catch block at the top level, not for every read. > Making the payload be of type exn rather than string would allow code > to handle individual cases if needed (e.g. a remote block device might > return Network_error, indicating that the client should reconnect and > try again). > `Unimplemented and `Disconnected presumably indicate bugs. For most > programs, it would only make sense to report them at the top level. > > `Is_read_only is irrelevant for read operations. > > I think it would be simpler if exceptions were used and the type > signature for read was just: > > val read: t -> int64 -> page_aligned_buffer list -> unit io > > For write, handling `Is_read_only might make sense, but it still seems > better to me to use an exception. After all, if the device is > read-only then you can't get it into a bad state, nor perform any > recovery on it. > > > In summary, I propose: > > - We should separate errors the caller should usually check for vs > bugs/internal errors. > > - Probably exceptions should be thrown in the second case. OCaml > programs already have to handle exceptions anyway. If not, `Unknown > should at least take an exn payload, not a string (print handlers > should be registered for any exceptions, of course). > > - For cases the caller should check, each function should declare its > own set of errors. DEVICE.read should not return `Is_read_only, > FS.format should not return `File_already_exists, etc. Yes, so by the above guidelines, we might turn this: type error = [ | `Unknown of string (** an undiagnosed error *) | `Unimplemented (** operation not yet implemented in the code *) | `Is_read_only (** you cannot write to a read/only instance *) | `Disconnected (** the device has been previously disconnected *) ] val read: t -> int64 -> page_aligned_buffer list -> [ `Error of error | `Ok of unit ] io val write: t -> int64 -> page_aligned_buffer list -> [ `Error of error | `Ok of unit ] io into: exception Unknown of string with sexp exception Unimplemented of string with sexp exception Disconnected with sexp type read_error = [ `Out_of_bounds ] with sexp type write_error = [ `Read_only_device | read_error ] with sexp val read : t -> int64 -> page_aligned_buffer list -> [ `Error of read_error | `Ok of unit ] io val write: t -> int64 -> page_aligned_buffer list -> [ `Error of write_error | `Ok of unit ] io This is also easier to contract, since a read-only device would simply skip the definition of `write_error` and `write`. We may need to do some more exception registration to tell us which module the expression came from (need to test that). > > Thoughts? Apologies if this has all been discussed to death before. Nope! If in doubt, raise an exception thread like this one :-) > I've did find > https://lists.cam.ac.uk/pipermail/cl-mirage/2012-March/msg00003.html, > which notes that Async doesn't cope with exceptions. Presumably that > could be wrapped away, though. It also suggests that there is some > problem with Lwt and backtraces, but I'm not sure what the issue is > there. These days, OCaml seems to keep track of when exceptions are > reraised and indicates this in the backtrace. Async's model is broadly along the same lines that you propose. It defines a process tree of monitors that catch uncaught exceptions and define local behaviour for those. In the Mirage world, our current flat device hierarchy would turn into a DSL for generating these monitor trees in the autogenerated code. The issue with backtraces is that if you drop into the cooperative thread scheduler between (>>=) binds, the sequence of operations that led to the function call are thrown away and the backtraces look like ("select -> function") instead of ("blocking1 -> select -> blocking2 -> select -> function"). Lwt has a syntax extension that registers the current backtrace with the backtrace system. I'm not sure that this is a big problem though, as we tend not to rely on backtraces for anything important (instead, the value that's thrown is more useful). Thoughts from others welcome -- I like Thomas' proposal and think that it finds the sweet spot between forcing fully generic handling of polymorphic variants and throwing exceptions everywhere. -anil _______________________________________________ MirageOS-devel mailing list MirageOS-devel@xxxxxxxxxxxxxxxxxxxx http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.