[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [MirageOS-devel] Error handling in Mirage

On 10 Jul 2014, at 10:01, Thomas Leonard <talex5@xxxxxxxxx> wrote:
> I found it awkward too. What's the rationale behind Mirage's current
> error handling?

The basic problem with using exceptions is if they are constructed in
a functor, they have a different representation across applications of
that functor.  Consider:

exception Catch_me of string

module A(S:sig val name : string end) = struct
  exception Catch_me of string
  let boom () = raise (Catch_me S.name)

module X = A(struct let name = "X" end)
module Y = A(struct let name = "Y" end)

let _ =
  try Printexc.print X.boom () with _ -> ();
  try Printexc.print Y.boom () with _ -> ();
  try X.boom () with |X.Catch_me x -> print_endline "X.Catch_me";
  try X.boom () with |Catch_me x -> print_endline "Catch_me"

Running this gives us:

$ ocaml boom.ml 
Uncaught exception: A(S).Catch_me("X")
Uncaught exception: A(S).Catch_me("Y")
Exception: A(S).Catch_me "X".

The first two statements show the exception printer representing the
exception as A(S), which is unhelpful when examining backtraces. This
can be fixed with by registering a custom exception printer.

The next two statements illustrate how you could go wrong if there
are many similarly named exceptions, since there is no exhaustivity
checking in the exception type.  To write a generic function that
catches any Catch_me, I believe you need a `try ... with _ -> ()`

> Using polymorphic variants can be useful as it forces the user to
> handle each possible error case. But that only works if you keep false
> positives low.

The reasoning behind explicit polymorphic variants here was to permit
catching one class of error across all the device types:

module A(S:sig val name : string end) = struct
  let boom () = `Catch_me S.name

module X = A(struct let name = "X" end)
module Y = A(struct let name = "Y" end)

let catcher rfn fn a = fn a |> function `Catch_me r -> rfn r

let _ =
  X.boom () |> function `Catch_me r -> print_endline r;
  Y.boom () |> function `Catch_me r -> print_endline r;
  catcher print_endline X.boom ();
  catcher print_endline Y.boom ()

Here, the catcher function can be used across both functor
applications (or, in Mirage terms, device drivers).

However, writing more generic catcher functions that want
to eliminate just one possibility from the variants and
pass the others through is difficult, and leads to gems
such as this from the OCaml-TLS Mirage code:

type ('a, 'e, 'c) m =
  ([< `Ok of 'a | `Error of 'e | `Eof ] as 'c) Lwt.t

let (>>==) (a : ('a, 'e, _) m) (f : 'a -> ('b, 'e, _) m) : ('b, 'e, _) m =
  a >>= function
  | `Ok x                -> f x
  | `Error _ | `Eof as e -> return e

> An `Unknown can only be logged or displayed, so it would be best to
> handle it once in a catch block at the top level, not for every read.
> Making the payload be of type exn rather than string would allow code
> to handle individual cases if needed (e.g. a remote block device might
> return Network_error, indicating that the client should reconnect and
> try again).

> `Unimplemented and `Disconnected presumably indicate bugs. For most
> programs, it would only make sense to report them at the top level.
> `Is_read_only is irrelevant for read operations.
> I think it would be simpler if exceptions were used and the type
> signature for read was just:
>  val read: t -> int64 -> page_aligned_buffer list -> unit io
> For write, handling `Is_read_only might make sense, but it still seems
> better to me to use an exception. After all, if the device is
> read-only then you can't get it into a bad state, nor perform any
> recovery on it.

> In summary, I propose:
> - We should separate errors the caller should usually check for vs
> bugs/internal errors.
> - Probably exceptions should be thrown in the second case. OCaml
> programs already have to handle exceptions anyway. If not, `Unknown
> should at least take an exn payload, not a string (print handlers
> should be registered for any exceptions, of course).
> - For cases the caller should check, each function should declare its
> own set of errors. DEVICE.read should not return `Is_read_only,
> FS.format should not return `File_already_exists, etc.

Yes, so by the above guidelines, we might turn this:

type error = [
 | `Unknown of string (** an undiagnosed error *)
 | `Unimplemented     (** operation not yet implemented in the code *)
 | `Is_read_only      (** you cannot write to a read/only instance *)
 | `Disconnected      (** the device has been previously disconnected *)

val read: t -> int64 -> page_aligned_buffer list -> [ `Error of error | `Ok of 
unit ] io
val write: t -> int64 -> page_aligned_buffer list -> [ `Error of error | `Ok of 
unit ] io


exception Unknown of string with sexp
exception Unimplemented of string with sexp
exception Disconnected with sexp

type read_error  = [ `Out_of_bounds ] with sexp
type write_error = [ `Read_only_device | read_error ] with sexp

val read : t -> int64 -> page_aligned_buffer list -> [ `Error of read_error | 
`Ok of unit ] io
val write: t -> int64 -> page_aligned_buffer list -> [ `Error of write_error | 
`Ok of unit ] io

This is also easier to contract, since a read-only device would simply skip
the definition of `write_error` and `write`.  We may need to do some
more exception registration to tell us which module the expression came
from (need to test that).

> Thoughts? Apologies if this has all been discussed to death before.

Nope! If in doubt, raise an exception thread like this one :-)

> I've did find 
> https://lists.cam.ac.uk/pipermail/cl-mirage/2012-March/msg00003.html,
> which notes that Async doesn't cope with exceptions. Presumably that
> could be wrapped away, though. It also suggests that there is some
> problem with Lwt and backtraces, but I'm not sure what the issue is
> there. These days, OCaml seems to keep track of when exceptions are
> reraised and indicates this in the backtrace.

Async's model is broadly along the same lines that you propose.  It defines
a process tree of monitors that catch uncaught exceptions and define local
behaviour for those.  In the Mirage world, our current flat device hierarchy
would turn into a DSL for generating these monitor trees in the autogenerated

The issue with backtraces is that if you drop into the cooperative thread
scheduler between (>>=) binds, the sequence of operations that led to the
function call are thrown away and the backtraces look like ("select -> 
instead of ("blocking1 -> select -> blocking2 -> select -> function").

Lwt has a syntax extension that registers the current backtrace with 
the backtrace system.  I'm not sure that this is a big problem though,
as we tend not to rely on backtraces for anything important (instead, the
value that's thrown is more useful).

Thoughts from others welcome -- I like Thomas' proposal and think that it
finds the sweet spot between forcing fully generic handling of polymorphic
variants and throwing exceptions everywhere.

MirageOS-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.