
Re: [MirageOS-devel] Error handling in Mirage - request for comments!




On Friday, 30 January 2015 at 16:53, Thomas Leonard wrote:

> What problems do you see in Lwt's error handling?
>  

[...]  

> Isn't Lwt already an error monad? Can you define "well principled" here?


I meant the way they deal with exceptions (and cancellation). I think they 
should have lifted error handling to value land and told users not to use 
exceptions (catch them between yields if you need to use them or use code that 
uses them) rather than try to cope with them. Besides, the fact that they use 
exceptions to perform cancellation itself leads to further absurdities (which 
seems to indicate that cancellation was an afterthought, but that's the kind of 
concept you need to think about from day 0 in a system to get it right), as I 
wrote elsewhere:

"Lwt has both cancelable and non-cancelable threads and uses an exception for 
thread cancellation. Sometimes this may lead to surprising results e.g. 
Lwt.pick [t; t'] may return a cancelled thread if t terminated and t' was 
cancelled."

Lwt's combinator algebra is broken: a cancelled thread should be a neutral 
element for Lwt.pick.  

My point here is that I think Mirage should define its own error monad and 
solve error handling in value land (w.r.t. the concurrency monads), 
independently from the higher-level concurrency primitives used. It should 
provide appropriate combinators, including resource-holding combinators that 
ensure a resource is only held in a given scope and that interact with that 
error monad, so that we never see this kind of horrible code:  

https://github.com/mirage/mirage-www/pull/274/files#diff-abb8c0c60c75065f86ff29942a483ec4R88

Such combinators would automatically guide the programmer into doing the right 
thing.
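
To make the idea concrete, here is a minimal sketch of what such combinators 
could look like. It is not an existing Mirage or Lwt API: the names `error`, 
`>>=`, `with_resource`, `acquire` and `release` are all hypothetical, the 
sketch ignores concurrency entirely, and it assumes an OCaml version whose 
standard library provides the `result` type.

```ocaml
(* Hypothetical sketch; none of these names come from Mirage or Lwt. *)

type error = Timeout | Refused

(* Monadic bind on the standard [result] type: stop at the first error. *)
let ( >>= ) r f = match r with Ok v -> f v | Error e -> Error e

(* [with_resource ~acquire ~release f] holds the resource only while [f]
   runs. [release] is called whether [f] returns [Ok _] or [Error _].
   An exception escaping [f] is treated as fatal and is not handled here. *)
let with_resource ~acquire ~release f =
  acquire () >>= fun res ->
  let outcome = f res in
  release res;
  outcome

(* A fake connection source that counts live connections. *)
let live = ref 0
let acquire () = incr live; Ok "conn"
let release (_ : string) = decr live

(* The body fails: the error is returned as a value, the resource is freed. *)
let failing : (int, error) result =
  with_resource ~acquire ~release (fun _ -> Error Timeout)

(* The body succeeds: same release path, no handler to remember. *)
let working : (int, error) result =
  with_resource ~acquire ~release (fun c -> Ok (String.length c))
```

In both cases the caller cannot reach the resource outside the scope of the 
function passed to `with_resource`, and the error shows up in the type rather 
than in a handler that may or may not have been written.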

> I'd agree if that was "abort the current transaction (which may cover
> the whole unikernel)". Realistically, there are always going to be
> error conditions that result in exceptions that should not terminate
> it (e.g. running out of memory serving requests should only abort some
> requests, etc).


Then these things should not be exceptions but be threaded through an 
appropriate error monad in combination with resource holding combinators to 
correctly relinquish held resources.  


> If I understand your position:
>  
> - Every exception raised MUST terminate the unikernel. This includes
> out-of-memory, division-by-zero, int_of_string on an out-of-range int,
> etc, in any code path. Aborting the operation (e.g. HTTP request) that
> caused the problem, logging the exception and continuing is not an
> option.

Yes. That is, for anything non-recoverable. int_of_string should in fact never 
have raised in the first place; it should have returned an option type.
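
As an illustration, a total variant can be written as a thin wrapper. The 
name `int_of_string_safe` is made up for this sketch; recent OCaml standard 
libraries (4.05 and later) ship `int_of_string_opt` with the same behaviour.

```ocaml
(* Hypothetical wrapper: turn the [Failure] raised by [int_of_string]
   into [None] so the caller must handle the bad-input case. *)
let int_of_string_safe s =
  try Some (int_of_string s) with Failure _ -> None

(* The caller is forced by the type to decide what a bad input means. *)
let port_of_string s =
  match int_of_string_safe s with
  | Some p when p > 0 && p < 65536 -> Ok p
  | Some _ -> Error "port out of range"
  | None -> Error "not an integer"
```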
  
> - It is therefore acceptable for a module to leak resources and/or
> leave the system in an invalid state if it receives an exception
> from any code it calls.

No. If you need to recover then don't use exceptions. Use an error monad and 
resource holding combinators.

> I don't think exception safety is much work in most
> cases (I always try to do this in my own code, anyway).


If by exception safety you mean handling exceptions correctly, then no. 
There's always the exception you forget to handle, you have to put the handlers 
at the right place, and there's a lack of locality in program understanding 
(because an exception may flow beyond a handler that doesn't handle it). That's 
too much thinking and too many subtleties. Besides, there's no exhaustiveness 
check for exceptions. That's the reason why I prefer to have a monad that 
forces you into doing the right thing by type direction.  
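
A small sketch of the difference, with a made-up `net_error` type and 
`describe` function: when errors are ordinary values, the compiler checks the 
match for exhaustiveness, which it cannot do for a `try ... with` handler.

```ocaml
(* Hypothetical error type, for illustration only. *)
type net_error = Refused | Timeout | Unreachable

(* Deleting any [Error _] case below makes the compiler report a
   non-exhaustive pattern match (warning 8). A forgotten exception
   handler produces no such diagnostic. *)
let describe = function
  | Ok n -> Printf.sprintf "read %d bytes" n
  | Error Refused -> "connection refused"
  | Error Timeout -> "timed out"
  | Error Unreachable -> "host unreachable"
```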

> I would however replace all the network codes with a generic
> (`Network_error of exn), where the exn might be e.g. a Refused
> exception with more information about why it was refused. This makes
> it easy for callers who don't care to handle them all at once (with
> Lwt.fail), allows extra network errors to be added by implementations,
> and allows attaching more details about the causes.

I'm really not fond of this. You can use open polymorphic variants for this, or 
a closed one with a few known common cases and a custom case carrying a 
printable universal type. But then it seems you are designing with the idea of 
using Lwt's failed state as an error mechanism, which I disagree with.  
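
For instance, a rough sketch of the polymorphic variant approach could look 
like this. All type and function names here are invented for the example, and 
a plain string stands in for the richer details an implementation might want 
to attach.

```ocaml
(* Hypothetical common cases every network implementation would share. *)
type common = [ `Refused | `Timeout ]

let string_of_common : common -> string = function
  | `Refused -> "refused"
  | `Timeout -> "timeout"

(* An implementation extends the common cases with its own error,
   without going through [exn]. *)
type tls_error = [ common | `Bad_certificate of string ]

(* Callers who don't care can delegate the common cases in one branch;
   the match is still checked for exhaustiveness. *)
let string_of_tls_error : tls_error -> string = function
  | #common as e -> string_of_common e
  | `Bad_certificate cn -> "bad certificate: " ^ cn
```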

Best,

Daniel


