Re: [MirageOS-devel] Error handling in Mirage - request for comments!



On 30 January 2015 at 10:24, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:
> On 30 Jan 2015, at 10:06, Thomas Leonard <talex5@xxxxxxxxx> wrote:
>>
>> On 30 January 2015 at 09:30, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:
>>> On 29 Jan 2015, at 15:24, Thomas Leonard <talex5@xxxxxxxxx> wrote:
>>>>
>>>> As part of my continuing mission to break all Mirage APIs, I've
>>>> written up some thoughts on how to improve error handling:
>>>
>>> s/break/fix :-)
>>>
>>>>
>>>> https://github.com/mirage/mirage-www/pull/274
>>>>
>>>> Although written as if it's a final design, it's intended only as a
>>>> starting point for discussion, to find out what we do and don't agree
>>>> on. Please add comments, information about successful approaches
>>>> you've seen, etc.
>>>
>>> This is an excellent writeup.  My top-level view is that moving to
>>> an exception-heavier model is fine, but that we really do need to adopt
>>> some sort of Async-style monitor model to make this feasible, so that
>>> exceptions can be contained within a logical section of the code.
>>
>> Doesn't try_lwt (or similar) do this anyway? What particular problem
>> are you worried about?
>>
>
> It does, if used carefully everywhere -- and is quite slow.  The
> problem is along the lines of:
>
> Thread 1: try
> Thread 1:   <code>
> Thread 1:   Lwt.wakeup thread2
> Thread 2:   <fast switch to thread2>
> Thread 2:   raise Failure
> Thread 1: catch
>
> The fast switch has caused thread 1 to catch the Failure.  With monitors,
> there's always a monitor relevant to the active thread that is listening
> for an exception on that thread.  It's a global variable so that thread
> switching remains a fast operation,

A few thoughts on this:

1) Moving to using more exceptions doesn't mean using "raise". If you
replace "raise" with "fail" then the fast switching problem goes away
(you get a failed thread whether it switches fast or not).

2) I don't see why I'm any more likely to remember to install a
monitor than to remember to use a try_lwt block. There is the issue
that if a Lwt thread fails and no one is interested in it then the
error goes unreported. However, ignoring a result requires an explicit
action in OCaml, so this shouldn't happen accidentally.

3) In the case of Lwt, using try instead of try_lwt means the catch
block might not run, so an error may be reported when it should have
been handled. Not great, but OK - we just abort a larger transaction
than was strictly necessary. If you use try instead of a monitor in
Async, you have the same problem, plus the additional problem that
threads waiting for thread2 will never be notified and cannot report
the error or clean up their resources.
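
To make point 1 concrete, here is a rough sketch of the difference
between "raise" and "fail" when the switch to thread 2 happens inside
thread 1's plain try block. It is a toy model only: the thread type and
the functions fail, thread1, thread2_raise, thread2_fail and describe
are made up for illustration and are not Lwt's implementation or API.
The "wakeup" is modelled by simply running thread 2's body immediately.

```ocaml
(* Toy model for illustration only; this is not Lwt's implementation. *)
type 'a thread =
  | Return of 'a   (* completed successfully *)
  | Fail of exn    (* completed with an error *)

(* Hypothetical stand-in for a "fail" primitive: the error becomes a value. *)
let fail e = Fail e

(* Thread 2's body, written in the two styles under discussion. *)
let thread2_raise () : unit thread = raise (Failure "thread2")
let thread2_fail () : unit thread = fail (Failure "thread2")

(* Thread 1 "wakes" thread 2 inside a plain try block. In this toy model
   the wakeup just runs thread 2's body immediately, standing in for the
   fast switch. Returns thread 1's outcome and thread 2's result, if any. *)
let thread1 thread2_body =
  let t2 = ref None in
  let outcome =
    try
      t2 := Some (thread2_body ());
      "thread 1 finished normally"
    with Failure _ -> "thread 1 caught thread 2's exception"
  in
  (outcome, !t2)

let describe = function
  | Some (Fail _) -> "thread 2 is a failed thread"
  | Some (Return ()) -> "thread 2 returned"
  | None -> "thread 2 has no result"

let () =
  List.iter
    (fun (name, body) ->
      let outcome, t2 = thread1 body in
      Printf.printf "%s: %s; %s\n" name outcome (describe t2))
    [ ("raise", thread2_raise); ("fail", thread2_fail) ]
```

With "raise", the exception lands in thread 1's handler and thread 2
never gets a result that anyone could wait on. With "fail", thread 1 is
undisturbed and the error is stored in thread 2 itself, so it is the
same whether or not the switch was fast.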

>> I don't have any experience with Async monitors beyond playing with
>> them briefly yesterday after reading the RWO chapter on Async. But in
>> this case Lwt seems to be using a clean, functional style, whereas
>> Async is using global variables and hidden implicit state that is
>> likely to lead to bugs such as the one I used in the example.
>>
>> Though I may just be misunderstanding. Apart from that, Async seemed
>> very sensible, and better than Lwt in places (e.g. always scheduling
>> binds for later so they always run in the same context).
>
> The tradeoff is definitely a performance one.  We could get roughly the
> same behaviour with try_lwt applied everywhere, I suspect.


-- 
Dr Thomas Leonard        http://0install.net/
GPG: 9242 9807 C985 3C07 44A6  8B9A AE07 8280 59A5 3CC1
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA
