Sunday, 10 February 2008

The power of doing nothing....

There's an ongoing discussion on COCAN on how to make errors explicit in function signatures (the proposed solution is to go exceptionless). I will not delve on why I do not think this is a standard that should be adopted. Instead we will explore a very lightweight solutions that preserves the stack unwinding capability of exceptions whilst making the possible errors explicit in the signatures of the function.

Basic types

When one of our function is applied there we can have either one of two possible outcomes: everything goes right or we are left over with an error (notice the cheap mnemonics). Now if we get rid of exceptions we want our return types to be able all this information. To do so we mimics Haskell's either type:

type ('a,'b) either = Left of 'a | Right of 'b

In case you haden't caught my subtle hints Right is used to hold results and Left to hold errors.

For our short example we shall have a limited number of possible errors:

type possible_exc = [`Too_Small|`Too_Big|`Div_by_zero|`Some_other_error]

that can be thrown in an exception:

exception MyExc of [< possible_exc]

(Allthough I may not be using any unsafe features of OCaml I am actually exploiting a bug in the type-checker here. The absence of this hack would make one of our functions a little less precise).

External interface

Ideally we'd want to choose functions should raise an exception or return a wrapped value. This choice has to be made at function application. Bearing this in mind our external interface is:

module type External = sig
  type ('a,'b) value constraint 'a = [< possible_exc ]
  val run : ('a -> ('b,'c) value) -> 'a -> 'c
  val run_exn : ('a -> ('b,'c) value) -> 'a -> ('b,'c) either
end

Internal interface

The idea behind this implementation is to use a phantom type attached to a value we shall pass along. This phantom type will be unified with every single single value we might raise thus acting as a "collector" for all the possible exceptions. Without further ado let proceed:

module type Internal = sig
  type 'a t constraint 'a = [< possible_exc ]
  type ('a,'b) value constraint 'a = [< possible_exc ]
  val create : unit -> 'a t
  val raise : 'a t -> 'a -> 'b
  val return : 'a t -> 'b -> ('a,'b) value
  val lift : 'a t -> ('a,'b) value -> 'b
end

The type t is our collector. It can only be created via the create function. raise and return are rather self explanatory, lift might come as more of a surprise: it is used to get the returned values from one our function. It ensures the possible errors of the called functions are unified with the "collector" of the callee. The internal interface does not expose functions like External.run because it discards the possible error cases.

Implementation

Our interfaces being set we will now present an implementation. It has been stated at the beginning of this post that we want it as lightweight as possible. Since we will (ab)use the identity functions let's get it out of the way:

let ident = fun i -> i

We are not using the more efficient external "%identity" function because it is, in essence, not type safe. If we consider that the unit value is a value that carries no information (the unit type being inhabited by only one value) and the identity function is a function that does nothing our implementation really doesn't do much:

module Impl = struct
  type 'a t = unit constraint 'a = [< possible_exc ]
  type ('a,'b) value = 'b constraint 'a = [< possible_exc ]
  let create = ident
  let raise () e = raise (MyExc e)
  let return () = ident
  let run = ident
  let lift () = ident
  let run_exn f v = try Right (f v) with  MyExc v -> Left v
end

Excercising it

Let's take our shiny new modules for a test run. The types commented are the inferred types and thus the most general ones.

module I:Internal = Impl

(* int -> ([< possible_exc > `Too_Big `Too_Small ], unit) I.value *)
let check_range i =
  let t = I.create () in
  if i > 5 then
    I.raise t `Too_Big
  else if i < -5 then
    I.raise t `Too_Small
  else
    I.return t ()

(* 'a -> int -> ([< possible_exc > `Div_by_zero ], int) I.value *)
let div x y =
  let t = I.create () in
  if y = 0 then
    I.raise t `Div_by_zero
  else
    I.return t 0

(*
 int ->
 int ->
 ([< possible_exc > `Div_by_zero `Too_Big `Too_Small ], int) I.value
*)
let div_in_range x y =
  let e = I.create () in
  I.lift e (check_range x);
  I.lift e (check_range y);
  I.return e (I.lift e (div x y))

Conclusion

Even though this actually works it does not scale well to multiple modules. It should be considered as a proof of concept rather than serious code.

As there name imply exception should be used to handle exceptional cases. A well designed library should only throw exception when it encounters an error or a bug. Exception should only be used for error recovery (they might be used internally for hacks such as bugtracking but should never trickle out of the library). It seems cumbersome to impose the callee of library functions to handle a long trail of more and more exotic error cases.

If one really desires to have explicit error handling combining our either type with monads seems to be the sane route. This is exactly what the haskell error monad does.

Talking about haskell, an interesting read is 8 ways to report errors in Haskell. It gives a good overview on how messy and uncoherent error reporting can get.