CSE 130 Lecture Notes

CSE130 LECTURE NOTES

January 29, 2001

FUNCTIONAL PROGRAMMING WITH HASKELL (Continued)

When programming with functions, we distinguish between a function's ...

... type (signature) declaration, e.g.,
```
	fac :: Integer -> Integer
```
... definition, e.g.,
```
	fac 0 = 1
	fac n = n * fac (n-1)
```

... application, e.g.,

 
	Hugs> fac 100   [ENTER] => 
	933262154439441526816992388562667004907159682643816214685929638952175\
	999932299156089414639761565182862536979208272237582511852109168640000\
	00000000000000000000

Note that this definition of the factorial function works with "Integer", i.e., infinite-precision integers. If we declare it as "Int", we will get an overflow error when computing large factorials.

TYPE INFERENCE

We have seen examples of type inference which allows to type-check expressions. At compile-time, i.e., when a function definition is loaded into the system, we typically provide the function's type definition (signature), e.g., fac :: Integer -> Integer above. If we don't, then the system can still infer the type:

	Hugs> :info fac
	fac :: Num a => a -> a

This means that the system has inferred that the type of fac is "a -> a", provided "a" is a numeric type. Similarly, we can ask the type of our famous map function:

	:info map
	map :: (a -> b) -> [a] -> [b]

Whether or not we give the function's type declaration, the system can statically, i.e., at compile-time check for type errors:

fac 0 = 'a'                      
fac n = n * fac (n-1)

The first equation implies a type "Num a -> Char" which is inconsistent with the second equation:

ERROR "ln6.hs" (line 18): Instance of Num Char required for definition of fac

To err is human, and we may do things like this:

fac 0 = 1                           
fac n = n * fac [n-1]		-- oops wrong type of parentheses

The static type inference of Haskell tells us at compile-time that something is wrong:

ERROR "ln6.hs" (line 19): Type error in function binding
*** Term           : fac
*** Type           : a -> a
*** Does not match : [a] -> a
*** Because        : unification would give infinite type

"Intelligent" static type inference is one of the features of functional languages that helps avoiding design flaws very early (before they become "nasty bugs" that need debugging at runtime!)

At runtime, when functions are evaluated (or "applied", i.e., reduced to their normal form), the type checking mechanism is also used:

Hugs> fac [a]
ERROR: Undefined variable "a"
Hugs> fac [10]
ERROR: Illegal Haskell 98 class constraint in inferred type
*** Expression : fac [10]
*** Type       : (Num a, Num [a]) => [a]

EQUATIONS AND GUARDS

There are several alternatives to define functions in Haskell:

fac1 n = product [1..n]                 -- a simple definition

Here we simply use the function "product" and the convenient list notation [1..n]. We can find out about the type of "product" and "fac1" as follows:

Hugs> :info product
product :: Num a => [a] -> a

Hugs> :info fac1
fac1 :: (Num a, Enum a) => a -> a

So "product" takes a list of type "a" (provided it is a numeric type!) and returns a value of type "a".
For "fac1" we get that it maps from type "a" to "a", provided that "a" is a numeric type ("Num a") and an enumerated type ("Enum a")! Why??

Other ways to define the fac function are:

fac2 n  =  if n==0			-- a recursive definition
                then 1 
		else n * fac2 (n-1)  

fac3 0 = 1                              -- using two equations
fac3 n = n * fac3 (n-1)


fac4 n | n == 0		= 0		-- using guards
       | n > 0		= n * fac4 (n-1)

Observe that the guards in "fac4" are used on the left-hand side of the function definition.

PATTERN MATCHING

When defining functions via equations, we can use pattern matching to conveniently distinguish different cases:

	map f  []               =  []
	map f (x:xs)            =  f x : map f xs

Pattern matching can either fail, succeed or diverge. A successful match "binds" the formal parameters in the pattern. Divergence occurs when a value needed by the pattern contains an error (written "_|_" and pronounced "bottom" or "undefined").

Example: if we call "map inc [1,2,3]" then the list "[1,2,3]" fails the match with "[]"; but the match with "x:xs" succeeds. This succeeding match will bind the formal parameter "x" to the first element of the list (here: "1") and formal parameter "xs" to the restlist (here: "[2,3]") . Btw: assuming that "inc" is the increment function inc n = n+1, then we eventually get map inc [1,2,3] ==> [2,3,4].
Had we instead called map inc 7, then the pattern "diverges", i.e., it's neither a succeed nor a fail, but a type error (recall that map :: (a->b) -> [a] -> [b], so the type of the second argument must be a list!).

Q: What will happen if we call, e.g., map f "Joyce" ? Does it make sense? What type should the function "f" have?

Another common situation is matching against a value we really care nothing about. For example, the functions head and tail can be written as:

head (x:_)             = x
tail (_:xs)            = xs

This makes very explicit the fact that we don't care what a certain part of the input is. Each wild-card independently matches anything, but in contrast to a formal parameter, each binds nothing; for this reason more than one is allowed in an equation.

RELEVANCE OF ORDER

The matching process itself occurs top-down, left-to-right. Failure of a pattern anywhere in one equation results in failure of the whole equation, and the next equation is then tried. If all equations fail, the value of the function application is _|_, and results in a runtime error.

Consider, e.g., the two equations which define "fac3". Can we change their order, i.e., write:

	fac3 n = n * fac3 (n-1)		-- recursive case first
	fac3 0 = 1                      -- base case

Let's run it in Hugs:

	Hugs> fac3 5
	ERROR: Control stack overflow

What happened?? Since the first equation is tried first, we end up in this branch, even for n=0. Hence fac3 "gets stuck" in the first equation by calling itself recursively until a stack overflow occurs!

The remedy of course is to be aware of the pattern matching semantics, i.e., top-down, left-to-right and order the equation accordingly. Alternatively, we can use guards as shown above, to make disjoint cases, i.e., no two equations "cover" the same case but rather are mutually exclusive. In this way, the order of the (guarded) equations becomes irrelevant!