CSE130 LECTURE NOTES

January 22, 2001

FUNCTIONS AND RELATIONS (Revisited)

Definition. Given a set A, PowSet(A), the powerset of A (sometimes also denoted 2A) is defined as the set of all subsets of A, i.e., PowSet(A) = { S | S is a subset of A}.

Example:

 for A = {a,b,c}, we have 
	PowSet(A) = { {},
		      {a}, {b}, {c},
	 	      {a,b}, {a,c}, {b,c},
		      {a,b,c} }
Q: How many subsets does a set A have, i.e., what is the cardinality of the set PowSet(A)?
A: An easy induction shows that if the cardinality of A is |A|, then the cardinality of Powset(A) is |PowSet(A)| = 2|A|. For example, for the A above, we get 23 = 8, i.e., A has 8 subsets.

Definition. Given two sets A and B, the Cartesian product of A and B, denoted "A x B" is defined as the set

 A x B = { (a,b) | a in A, b in B } . 
Thus A x B is a binary relation between A and B. It is obtained by "multiplying" each element of A with each element of B. Note that |A x B| = |A|*|B|.

[[The Cartesian product can be extended to n given relations A1 x ... x An in the obvious way; in this case we speak of an n-ary relation.]]

Q: So A x B is a (binary) relation over A and B. What other relations are there over A and B? How many are there?
A: Each subset of A x B constitutes one such relation, and these are all relations over A and B. Hence, the set of all possible relations over A and B is precisely the powerset of A x B.

Example: Let A = {a,b,c} and B = {false, true}. The Cartesian product A x B =

	{ (a,false), (a,true),
	  (b,false), (b,true),
	  (c,false), (c,true) }
has six elements. We have: |A x B| = |A|*|B| = 3*2 = 6. But "A x B" is just one (the "maximal") relation over A and B. Another one would be, for example
	R = { (b,false), (a,true), (c,false) }
Note that
-- the order of elements in a set is irrelevant: an element is "in" or "out" -- nothing else. In contrast, in a list, elements can occur multiple times and are ordered. We will return to lists later.
-- since the relation R assigns to each (and every) element of A, exactly one element of B, we can say that R is a function from A to B. In this sense, functions are special relations.

Coming back to our question -- how many such relations R there over A and B -- we find that the PowSet(A x B) has 32 elements, i.e., there are 32 binary relations over A and B because

|PowSet(A X B)| = 2|A x B| =  2|A|x|B| = 23*2 = 32

Above we have seen that a function f from A to B (in Haskell f :: A -> B) is just a special relation (where there is exactly one function value f(a) from B for each a in A). But relations can also be seen as special functions, by viewing them as predicates which map each tuple (a,b) of the Cartesian product to either true (so "(a,b)" is in the relation R) or false (if "(a,b)" is not in R).

FUNCTIONS SHOULD BE FIRST-CLASS VALUES

Definition: A first-class value is one that can has all the standard operations that should be available for all values of all types.  A value is first-class if it can be: In typical PLs, being "first-class" is not all-or-nothing.  Some values are more first-class and some values are less first-class.

The property that the fewest values usually possess is the ability to be returned as the result of a function.  For example, in Pascal functions cannot return records.  In other words, the result type of a function cannot be a Cartesian product type.

In standard PLs such as C, functions are not first-class. In higher-level, more modern PLs such as Haskell, functions are first-class.

Definition. A routine is called higher-order if it takes another function as a parameter, and/or returns a function as its result.

TAGGED UNIONS

We'll return to higher-order functions and Haskell.  For now, let's finish our investigation of operators for creating composite types.

Remember, a type is a set of values. In mathematics, one of the most basic ways of making a new set is to take the union of two given sets.

In programming, different types have different legal operations. When we make a union, we need to still be able to identify which operations are legal on an element of the union.

Each member of the union type must be "tagged" to indicate what type it came from.  In Pascal:

type
   country = (canada, usa);
   zipcode =
       record
           case where: country of
              canada: (czip: string);
              usa : (azip: number)
       end;
Now we can use a switch to do different operations on Canadian and American zipcodes.  Note that the switch command is semantically different from the case notation for declaring a union type.  It's unfortunate that the same keyword case is used for both concepts.
 
 

THE DIFFERENCE BETWEEN A TAGGED UNION AND A CARTESIAN PRODUCT

Unions and products are very different ideas. [[However in practice, most unions are unions of products.]]

Many programming languages like Pascal have one notation that combines unions and products. C has separate notations using the keywords union and struct.

Tagged unions are also called disjoint unions. The idea is that no value of type zipcode can have a czip component and also an azip component. It's one or the other.

In Pascal and C unions are not required to be tagged. This is a notorious source of runtime errors!
 
 

RECURSIVE TYPES

A recursive type is a higher-level, mathematical version of a data structure using pointers.

These are a category of types that standard languages don't permit, but which are very useful. With 1980s compiler algorithms, they can be implemented efficiently.

Using pointers, the programmer must allocate and recapture memory, which is low-level.

For local variables (which are kept on the stack) and for intermediate storage used in arithmetic subexpressions, memory is managed automatically.

Lists are recursive

Think about the type of lists. There are two fundamental differences between lists and tuples: Informally, a list of integers consists of an integer, possibly followed by another list of integers.

Rather more precisely, we can say that a list is either the empty list, or a pair consisting of a "first" element which is an integer and a "second" element which is itself a list of integers.

Formally let's write (in Haskell notation)

data Integerlist = Nil | (Integer, Integerlist)

Here "|" means (tagged) union and "," means tuple, i.e., "(Integer, Integerlist)" is the Haskell notation for the Cartesian product "Integer x Integerlist".

Note that this equation is circular: the type Integerlist is defined in terms of itself. In such a case we speak of a recursive definition.

However the definition is not meaningless because it has a base case (the empty list of length zero, which doesn't contain any integer, and which we just write Nil), and the recursive part is applied in some "well-founded" ("terminating") way:

Here are some examples of values of the type Integerlist:

Nil
(13, Nil)
(13, (14, Nil))
(13, (4, (8, (16, Nil))))
The last three examples are all special cases of the fact that (13,x) is an Integerlist for any x such that x is an Integerlist. Since list are a very important data structure, Haskell has a shorthand notation for it:
[]
[13]
[13, 14]
[13, 4, 8, 16]
Nevertheless, you should think of lists as having the form
	(x, xs)
where x is the first element (the head) of the list, and xs is the restlist.

What are the types of x and xs? Above x was of type Integer and xs was of (the recursive) type Integerlist.

POLYMORPHIC TYPES

In Haskell we can do better than the above type Integerlist. Instead we can define an polymorphic type that works for arbitrary base types.

For example, we can use a (recursive) type constructor called MyList that uses a data constructor called Cons to obtain a polymorphic list type:

	data MyList = Nil | Cons a (MyList a)
Here "a" is a type variable that can be instantiated to any type, e.g., Integer, String, or Person. For example the equivalent of the Haskell list
	myList :: [Integer]
	myList =  [13,14,7] 
in our "self-made" recursive type would be
	myList :: MyList Integer
	myList = Cons 13 (Cons 14 (Cons 7 Nil))

RECURSIVE TYPES IN TRADITIONAL LANGUAGES

In C and Pascal and other similar languages, the programmer must implement a recursive type using pointers.  For example lists of integers can be defined in Pascal as follows:
type
    listchoice = (nil, cons);

    intlistptr = ^ intlistrecord;

    intlistrecord =
        record
            case flag: listchoice of
                nil: ();
                cons: (first: integer; rest: intlistptr)
        end;

This is basically how lists are implemented in Haskell, Lisp, and Prolog also. The difference is that the compiler generates code like the above.  This is one justification for calling those languages "higher-level".

The tradeoffs between using explicit pointers (as in C) and recursive types (as in Haskell) are

Java hides pointers from the programmer, but not completely, like Haskell does.

INTRODUCTION TO FUNCTIONAL PROGRAMMING WITH HASKELL

[[cf. the A Gentle Introduction to Haskell on which these notes are based]]

Values and Types in Haskell

All Haskell values (in particular functions!) are "first-class": they may be passed as arguments to functions, returned as results, placed in data structures, etc. [[Haskell types, on the other hand, are different "beasts" and are not first-class.]] Types in a sense describe values, and the association of a value with its type is called a typing. Using the examples of values and types above, we write typings as follows:
                          5  :: Integer 
                         'a' :: Char
                         inc :: Integer -> Integer
                     [1,2,3] :: [Integer]
                     ('b',4) :: (Char,Integer)
The "::" can be read "has type". The above are type declarations.
(Note how the function 'inc' just hides between the "normal" values ;-)

While "conventional" data values like stand for themselves, functions need a function definition in order for the system to be able to evaluate the function (i.e., apply the function to some given argument(s)).

Functions in Haskell are normally defined by a series of equations. For example, the function inc can be defined by the single equation:

	inc n          = n+1

Polymorphic Types

The list [1,2,3] in Haskell is actually shorthand for the list 1:(2:(3:[])), where "[]" is the empty list and ":" is the infix operator (our data constructor "Cons" from above) that adds its first argument to the front of its second argument (a list).
Since ":" is right associative, we can also write this list as 1:2:3:[]. As an example of a user-defined function that operates on lists, consider the problem of counting the number of elements in a list:
	length                  :: [a] -> Integer
	length []               =  0
	length (x:xs)           =  1 + length xs

This definition is almost self-explanatory. We can read the equations as saying: "The length of the empty list is 0, and the length of a list whose first element is x and remainder is xs is 1 plus the length of xs." (Note the naming convention used here; xs is the plural of x, and should be read that way.)

HIGHER ORDER FUNCTIONS (Revisited)

Consider a function "map" which takes as input another function f :: (a->b) and a list of type a, and which returns for each element x of the input list f(x). For example
    map (add 1) [2,4,5]    =>   [3,5,6]  
This is an example of a higher order function (i.e., a function which has other functions as input or output). The function definition of map is very simple:
	map f  []               =  []
	map f (x:xs)            =  f x : map f xs
Q: What is the signature for map?

[[Preview ("stay with us; after we come back from the break"): examples for programming with Haskell and list comprehensions...

quicksort  []           =  []
quicksort (x:xs)        =  quicksort [y | y <- xs, y<x ] 
                        ++ [x]
                        ++ quicksort [y | y <- xs, y>=x]
]]
Bertram Ludaescher
Last modified: Mon Jan 22 23:15:36 PST 2001