CSE 130 Lecture Notes

CSE130 LECTURE NOTES

January 17, 2001

ADMINISTRATION

Please use the course bulletin board (www.sdsc.edu/~ludaesch/CSE130/board/) for questions inspired by the lectures or the assignment.

ARRAYS AS (FINITE) MAPPINGS

A mapping is a mathematical object. An array is one way of implementing finite mappings.

Mathematically, a mapping (or function) from set A to set B is a set of ordered pairs where

(F1) the first item (the argument) in each pair is from A and the second item is from B
(F2) for each item x in A, there is a unique y (the result) in B such that the pair <x,y> is in the mapping.

Note that an array satisfies exactly these two properties:

Type-checking at compile-time verifies the first property.
If an array is uninitialized, and the programming language has runtime error-checking, then one gets an error message if the second condition is violated.

Traditional programming languages only allow index sets to be finite. This ensures that any value of the mapping type can be stored in memory. Often the restriction is even more extreme: an index type must be a Cartesian product of discrete finite types:

Example. Using Pascal style, we could write:

    type color = (red, green, blue); // Pascal enumeration
    type intensity = 0..255;         // a byte value in this range
    type screen = array [0..1023,0..767] // the finite index type  
		    of array [color] of intensity; // a complex value type

Remember that an array is a composite (or complex) value. For example, in Pascal we could have:

	var screen1, screen2 : screen; 
	...
	do_sth_to(screen2); 
	screen1 := screen2;

The simple assignment copies indeed a complex data structure. If you are "thinking C", this may come as a surprise (but it shouldn't; it is just another simple example of how a language paradigm influences the way you think about these issues!).

INFINITE MAPPINGS

Consider the mapping type string -> integer. A variable of this type might be convenient for storing student scores on a quiz, for example. Traditional programming languages do not allow this type. However some languages that are biased towards ease-of-use (rather than efficiency do), for example the "awk" language or Perl via associative arrays.

ASSOCIATIVE ARRAYS

An associative array is an unordered collection of data elements that are indexed by an equal number of values called keys. If the type of the latter is KeyT and the type of the data elements is DataT, then we can view an associative array as a mapping KeyT -> DataT.

In Perl, associative arrays are called hashes (see Section 5.6 of the textbook: Sebesta, Concepts of Programming Languages).

Question. What data structures can be used to implement the variables of type associative array, e.g., string -> integer implemented?
Answer: maybe a labeled binary tree sorted in "in-order", or by hashing. (Guess where the Perl name for associative arrays comes from ;-)

PL FUNCTIONS ARE VALUES OF A MAPPING TYPE ALSO

In standard PLs, we can't have a variable odd of type array [integer] of boolean so that

odd[-1] = true
odd[0] = false
odd[1] = true

and so on. But we can have a programming language function odd so that

odd(-1) = true
odd(0) = false
odd(1) = true

and so on.

Notice that the function odd could have two different implementations. One implementation might use mod, while the other might subtract or add two until 0 or 1 is reached.

OBSERVATIONS ABOUT FUNCTION VALUES

Many languages have many different syntaxes for creating mappings. Consider the Pascal style declarations:

type boolean = {false, true};
const not1: array [boolean] of boolean := [true, false];
function not2 (x: boolean): boolean;
begin
if x then return false else return true;
end;

Conceptually not1 and not2 both have the same mapping type, i.e. boolean -> boolean.

Just like numbers and records, functions have types and values. However, in conventional (non-functional) languages, functions are typically not "first-class citizens", i.e., we often cannot have variables whose type is "function".

Definition: A first-class value is one that can be assigned to a variable, passed as a parameter, returned as the result of a function, and included in a larger composite value, just like other values.

In typical PLs, being "first-class" is not all-or-nothing. Some values are "more first-class" and some values are "less first-class." For example, in some languages functions cannot return records. In other words, the result type of a function cannot be a Cartesian product type.

HIGHER-ORDER FUNCTIONS

A function or procedure is called "higher-order" if it takes another function as a parameter. For example, here is one way to compute the dot product of two vectors:

function f(j:int): real;
begin
    return a[j]*b[j];
end;
function sum (lo,hi:int; f: int->real): real;
    var s: real := 0;
begin
    for k := lo to hi do s := s + f(k);
    return s;
end;

Recall that a signature is description of a type. Here, the signature of "f" is int -> real and the signature of sum is int x int x (int -> real) -> real. Note that the symbol "x" inside a signature means Cartesian product, i.e. a record type.

ANOTHER EXAMPLE OF A HIGHER-ORDER FUNCTION

Here are two alternative cosine functions:

function cosine (x: real): real;
begin
    return sine(x + pi/2.0);
end;
function cosine (x: real): real;
    const epsilon = 1e-9;
begin
    return integrate(sine,0,x,epsilon);
end;

Each function implements the cosine function in a very different way, but for the same argument, both always yield similar results.

The signature of integrate is (real -> real) x real x real x real -> real. In Haskell we would write something like

 integrate :: ( (Float -> Float),  Float,  Float,  Float )  -> Float

or, even better (see below why):

 integrate :: (Float -> Float) -> Float -> Float -> Float  -> Float

Two functions with different implementations can have the same semantics, i.e. specification. The specification of a function is a mathematical mapping. Its implementation may be something different and more complicated that involves computation.

PURE VERSUS IMPURE FUNCTIONS

In mathematics, a function in A -> B satisfies (F1) and (F2) above. In programming, a function that satisfies this definition is called pure. Programming functions can be impure for any or all of several reasons:

Non-determinism
Side-effects
Hidden arguments.

One major difference between PLs is exactly what kinds of impure functions are allowed.

Note that the implementation of a function may use one or more of the sources of impurity just mentioned, but this impurity may be encapsulated (i.e. hidden from users of the function) and therefore the function remains pure from a mathematical perspective.

FUNCTION SIGNATURES IN HASKELL

The notation for denoting the set of all mappings from one set to another is ->. For example, cosine is a member of Float -> Float. (Recall that the words function and mapping are synonyms.)

By default, type expressions associate to the right, i.e.,

	a -> b -> c    

is read as

	a -> (b -> c)

For example, in Haskell we can write

	(1)	add       :: Integer -> Integer -> Integer
	(2) 	add x y   =  x + y

Here, (1) declares the type of add, (2) defines the value of add in a functional style: the value of "add x y", i.e., when given two arguments, is the sum of x and y. So the first (for the beginner more intuitive) reading of this is: (1) declares the type of add: given two integer arguments, the result of applying add to those two integers yields a result of type integer. However there is another "more powerful" reading of this. Let's make the association to the right explicit:

	(1)	add      :: Integer -> (Integer -> Integer)
	(2) 	add x y  =  x + y

This means that if we provide add only with one argument, e.g., if we write

	add 1

than this itself can be evaluated, because (1) can be read as "add takes and integer and returns a function of type Integer -> Integer.
What is this function in the case of add 1? It is the increment function!

Having said that, what is add 2 3? Function applications associate to the left, i.e., we read this as

	(add 2) 3

The inner parenthesis returns a function (now it is the "increment by 2" function). This function in turn is applied to 3, yielding a result of type integer.

CURRYING: "SPICING UP" TYPE DECLARATIONS OF FUNCTIONS

Above we have declared the type of add to be:

		add      :: Integer -> (Integer -> Integer)

Indeed, this is a curried function definition for addition. We could also have defined addition as follows:

	add'        :: (Integer, Integer) -> Integer
	add' (x,y)  =  x + y

Note that add' is similar to add in that we can say that "add' (x,y)" always yields the same result as "add x y". In this sense, a -> (b -> c) and (a x b) -> c can be regarded as equivalent.
The difference is that add is the "curried version" of add' and thus can be seen as a function with only one argument (in which case we speak of a partial function application. In contrast, add' requires a tuple of two arguments.

It was observed by the German philosopher and mathematician Gottlob Frege in 1893 that it suffices to restrict attention to functions of a single argument (e.g., by viewing (a x b) -> c as isomorphic to a -> (b -> c)).
In the 1920s, M. Schönfinkel worked On the building blocks of mathematical logic [Über die Bausteine der mathematischen Logik, Moses Schönfinkel, Mathematische Annalen, 1924]. This work sparked off the area of combinatory logic [Combinatory Logic, Haskell Curry, 1958]. This is why the above "trick" is called currying.

In the German speaking functional community, sometimes the term "schönfinkeln" is used (to honour M. Schönfinkel).

Bertram Ludaescher

Last modified: Mon Jan 22 23:15:21 PST 2001