CSE 130 Lecture Notes

CSE130 LECTURE NOTES

February 21th, 2001

MISCELLANEOUS

Class schedule update:
- Wed, Feb 28: 2:30-3:20pm additional (short) lecture (swapped with Mar 5 discussion section)
- Mon, Mar 5: 4:40-6pm discussion section (swapped with Feb. 28 lecture)
- Wed, Mar 7: 2:30-3:20pm additional (short) lecture
- Mon, Mar 12: ...
- Wed, Mar 14: ...
- Fri, Mar 23, 3-6pm: FINAL, SOLIS 107
Future assignments/group projects (comparative project)

MIDTERM Results and Comments

mean = 51 (82), standard deviation = 15
one could see who has actually programmed the Haskell assignments (and followed the class) and who hasn't...
(btw: the Final will be cumulative!)
Problem 1 (Composite Data Types):
- enumeration type:
```
data MyColor    = Red | Green | Blue 
                deriving (Enum, Show)
```
  Now you can do:
```
    Main> fromEnum Red
    0 :: Int
    Main> fromEnum Blue
    2 :: Int
    Main> [fromEnum Red .. fromEnum Blue]
    [0,1,2] :: [Int]
    Main> (toEnum :: Int->MyColor) 1
    Green :: MyColor
```
  And this is how you can enumerate the values:
```
mycols0     = map (toEnum::Int -> MyColor) [fromEnum Red .. fromEnum Blue]
```
  This will evaluate to:
```
    Main> mycols0
    [Red,Green,Blue] :: [MyColor]
```
- union ("sum") type (sometimes: "tagged union", "variant records"):
```
data Person       = ShortC Name | LongC Firstname Lastname Age
                  deriving Show

type Name         = String
type Firstname    = String
type Lastname     = String
type Age          = Int


myAddrDB          :: [Person]
myAddrDB           = [ShortC "Peter", LongC "Tom" "Cat" 11]
```
- polymorphic types: NOT: a tree can be this or that, BUT: a set of types; the concrete type (say Int, [Char], etc.) can be substituted ("instantiated") for the type variable (convention: use a, b, ... as type variables)
  Example: given a polymorphic type Stack a, the first element pushed onto any given stack will determine what actual type will be used (for a). Often operations are polymorphic themselves:
```
push :: a -> Stack a -> Stack a
```
  (compare this with the Java ADT below)
- associative vs "regular" arrays: conventional arrays have (finite subranges of) enumeration types (often: integer, e.g., 0..k) as index/key type, e.g. myarray :: [0..99] -> String; associative arrays can have arbitrary index/key types, e.g., myassocarray :: String -> String
- arrays can be seen as finite functions (finite, because the domain of the function is finite!); finite functions can be implemented as arrays
Problem 2:
7::Integer, Integer -> Integer, "decrement", add2 is not curried (hence expects 2 arguments!)

ABSTRACT DATATYPES (Revisited)

An ADT has a public specification and a private (hidden) implementation.
This allows modular programming (in Java: classes and interfaces, in Haskell: modules)
The specification contains the name, the signature (i.e., interface), and the axioms (can be formal mathematical statements/equations, plain English, pseudo code, ...) of the ADT
The implementation specifies the concrete data structures and the "program code" (function definitions, procedures, methods, ...) that implements the ADT.
In Haskell, the data constructors are part of the concrete type, i.e., the private implementation part of the ADT -- they are not exported (if we use the module system)! Instead we have to use the (interface) operations of the ADT.

STACK ADT revisited in JAVA

Consider the following (variation) of the Stack specification (using Haskell syntax):

create  :: Stack a			-- create an empty stack 
push    :: a -> Stack a -> Stack a	-- push an element on the stack
pop     :: Stack a -> (a, Stack a)	-- pop an element off the stack
empty   :: Stack a -> Bool		-- is the stack empty?

Note the modifed pop which now returns the popped off (formerly topmost) element. Hence the axioms for pop change:

...
pop (push x s) = (x, s)
pop (create) ==>  error
...

Implementation of Stack ADT in Java

The interface to the Stack ADT is defined as follows (file = "MyStack.java"):

public interface MyStack
{
     public void push (Object j);
     public Object pop();
     public boolean isEmpty();
}

NOTE: What are the signatures of push, pop, isEmpty? What happened to the 2nd argument of push? (cf. the Haskell signature)

After creating an instance of MyStack, a "push" message can be sent to the newly created instance.
In this sense, the 2nd argument (the stack) of push is still there! It just has become the object to which you send the push message with the to-be-pushed element as an argument (see below).

So how do you use the Java stack ADT? (Mind you: for using the ADT we don't have to know how the ADT is implemented -- knowing the interface is enough...)

A simple stack testing class (file="StackTest.java"):

public class StackTest
{
     public static void main(String[] args)
	  {
	       int i;

	       ArrayStack myStack = new ArrayStack(args.length);
	       
	       
	       // put args in stack
	       //
	       for (i=0; i<args.length; i++)
	       {
		    myStack.push(args[i]);
	       }
	       
	       // print out by popping out of stack
	       //
	       for (i=0; i<args.length; i++)
	       {
		    System.out.println(myStack.pop());
	       }
	  }
}

So let's look at an implementation using arrays:

public class ArrayStack implements MyStack
{
     private Object[] stackArray;
     private int maxSize, top;
     public ArrayStack( int s )
	  {
	       maxSize = s;
	       stackArray = new Object[ maxSize ];
	       top = -1;
	  }
     public void push( Object j )
	  { stackArray[ ++top ] = j; }
     public Object pop()
	  { return stackArray[ top-- ]; }
     public boolean isEmpty()
	  { return ( top == -1 ); }
}

Here is another list implementation (so the signatures are not the same) based on a slightly different interface called MyOtherStack:

[Exercise: (a) give the interface for MyOtherStack; (b) adjust this implementation so that it works with the previous interface MyStack]

public class ListStack implements MyOtherStack
  {
  private Node top; 
  private Int nItems; 

  public create () { top = null; nItems = 0; }
  public void push( Object j )
	{ 
	Node newNode = new Node( j, top ); 
	top = newNode; nItems++; 
	}
  public Node pop()
	{ 
	Object tmp = top.getKey(); 
	top = top.getNext(); nItems--;
	return tmp; 
	}
  public boolean empty() { return ( top == null ); }
}

public class Node
  {
  private Object key; 
  private Node next; 

  public Node( Object k, Node n )
	{ key = k; next = n; }
  public Object getKey()
	  { return key; }
  public Node getNext()
	  { return next; }
  public void setKey( Object k )
	  { key = k; }
  public void setNext( Node n )
	  { next = n; }
  }

BACK TO IMPERATIVE PLs

A variable is an identifier together with the value it is bound to. Variables are created, and maybe initialized by, declarations.

In most programming languages, variables are updatable. This means that the actual value of a variable is a memory address. Assignment commands then change the value stored at this address.

Definition: In a (pure) functional language variables are not updatable. An imperative language does have updatable variables.

Since they do not have updatable variables, (pure) functional languages do not have commands with side effects. Haskell is a pure functional language.

In an imperative language like Pascal, a const declaration is the same thing as a value declaration (mytree0 = ...) in a functional language like Haskell.

BINDINGS AND ENVIRONMENTS

Definition: For a functional language, an environment is a mapping from names to values.

In other words, an environment is a set of bindings where each binding is an ordered pair (name, value). The simplest way to implement an environment is to use a list of pairs. This is adequate theoretically but for practical use a more efficient data structure must be used.

Environments are ubiquitous in PLs. A PL implementation must keep track of environments both at compile-time and at run-time.

Exercise: how would you implement environments in Haskell? How in Java or C?

NESTED ENVIRONMENTS AND SCOPES

Let's start with three definitions, and then see how the definitions work in an example.

Definitions:

A binding occurrence of an identifier is an appearance of the identifier where its binding gets established.
An applied occurrence of an identifier is an appearance of the identifier where its binding gets used.
The scope of (a binding occurrence of) an identifier is the region of the program where applied occurrences refer back to this binding occurrence

=>> the scope of an identifier (often: a variable) is the range in which that identifier is visible.

Here's the example (in some Pascal-style syntax):

begin 
    const pi; real pi := 3.14159; real x := 0.0; 
    procedure foobar(real z) is ...; 

    function tan(real x):real; 
    begin 
        real arg := x mod (2.0*pi); 
        return sin(arg)/cos(arg) 
    end; 

    ... foobar(x); ... 
end;

(1) "pi" is bound to a constant real value. "x" is bound twice, once to a real-valued variable, and once to a real-valued parameter.

(2) There are three environments involved here: the global environment which has bindings for "pi", "x", "foobar", and "tan", and two "extended environments" which additionally include bindings for (function/procedure) parameters (more precisely: parameter identifiers!).

(3) There are two related binding occurrences of "pi" and two unrelated binding occurrences of "x".

(4) There are two unrelated applied occurrences of "x".

(5) The scope of the binding occurrences of "pi" is the whole program. The scope of the first binding occurrence of "x" is the main program body. The scope of the second binding occurrence of "x" is the body of "tan".

NESTED BLOCK-STRUCTURED SCOPING

All the important modern languages share the same basic scoping discipline.

Remember the definition: The "scope" of (a binding occurrence of) an identifier is the region of the program where applied occurrences refer back to this binding occurrence.

Under nested block-structured scoping, the essential rules are that:

the scope of a binding is the entire block in which the binding happens,
except a binding in an inner block hides a binding of the same identifier in an enclosing block.

Nested block-structured scoping says that free variables are understood to refer to a syntactically-enclosing binding. Thus it is usually called static scoping.

RECURSIVE DEFINITIONS

The scope of a binding extends to the end of the enclosing block. But usually it starts where the declaration appears, not at the start of the enclosing block.

Recursion among definitions can cause difficulties. Consider two types to keep track of the white and black squares on a chessboard:

type white = record 
                n1, n2, n3, n4: ^black; 
             end; 
type black = record 
                n1, n2, n3, n4: ^white; 
             end;

Most languages have ad hoc syntax for coping with mutual recursion. For example in Pascal, procedures can have "forward" bodies, and in the type "pointer to X" the type X does not have to be already defined.

ALPHA-CONVERSION

An applied occurrence of an identifier in a block is called a "free variable" or "global variable" if there is no matching binding occurrence in the block.

Note the phrase "free variable" should really be "free identifier": a free variable may in fact be a constant or any other entity denoted by an identifier.

For example this function "scale" has one free variable, namely "s", and one non-free (bound) variable, namely the local variable "x":

const s = 10.0; 
function scale (x: real): real; 
begin 
    return x*s; 
end;

Alpha-conversion is the rule that the meaning of a block is unchanged if the name of a bound variable is changed everywhere inside the block. Clearly alpha-conversion is a valid rule under static scoping.

DYNAMIC SCOPING

Suppose the function scale is used inside another procedure:

procedure update (v: vector); 
begin 
    var s := norm(v); 
    for j := 1 to length(v) do scale(v[i]) 
end;

Static scoping says that the "s" local to "update" is irrelevant to the function "scale", so the vector v will be magnified tenfold, not normalized.

"Dynamic scoping" is a different scoping discipline. It says that an applied occurrence refers to the binding occurrence in the most recently entered scope.

Under dynamic scoping, the function "update" would normalize a vector v.

Dynamic scoping violates alpha-conversion: if the name "s" in the "update" procedure body was changed to "t", under dynamic scoping a different effect is obtained.