Review

(Note: covers CS130, Winter 2001; based on material from previous classes, in particular Fall 1999 and review of Winter 2000).

The primary motivation for a PL is to provide a good notation for problem solving.

What does a program do? A program manipulates data values.

 
               +-----------+
       input ->|  PROGRAM  |-> output
      (value)  +-----------+  (value)

So a program has following aspects:

program = data + operations + control

For example, a program can add (operation) the scores (data) of the students of a course to find the average. For this some kind of looping (control) is required.

Observation: Not all operations are valid for all values. For example, operation addition is not valid for the name of a person. So values for which a set of operations is valid are grouped together into a type:

value + operation = type

Most PLs have several builtin types: integers, reals, etc. The builtin operations are: addition, etc.

A given problem may require handling its own set of data values. So most PLs provide a facility to create problem specific data values ("structs", user-defined types, ADTs) and operations ("functions") on them.

Variables and Declarations

The fundamental constructs in imperative languages are variables and assignment. Variables have types and values associated with them. A variable can be thought of as a placeholder or a "box" in which values are stored. The type of values stored determines the type of the variable. Variables are introduced to compilers through declarations. Typical declarations are shown below:

     var age: integer; (* Pascal *)
     int age; // C, Java

A declaration makes the compiler take note of the name (age) and its type (integer). The compiler also creates a reference for the variable. The question now is whether storage is allocated for the variable or not. Some languages allocate storage at compile time whereas others allocate storage at runtime.

An issue related to declarations is that of initialization. In some languages, declared variables are automatically and appropriately initialized. Some languages (Java, for example) complain if a variable is used before it is initialized. Most of the languages provide some facility for initialization at the time of declaration.

	int x = 10; // C

A declaration is called a binding occurrence; all other occurrences called applied or bound occurrences:

float x, y;  // binding occurrence, C, Java
x, y: real;  // binding occurrence, Pascal
procedure foo(x: integer; var y: real); // binding occurrence Pascal

x = 0.7; // bound occurrence of x
y = sin(x); // bound occurrences of x and y

Expressions and Statements

Computation proceeds by manipulating values. Expressions specify computation of new values. Expressions are composed of operands (constants and variables), operators, and possibly function calls. Evaluation proceeds by substituting values for operands and by performing the operations.

The values associated with the variables in a program is called the environment of the program. Evaluation of an expression should only produce a value and not change the environment. This requirement is called referential transparency. The following C expression is not referentially transparent:

a[i++] + c // not referentially transparent

Haskell, being a pure functional language, is referentially transparent: the evaluation of functions and expressions is side-effect free; we can replace any expression with its value, e.g., in f(exp_1, ..., exp_n) and still will get the same value. Referential transparency is violated by (so-called "destructive") assignment statements to global variables within function bodies, or expressions with side-effects like i++ above.

Note that expressions are not expected to change the state of computation; rather statements or commands change the state of computation. An example of a statement is the assignment statement:

	a = b + c; // statement in C

Many PLs provide a notation to group statements, called compound statements. The following illustrates compound statements in Pascal and C.

  
	begin y := x; x := 0 end (* Pascal *) 
	{ y = x; x = 0; } // C

Block Structuring

Compound statements represent a collection of related computations. These computations might need some temporary variables. So it makes sense to allow declarations inside compound statements.

 { float temp;
   temp = x;
   x = y;
   y = temp; }

The term block refers to a construct that can contain declarations, including procedure declarations. Not all languages allow procedure declarations inside blocks (e.g., C)

Control Structures

In a typical program, values are referenced by variables and are manipulated by operators and functions. Various control structures are possible: if-then-else, while loops, for loops, recursion, and exceptions.

In real world programs based on traditional languages, recursion is not used that often. However it is a powerful abstraction technique and the way to "iterate/loop" in pure functional lanugages (and logic programming, i.e., Prolog).

Exceptions

make error checking less cluttered:

 
      try {
          for (int i=0; i <  n; i++) a[i] += b[i];
       } catch (IndexOutOfBoundsExecption e) {
              // handle exception
           throw new IncompatibleArraysException(); // pass the buck
       }

We can think of exceptions as a mechanism that provides a separate channel for error checking.

       normal code     error handling code
          |                 |
          | --- throw ----->|
          |                 |
          |                 |

Both parts are interleaved in the program text, but are distinguished by the keywords "try" and "catch".

Logic languages (Prolog) lack explicit control structures and support a declarative programming style (as opposed to procedural). Instead of "commanding" how something is to be computed (like in imperative programming), ideally, one merely specifies (i.e., states) logic facts and rules and lets an inference mechanism derive the truth of a statement:

human(socrates).         % fact
mortal(X) <- human(X).   % rule

?- mortal(X)             % query
X=socrates               % positive answer with variable binding

yes

?- mortal(tom)

no                       % negative answer: negation as failure (to prove)

Actual Prolog programs are sometimes similar to the functional style as we have seen it with Haskell. Main difference include that standard Prolog does not have a type system like Haskell, there are no functions; instead everything is modeled using predicates/relations. Still there are some similarities:

append([], Xs, Xs).
append([X|Xs], Ys, [X|XsYs]) :-
	append(Xs,Ys, XsYs).


append [] xs    = xs
append x:xs ys  = x:(append xs ys)

Note how the function result is used as an extra argument in Prolog! The Prolog system uses depth first search to find the solution based on the facts and rules provided by the programmer. (This is only partially true. The ordering of facts and rules influences search. Logic languages also have nonlogical constructs like cut, fail, etc.)

ABSTRACTION

As program size increases, there is a need for abstraction. As program development becomes a group activity, abstraction is indispensable. Different languages support abstraction to different extents.

Note that functions, procedures, and methods supported in one way or another in practically all languages, are abstractions of expressions. Other abstraction supported in most languages is the notion of a block: a group of statements which are treated as a single unit and in which variables can be defined.

Abstract data types (ADTs) are the next step in abstraction. An ADT is a collection of type declarations and associated operations (functions). ADTs can make certain aspects of their definitions "private", i.e., visible only to the implementer and not to the user of the ADT. Typically all data is hidden and access to data is achieved through functions. Formally, an ADT consists of a specification (signature + constraints), and one or more implementations (i.e., types + code).

With increasing software development costs, there is a need for software reuse. Object-oriented programming has two key features which aid software reuse: inheritance and polymorphism. Inheritance enables addition of new functionality while reusing of basic functionality. Polymorphism refers to identifying the object type at runtime and using the appropriate method. Polymorphism requires runtime support. Consider the following example.

class Shape {   // old code
  void draw() { /* ... */ }
}

class GraphicsEngine {   // old code
   void render(Shape s) {
     s.draw();
   }
}

class Circle extends Shape { // new code
   void draw() { /* ... */ } // new functionality
}

Circle c;
g.render(c); // old code can be reused

CONCURRENCY

There is an increasing need for programs to handle many tasks at a time. Threads implement one form of concurrency. When threads cooperate to in problem solving, they should share data. This sharing can be done either through shared variables or through message passing.

If threads communicate through shared variable, mutual exclusion of accesses is required to ensure consistency of its value.

Another form of communication between threads is message passing. Message passing can be used even if the threads are running on different machines. The overhead associated with message passing is the copying of values from the source to the destination. RMI of Java can be thought of as a form of message passing.

Scoping of Names and Binding

static scoping: free variable refers to the closest enclosing declaration local variables can be renamed without affecting the result.
dynamic scoping: free variable refers to the most recent binding renaming of local variables might change the result.

Variable Lifetimes

global
persistent, static
local (automatic)
dynamic

int a; // global
void foo() {
  static int count; // persistent
  float f, *a; // local
  a = new float; // dynamic

  f = a; // free variable
}

Runtime System

stack allocation
static and dynamic links for nested procedure declarations
static link is needed implement static scope rule

Activation Record When a subprogram/procedure/function/method is called, an activation record is created on the runtime stack:

|      /\        |
|      ||        |
|----------------|
| Local Variables|
|----------------|
| Parameters     |
|----------------|
| Dynamic Link   | 
|                | 
|----------------|
| Static Link    | 
|----------------|
| Return Address | 
|----------------|

return address: so we know where to resume program execution
static link: points to static parent (= enclosing procedure) => access of nonlocal variables for static scoping
dynamic link: points to dynamic parent (= caller) => access of nonlocal variables; crucial for dynamic scoping; but also to remove current activation record instance upon completion

Parameter Passing

call by value: good for functions
call by reference: good for procedures, for passing huge objects like arrays
call by name (call by macro): delayed evaluation of parameters, useful for "if-then-else" type of functions; a bit similar to lazy evaluation in Haskell
call by value-result: making "in", "out", "in out" parameters explicit and using "copy in/copy out" semantics
unification: In Prolog, arguments to predicates do not have the input-output distinction. Consider the rule "dessert(X) :- sweet(X)". The predicate "dessert" can be "called" with both a constant like "icecream" or a variable, like "Z". In the first case, the new goal to be proved is "sweet(icecream)". In the second case, the new goal to be proved is "sweet(Z)". When a constant like "icecream" is passed, it is passed by reference. Modification of values is not possible.

====================================================================

PARADIGMS

Using the above ideas, programming can be organized in different ways. Consider the example where on models the number of persons in a room. If a new person enters a room, the systems should update its state to account for this fact.

IMPERATIVE: Computation proceeds by manipulating values in fixed locations ("boxes") and storing the new values back in the original locations.
In imperative programming, one defines a variable called "numPersons". If a new person enters the room, this variable is incremented, as in "numPersons++".

FUNCTIONAL: Computation proceeds by creating new values and locations to store them. So the value held by a location is never changed.
Ideally, there are no variables. There is an expression for the number of persons already present in the room. We create a new expression which adds 1 to this and returns the incremented value. A pure functional program might look like (here: ML style):

         fun addPerson(e) = (plus 1 e);

where "e" is a recursive data type and "plus" is the tag. A value in this representation looks like

 
        (plus 1 (plus 1 (plus 1 (plus 1 0))))

which reflects the fact that initially there was no one and four people entered, one at a time. (This expression can be evaluated to 4 at any time.) Note that old values are not destroyed; only new values are created.

Pure functional style is as elegant but can be inefficient. So modern languages retain only the spirit of functional programming. An important feature of functional languages is that functions are also values. So it is possible to return functions as results, etc. So functions are said to be first class.

LOGIC: Logic programming is like functional programming in the sense that values are only created, never destroyed. For example, there is a predicate or relation called "addPerson(X, Y)" where "X" is the number of persons before and "Y" is the number of persons after an entry. The definition of this relation looks like

      addPerson(X, Y) :- Y = X+1;

The fact that this is a relation and not a function can be seen from the following behavior.

If we use "addPerson(3, Y)", "Y" gets a value 4.
If we use "addPerson(3, 4)", the evaluation is "true".
If we use "addPerson(3, 5)", the evaluation is "false".
If we use "addPerson(X, 5)", "X" gets the value 4.

So the same relation is used by the system differently in different contexts.

OBJECT-ORIENTED: In OOP, the values are autonomous. Computation proceeds by sending messages to values asking them to change their state.

Here there is an object called "numPersons" and if a new person enters the room, a message is sent to it, like "numPersons.addPerson()". Then the object changes its state in response to the message.

Again this is pure OOP: there is no central control. But most of the practical OOPLs have the notion of a central control flow. There is also the notion of inheritance. The following lists some of the key ideas in these paradigms.

IMPERATIVE (C, Pascal)
      changing the values of variables
FUNCTIONAL (Haskell, ML)
      expression evaluations producing new values
      recursion
      higher order functions
      lazy evaluation (not supported by ML)
OOP (Java)
      data abstraction
      autonomous objects, constructors, destructors
      message passing (not implemented in Java for normal methods)
      inheritance
      polymorphism
LOGIC (Prolog)
      declarative program, implicit control (DFS and backtracking)
      relations: no input-output distinction
      unification (pattern matching)
      variable bindings to express constraints/relations between predicates
           dessert(X) :- sweet(X), calorie(X, Y), Y < 2.
          /* X is a dessert is X is sweet and if its calorie
             content is less than 2 */
SCRIPTING (Perl)
      no type declarations
      runtime type checking
      high-level constructs like pattern matching, associative arrays

Java is a language designed from the scratch to support OOP. The core language is simple. The main disadvantage of Java now is its slow execution speed.

New languages evolve in response to new applications. Applications which might inspire development of new paradigms in the near future are: networking, security, and multimedia.

Further issues to remember:

Types 
.. primitive, composite, type constructors vs data constructors in Haskell

strong typing 
... = (a) compile-time type checking, or (b) type errors can be detected
e.g. as in Haskell and Java, 

polymorphism

name and structural equivalence/compatibility

static and dynamic type-checking

overloading

coercion

inheritance and overriding






    


Last modified: Mon Mar 19 18:34:51 PST 2001