CSE 130 Lecture Notes

CSE130 LECTURE NOTES

March 12th, 2001

MISCELLANEOUS

there will be a revision section (likely Monday or Tuesday evening ~ 7pm...)

OBJECT-ORIENTED PROGRAMMING

(cf. [Sebesta, Ch.11]) ...

Updatable variables and assignment commands are the essence of imperative programming. There are two basic reasons for using variables:

to remember intermediate results, and
to keep track of things changing in the outside world.

The first use can be avoided; the second use of updatable variables is much more fundamental ("Temporary variables are inessential").

WHAT IS AN OBJECT?

...

An object contains data, called its state, private variables or memory, and
it has operations that can update its private variables called methods. Methods are sometimes called member functions.
A method of an object is invoked by sending a message to the object. A message is a request for action together with parameters necessary to complete the task.
Therefore, instead of writing x := plus(x,y) we write, e.g.,
```
	
	 send x "add y"
```
or simply
```
	x.add(y)
```
This calls the add method of the object named x. Every method has an unwritten (implicit) first argument, which is the object itself.
The set of messages that an object understands is called the message protocol or message interface of the object.

CLASSES

In the broadest sense, a class is a set of similar objects. Often a type is associated with a class (or a class is considered to be a type itself).

We say that an object is an instance of a class. The operations of an object typically have side-effects, so they are not pure functions.

The class is the repository for behavior associated with an object, i.e., all objects that are instances of the same class can perform the same actions.

Classes are organized into a tree called the class hierarchy (or inheritance hierarchy). Memory and behavior associated with instances of a class are automatically available to any descendent class (= direct or indirect subclass).

Thus, in the OO paradigm, instead of writing procedures that work on data structures (as in traditional imperative programming), operations (methods) and data are "glued together" a viewed as a unit providing some computation service.

CLASSES COMPARED TO ABSTRACT DATA TYPES

The definition of a class looks similar to the implementation of an ADT, but no separate signature or specification is given:

class financial-history = 
begin 
    state 
	cash, receipts, expenses 
    method init() = 
	cash := 0; receipts := []; expenses := []; 
    method receive(amount) = 
	receipts := append(amount,receipts); 
	cash := cash + amount; 
    method spend(amount) = 
	expenses := append(amount, expenses); 
	cash := cash - amount; 
    method cash() = return cash; 
end;

Individual objects can be declared in the same way as variables:

var myaccount: financial-history;

A class introduces a collection of operations which are imperative. Methods modify objects which have an internal, updatable state.

In contrast, an ADT introduces a new type. Any operation on a value of the new type must take the value as an input parameter. With a pure ADT, there are no operations that modify values of the type. Instead, some operations can generate "fresh", new values of the type.

RESTRICTED IMPERATIVE UPDATING

A type is pure if variables having the type cannot be updated in any way, i.e. no assignment commands are available for this type. In functional programming, all types are pure.

Classes are typically not pure, because objects can be updated, i.e. modified. However programming with objects is still safer than programming with regular composite types, because updating can only be done through predefined methods. ...

INHERITANCE

From a programming language (as opposed to programming methodology) point of view, the major innovation in OOP is inheritance. Here is an example of what is called a derived class definition:

class tax-history parent financial-history = 
begin 
    state deductions; 
    method init() = 
	deductions := []; 
    method deductible-spend(amount) = 
        deductions := append(amount,deductions); 
        self.spend(amount);	// = send self spend(amount)
end;

Every object of type tax-history has three financial-history state variables, plus one new state variable. It can respond to all financial-history messages, and some additional messages.

Thus, instead of duplicating the effort and copying the state variables and methods from financial-history for the class tax-history, we instead inherit them (since they apply for tax-history as well) and thereby reuse the code of the parent class!

Q: What are the advantages of this approach?

A DESIGN PROBLEM

If class P is the parent of class C, we can say C is-a P (a tax-history is-a financial history). In this way, C inherits properties and methods from P. Alternatively, we could have defined that tax-history objects have financial-history "sub-objects". This approach is called object composition.

Here is another example showing the distinction in Java:

class Rectangle {
// "state", "instance variables", "fields":
  private int height, width;
  String desc = "a Rectangle";
  ...
// constructors
Rectangle() {
  height = 10;
  width = 10; 
  ...
}

Rectangle(int height, int width, ...) throws IllegalValueException {
  if (height < 0) throw new IllegalValueException();
  ...
  this.height = height;
  this.width = width;
...
}

//methods 
public int getArea() {
  return height*width
}
public void draw () {
...}

...
}

Now assume you want to define a class ColorRect having also some color. Under the object composition approach, you could write:

class ColorRect {
  public Rectangle rect;
  public Color color;
...
}

Observe that now every ColorRect object has-a Rectangle object as part of it (hence this is also called part-of relationship).

In contrast, using inheritance we would write

class ColorRectangle extends Rectangle {
  Color color;         // not private/protected/public => same package
  String desc = "Colored Rectangle"; // "shadows" the parent's field
  ...
 // constructor; called after the Rectangle constructor
  ColorRectangle() {
    color = new Color();
  }
 // methods
 ... inherited from parent or overridden:
  public void draw() {
  ... // now the colorful stuff
}
}

In the case of inheritance (ColorRectangle) we can reuse the method e.g. getArea() from Rectangle; in the case of ColorRect we have to address the subobject:

... 
  ColorRectangle cr = new ColorRectangle();
  ColorRect cr1 = new ColorRect();
...
  cr.getArea() ; // OK
  cr1.getArea() ; // ERROR
  cr1.rect.getArea() ; // OK

Whether to use object composition (has-a) or inheritance (is-a) is not always a trivial decision...

JAVA INTERFACES vs. ADTs

We have seen that Java interfaces do not provide implementations but only a specification/signature. A class that implements an interface will hide its variables. Thus interfaces and classes implementing them correspond to the "export signature" and implementation of ADTs, respectively.

MULTIPLE INHERITANCE

If a class Z has two parents X and Y both defining a method with the same name M, we speak of a multiple inheritance conflict (in Java a class can extend only one parent; but interfaces can also introduce naming conflicts).

Ways to deal with multiple inheritance include disallowing it in the first place, resolving the naming conflicts, or giving some parent class priority.

DYNAMIC BINDING

With inheritance there is a problem concerning how methods are bound to method-names.

One method of an object can call another of the same object if necessary. The name self (or this) refers to the current object, which is the implicit first argument of each method.

Calls to a self-method should refer to the method with the given name in the child class. This allows child class methods to override parent methods.

For example, sending the message id to an object of class D with the following class definitions should give I'm a type D object:

class C = 
begin 
    method whoami = 
	return "I'm a type C object"; 
    method id = 
	print (self.whoami) 
end; 

class D parent C = 
begin 
    method whoami = 
	return "I'm a type D object"; 
end;

Under standard static scoping, the name whoami in the body of the method C.id would refer to the method C.whoami. This is not what we want.

Method names should follow dynamic binding: the method corresponding to a name should be found in the runtime environment, not in the compile-time environment.

A powerful feature of dynamic binding is that for polymorphic types, the (runtime) object type (and sometimes even together with the types of the arguments) is used to invoke the correct method.

POLYMORPHISM IN OOP

If a message say draw() for the Rectangle example above is sent to an object, it will look up (at runtime!) first its own class to determine whether it "understands" the message. If not, the message is passed to the superclass(es)/parent(s) until an ancestor is found that understands it.

Thus, since subclasses can override methods of parent classes, polymorphism (of methods/behavior) is achieved: the program code can call/send a message and the correct method is determined at runtime, depending on the actual type (class membership) of the object to which the message is sent.

SOME MORE OOP KEYWORDS

Object identity. Even if two objects have the same internal value, they are still different objects. Similarly, two variables are different even if temporarily they have the same value, because they have different locations, and hence in the future they can be updated differently.
Object identity is connected to an "object-centric" view of programming: every operation is about some primary object. Syntactically, every method has an unwritten (implicit) first argument, which is the object itself.
Encapsulation. The state of an object is private to the object: the only way the state can be changed or observed is by calling one of the object's methods. This is what we call "implementation-independence" or "information-hiding".
Overloading. Names of operations/methods can be overloaded, i.e., different methods are denoted by the same name. The actual method is determined based on the number and types of arguments (e.g., in most languages the operation "+" is overloaded and can denote integer or real addition, depending on the argument types).
Note that the "host object" to which a message is sent can be seen as the first argument of a function: Thus
```
	obj.method(arg_1, arg_2, ..., arg_n)
```
corresponds to
```
	method(obj, arg_1, arg_2, ..., arg_n)
```
In this way, the class of the host object can also be considered when selecting which of the overloaded methods is being called!
Overriding. This means that more specific methods from subclasses override methods which are inherited from superclasses.

STATIC (OWN) VARIABLES REVISITED

Another point of view on objects is that an object is a single procedure with some variables that are static and private, i.e. they keep their values between calls to the procedure but they can only be updated inside the procedure. The methods of the object are alternative instructions that the object can execute.

According to this view, the object itself is really a dispatcher: it waits for a message and then selects one of its methods for execution based on which message was received. A class is a function that can generate new objects. For example:

type messages = (next, restart); 
function newgenerator (initial:real) = 
begin 
    var seed: real := initial; 
    return 
        function (m: messages) = 
        begin 
            case m of 
                next: seed := transform(seed); 
                restart: seed := initial; 
            end; 
            return seed; 
        end; 
end; 

function myrandom = newgenerator(1.2345); 
function hisrandom = newgenerator(0.1428);

Here newgenerator is a higher-order function that returns another function, which is anonymous. Each returned function is a pseudo-random number generator (PRNG) that can respond to two alternative messages.

The updated PRNG seed is kept in the variable seed which is local to newgenerator so seed is fresh each time that newgenerator is called. However seed is global to the anonymous function so each version of seed keeps its value between calls to the corresponding individual PRNG.

EXCEPTION HANDLING

Let's think about the design of "escape" commands. In general a condition that makes an escape necessary is called an "exception". When an exception is triggered, and control escapes from inside a command, usually some recovery commands should be executed. Here are some desirable features of exception handling:

The single entry, single exit property should be preserved: that control can only flow into the command at one point, and can only flow out at one point.
It should be possible to raise exceptions and recover from them for any command, not just loops. There can be reasons to abort any command.
When an exception is handled, it should be possible to execute some recovery commands as an alternative to the regular command which gave rise to the failure. For example if an exception is raised while reading an input value, it should be possible to recover by assigning a default value.
It should be possible to provide exception handlers for many different types of exception.
Providing exception handlers should be optional. If a handler is not provided, the exception should cause an escape from the enclosing block.
Programmer-defined exceptions like "negative age" should be allowed in addition to system-defined exceptions like "division by zero".

The Ada language [Ada-FAQ] has a successful exception mechanism that meets all these requirements, as do other languages including Java. The mechanisms are very similar, so we'll just look at the Ada syntax. In general exception handlers occur at the end of blocks:

begin 
    C; 
exception 
    when e_1 => C_1; 
    when e_2 => C_2; 
    ... 
end;

If the exception e_i is raised while executing the command C, then C is abandoned and C_i is executed instead.

If a recovery command is not provided for an exception, then the exception is propagated to the next enclosing block. For example:

begin 
    begin 
        C; 
    exception 
        when e1 => C1; 
    end; 
exception 
    when e1 => C1'; 
    when e2 => C2; 
    ... 
end;

In the code above, C1' will never be executed, but C2 may be.

A REALISTIC EXAMPLE OF EXCEPTION-HANDLING

Here is an example of an Ada procedure where three exceptions may be raised, and how those exceptions are handled. Note that negative_rainfall is a user-defined and user-raised exception, and that end_error is handled non-locally.

procedure main is 
    type 
        month is (jan, feb, mar, apr, may, jun, jul, aug, sep, oct,
    nov, dec); 
        rainfall: array (month) of float; 
    negative_rainfall: exception; 

    procedure input_data is 
    begin 
        for amonth in month 
            loop 
                begin 
                    get(rainfall(amonth)); 
                    if rainfall(amonth) < 0.0 then raise negative_rainfall 
                exception 
                    when data_error => rainfall(amonth) := 0.0; 
                    when negative_rainfall => rainfall(amonth) := 0.0; 
    
                end; 
            end loop; 
    end; 

begin 
    ... 
exception 
    when end_error => put("Insufficient data"); 
    when others => put("Unknown catastrophic error"); 
end

THE "EIFFEL" WAY TO USE EXCEPTIONS

It is bad programming style to use exceptions in place of regular testing for alternative cases, e.g. for end-of-file, or for leap years in software that processes dates.

The language Eiffel enforces a rule of good programming style, that the only ways to leave an exception-handler are

to execute the failed operation again, or
to propagate the exception.

With re-execution (called retry), after raising e1 and executing C1, C would be executed again. With propagation (called reraise), after raising e1 and executing C1, C1' would then be executed.

The use of exceptions in the Ada example above does not meet the Eiffel criteria.

In general, there are two main opinions on programming style in the presence of errors. The first attitude is that code should be written defensively, and as many errors as possible should be handled, even if they cannot be fixed completely.
The second attitude is that errors should be revealed as quickly as possible, so that they can be fixed properly.

For example, according to the first attitude, one should write

    
for (i = 0; i <= 10; i++) { ... }

According to the second attitude one should write

 
for (i = 0; i != 10; i++) { ... }

Suppose that the code inside the loop erroneously modifies i to have a value greater than 10. The first approach will hide this error, while the second will reveal it to the user, by causing an infinite loop.