Let us look at object-oriented programming (OOP) by trying to understand it in terms of programming language concepts that we already know. We will compare OOP to abstract data types (ADTs). First, however, we look again at imperative versus functional programming.
Updatable variables and assignment commands are the essence of imperative programming. There are two basic reasons for using variables: (1) to remember intermediate results, and (2) to keep track of things changing in the outside world.
The second use of updatable variables is much more fundamental. If you think of a software module as simulating something in the real world, e.g. a database system simulates the population of UCSD students, then it is very natural to update variables as events happen in the real world, e.g. a student enrolls in a class.
In object-oriented programming, an object is a generalization of a variable as used to model something in the outside world that can change.
Consider the first use. The following function uses the variable temp only to remember intermediate results:

    function fact (n: integer): integer;
    var temp: integer;
    begin
      if n = 0 then temp := 1
      else begin
        temp := n-1;
        temp := fact(temp);
        temp := n*temp
      end;
      return temp
    end;

Is using updatable variables to remember intermediate results really necessary? The answer is no, because using nested expressions, the programmer can let the compiler take care of computing intermediate results in the right order:

    function fact (n: integer): integer;
    begin
      return ifthenelse( n=0, 1, n*fact(n-1) )
    end;

Temporary variables are inessential.
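The two versions of fact can be tried out directly. The following is a minimal Python sketch, not part of the original notes; the names fact_with_temp and fact_nested are made up for illustration. Note one assumption it makes visible: the conditional must evaluate only the branch it selects. Python's conditional expression does this, whereas an ordinary function ifthenelse that evaluated all three arguments first would recurse forever.

```python
def fact_with_temp(n: int) -> int:
    # Imperative style: temp is reassigned to hold each intermediate result.
    if n == 0:
        temp = 1
    else:
        temp = n - 1
        temp = fact_with_temp(temp)
        temp = n * temp
    return temp


def fact_nested(n: int) -> int:
    # Expression style: no named intermediate results at all.
    # Only the selected branch of the conditional is evaluated.
    return 1 if n == 0 else n * fact_nested(n - 1)
```

Both functions compute the same values; the second simply leaves the order of the intermediate computations to the language implementation.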
Variables that keep track of the outside world are a different matter. If you think of a software module as simulating something in the real world, e.g. a database system simulates the population of UCSD students, then it is very natural to update variables as events happen in the real world, e.g. a student adds a class.
In object-oriented programming, an object is a generalization of a variable as used to model an entity in the outside world.
We can think of an object as a (complex) variable that has "implementation-independence". "Representation-independence" is a special case of implementation-independence, an idea that can apply to types, variables, constants, functions, procedures, and more:
In modeling a real-world entity, what is important is that the model should behave like the entity being modelled, not the details of how the model is implemented.
Therefore, instead of writing x := plus(x,y) we write, e.g.,

    send x "add y"

or simply

    x.add(y)

This calls the add method of the object named x. Every method has an unwritten (implicit) first argument, which is the object itself.
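Python happens to write the implicit first argument out explicitly in method definitions, which makes it a convenient language for seeing this convention at work. The sketch below is only an illustration; the class Accumulator is hypothetical and does not come from these notes.

```python
class Accumulator:
    """Hypothetical class, used only to illustrate the calling convention."""

    def __init__(self, value: int = 0) -> None:
        self.value = value

    def add(self, y: int) -> None:
        # self is the receiver: the "unwritten" first argument made explicit.
        self.value += y


x = Accumulator(10)
x.add(5)               # method-call syntax: x is passed implicitly as self
Accumulator.add(x, 5)  # the same call with the first argument written out
```

After both calls x.value is 20: the two call forms are the same operation, written differently.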
We say that an object is an instance of a class. The operations of an object typically have side-effects, so they are not pure functions.
The class is the repository for behavior associated with an object, i.e., all objects that are instances of the same class can perform the same actions.
Classes are organized into a tree called the class hierarchy (or inheritance hierarchy). Memory and behavior associated with instances of a class are automatically available to any descendent class (= direct or indirect subclass).
Thus, in the OO paradigm, instead of writing procedures that work on data structures (as in traditional imperative programming), operations (methods) and data are "glued together" and viewed as a unit that provides some computation service.

    class financial-history =
    begin
      state cash, receipts, expenses
      method init() =
        cash := 0;
        receipts := [];
        expenses := [];
      method receive(amount) =
        receipts := append(amount,receipts);
        cash := cash + amount;
      method spend(amount) =
        expenses := append(amount, expenses);
        cash := cash - amount;
      method cash() =
        return cash;
    end;

Individual objects can be declared in the same way as variables:

    var myaccount: financial-history;
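For readers who want to run the example, here is one possible Python rendering of financial-history. It is a sketch under a few assumptions: Python does not let a method and a state variable share the name cash, so the state is stored in underscore-prefixed attributes (a naming convention, not an enforced restriction), and list.append adds at the end of the list, which may differ from the order intended by append(amount, receipts).

```python
class FinancialHistory:
    def __init__(self) -> None:
        # Corresponds to init(): every object starts with its own fresh state.
        self._cash = 0
        self._receipts = []
        self._expenses = []

    def receive(self, amount: int) -> None:
        self._receipts.append(amount)
        self._cash += amount

    def spend(self, amount: int) -> None:
        self._expenses.append(amount)
        self._cash -= amount

    def cash(self) -> int:
        return self._cash


my_account = FinancialHistory()  # plays the role of "var myaccount"
my_account.receive(100)
my_account.spend(30)
balance = my_account.cash()      # 100 - 30
```

Each object created from the class carries its own cash, receipts and expenses, so a second FinancialHistory() would start again from zero.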
A class introduces a collection of operations which are imperative. Methods modify objects which have an internal, updatable state.
In contrast, an ADT introduces a new type. Any operation on a value of the new type must take the value as an input parameter. With a pure ADT, there are no operations that modify values of the type. Instead, some operations can generate "fresh", new values of the type.
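To make the contrast concrete, the following sketch recasts the financial history as a pure ADT in Python. It is an illustration only: the type History and the functions receive and spend are hypothetical names, and a frozen dataclass is used as one convenient way to forbid modification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class History:
    # A value of the new type; frozen=True rejects attribute assignment.
    cash: int = 0
    receipts: tuple = ()
    expenses: tuple = ()


def receive(h: History, amount: int) -> History:
    # The value is an explicit input; the result is a fresh value.
    return History(h.cash + amount, h.receipts + (amount,), h.expenses)


def spend(h: History, amount: int) -> History:
    return History(h.cash - amount, h.receipts, h.expenses + (amount,))


h0 = History()
h1 = receive(h0, 100)
h2 = spend(h1, 30)
```

Here h0 and h1 are unchanged after h2 is built; every operation takes the value as a parameter and none updates it in place.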
Errors are very likely with unlimited direct updating of composite variables. Consider the following example:
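
    const length = 5;
    type address = array [1..length] of string;

    procedure update (var a: address);
    var j: integer := 1;
        s: string;
    begin
      while (j <= length) and not eof(input) do
      begin
        readln(s);
        a[j] := s;
        j := j+1
      end
    end;

Q: what happens if the user gives an address with fewer than 5 lines? (The loop stops early and the remaining entries keep their old contents, so the result mixes lines of the new address with lines of the old one.)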
Classes are typically not pure, because objects can be updated, i.e. modified. However programming with objects is still safer than programming with regular composite types, because updating can only be done through predefined methods.
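As an illustration of this point, the address example can be wrapped in a class whose only way of changing the state is a method that checks its input first. This is a sketch, not the notes' own design: the class Address, its method names, and the policy of padding short addresses with blank lines are all assumptions made for the example.

```python
class Address:
    LENGTH = 5  # mirrors the constant length in the example above

    def __init__(self, lines):
        self._lines = self._checked(lines)

    @classmethod
    def _checked(cls, lines):
        lines = list(lines)
        if len(lines) > cls.LENGTH:
            raise ValueError("too many lines")
        # Pad short addresses with blanks so no stale lines can survive.
        return lines + [""] * (cls.LENGTH - len(lines))

    def update(self, lines):
        # Validate first, then replace the whole state in one step.
        self._lines = self._checked(lines)

    def lines(self):
        return tuple(self._lines)


home = Address(["A", "B", "C", "D", "E"])
home.update(["X", "Y"])  # a short address: the old lines C, D, E are dropped
```

Because update is the only route to the state, an address can never end up as a mixture of old and new lines, and a rejected update leaves the old address intact.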

    class tax-history parent financial-history =
    begin
      state deductions;
      method init() =
        deductions := [];
      method deductible-spend(amount) =
        deductions := append(amount,deductions);
        self.spend(amount);   // = send self spend(amount)
    end;

Every object of type tax-history has three financial-history state variables, plus one new state variable. It can respond to all financial-history messages, and some additional messages.
One method of an object can call another method of the same object if necessary. The name self (or this) refers to the current object, which is the implicit first argument of each method.
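A possible Python version of the subclass is sketched below, with a cut-down parent class included so that the example is self-contained. Two things are assumptions of the sketch rather than part of the notes: the explicit call super().__init__(), which Python needs so that the inherited state variables are initialised, and the accessor deductions(), which is added only so that the new state can be inspected.

```python
class FinancialHistory:
    # Cut-down parent: only what the subclass example needs.
    def __init__(self) -> None:
        self._cash = 0
        self._expenses = []

    def spend(self, amount: int) -> None:
        self._expenses.append(amount)
        self._cash -= amount

    def cash(self) -> int:
        return self._cash


class TaxHistory(FinancialHistory):
    def __init__(self) -> None:
        super().__init__()     # initialise the inherited state first
        self._deductions = []  # the one new state variable

    def deductible_spend(self, amount: int) -> None:
        self._deductions.append(amount)
        self.spend(amount)     # inherited method, called through self

    def deductions(self):
        # Hypothetical accessor, not in the original class.
        return tuple(self._deductions)


t = TaxHistory()
t.spend(10)            # a financial-history message
t.deductible_spend(5)  # an additional tax-history message
```

The object t answers both kinds of message, and deductible_spend reuses the inherited spend rather than duplicating its bookkeeping.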
Calls to a self-method should refer to the method with the given name in the child class. This allows child class methods to override parent methods.
For example, sending the message id to an object of class D with the following class definitions should give "I'm a type D object":

    class C =
    begin
      method whoami = return "I'm a type C object";
      method id = print (self.whoami)
    end;

    class D parent C =
    begin
      method whoami = return "I'm a type D object";
    end;
Under standard static scoping, the name whoami in the body of the method C.id would refer to the method C.whoami. This is not what we want.
Method names should follow dynamic binding: the method corresponding to a name should be found in the runtime environment, not in the compile-time environment.
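Python looks method names up at runtime, so the C and D example can be checked directly. In this sketch id returns the string instead of printing it, so that the result can be tested, and the extra method id_static is a hypothetical addition that shows what the statically scoped reading would produce.

```python
class C:
    def whoami(self) -> str:
        return "I'm a type C object"

    def id(self) -> str:
        # self.whoami is looked up on the runtime class of self, not on C.
        return self.whoami()

    def id_static(self) -> str:
        # Naming C explicitly bypasses dynamic binding (illustration only).
        return C.whoami(self)


class D(C):
    def whoami(self) -> str:
        return "I'm a type D object"


d = D()
dynamic_answer = d.id()         # uses D.whoami
static_answer = d.id_static()   # uses C.whoami
```

The body of id is written once, in C, yet it reaches the overriding whoami of whichever class the receiving object actually belongs to.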
A powerful feature of dynamic binding is that, for polymorphic types, the runtime type of the object (and in some languages also the types of the arguments) is used to select the correct method.
Object identity is connected to an "object-centric" view of programming: every operation is about some primary object. Syntactically, every method has an unwritten (implicit) first argument, which is the object itself.
According to this view, the object itself is really a dispatcher: it waits for a message and then selects one of its methods for execution based on which message was received. A class is a function that can generate new objects. For example:

    type messages = (next, restart);

    function newgenerator (initial: real) =
    begin
      var seed: real := initial;
      return function (m: messages) =
        begin
          case m of
            next: seed := transform(seed);
            restart: seed := initial;
          end;
          return seed;
        end;
    end;

    function myrandom = newgenerator(1.2345);
    function hisrandom = newgenerator(0.1428);
Here newgenerator is a higher-order function that returns another function, which is anonymous. Each returned function is a pseudo-random number generator (PRNG) that can respond to two alternative messages.
The updated PRNG seed is kept in the variable seed, which is local to newgenerator, so a fresh seed is created each time newgenerator is called. However, seed is non-local to the anonymous function, so each copy of seed keeps its value between calls to the corresponding individual PRNG.
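The same construction can be written with a Python closure. This is a sketch under assumptions: the notes do not define transform, so the function below is an arbitrary placeholder, and the enumeration Message stands in for the messages type.

```python
from enum import Enum


class Message(Enum):
    NEXT = "next"
    RESTART = "restart"


def transform(seed: float) -> float:
    # Placeholder update rule (an assumption; the notes do not define it).
    return (seed * 3.9 + 0.1) % 1.0


def new_generator(initial: float):
    seed = initial  # fresh binding for every call to new_generator

    def dispatch(m: Message) -> float:
        nonlocal seed  # seed lives in the enclosing call, not in dispatch
        if m is Message.NEXT:
            seed = transform(seed)
        elif m is Message.RESTART:
            seed = initial
        return seed

    return dispatch


my_random = new_generator(1.2345)
his_random = new_generator(0.1428)
```

Sending Message.NEXT to my_random advances only its own seed; his_random is unaffected, and Message.RESTART returns a generator to its initial value. In this view new_generator plays the role of the class and each returned dispatch function plays the role of an object.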