A value is a piece of data that is involved in computing. A type is a collection of values on which the same operations make sense.
For example:
Example of different operations on integers and reals:
A built-in type (or operation) is one that is supplied by the language. The opposite of built-in is programmer-defined. Typically, integers, reals, and characters are built-in, and maybe booleans and strings. For types that are not built-in, a PL often has a standard way to make them programmer-defined.
Different languages often use different names for the same primitive type. This is just a "syntactic" difference.
Sometimes there are minor semantic differences, e.g. maybe -7 mod 3 = 2 but -7 rem 3 = -1.
An implementational issue is a semantic question that is left unspecified by the definition of a PL. Implementation issues are choices that different compilers or interpreters can make differently. The most troublesome PL differences are often implementational. For example, are floating point numbers 32 bit or 64 bit?
Implementational differences can be very troublesome because they are not well-documented (like syntactic differences) and they do not have work-arounds (like documented semantic differences).
Ideally, nothing in the specification of a language should be implementation-dependent.
For example in Pascal you can write type direction = {n, e, s, w}
Note that n, e, s, and w are not characters. They are identifiers (i.e. names) which introduce brand-new values that previously did not exist in the language.
Remember, a type is a set of values on which the same operations make sense. Which operations are provided for values of type direction? Pascal provides the built-in functions succ, pred, and ord. For example:
succ(s) = w
pred(s) = e
ord(s) = 2
Using enumeration types makes programs clearer, and helps to catch errors at compile-time. For example writing s*w by mistake gives a compiler error message.
In earlier versions of C, enumeration types are not available. Instead, it's good style to write
#define n 0
#define e 1
#define s 2
#define w 3
However this doesn't allow catching errors at compile-time. Conclusion: it's good for a language to be designed so that the compiler can enforce type correctness rules.
Enumeration types don't substitute for the programmer's good sense.
He or she must still choose to only use the values of the type in sensible
ways, for example to enumerate directions in a sensible order such as clockwise.
With an "abstract data type" (we'll study these later), the programmer
defining the type can specify what the type-specific sensible operations
are, and can disallow all other operations.
There are big differences between programming languages in how a user can define his own types of compound values. Many differences are only syntactic (Pascal records are called structures in C) but still, we need some concepts that let us understand the commonalities.type point = record x, y: real end;
Our aim is to classify the wide variety of compound types. We will do
this by looking at the mathematical ways of constructing sets of compound
values. In ordinary mathematics we know about intersections, unions,
and Cartesian products. For programming we need some slightly more
complex concepts.
Mathematically, the Cartesian product of sets A and B is the set of all possible ordered pairs <x,y> where the first component x of a pair is from A and the second component y is from B. Formally, the Cartesian product is written A x B and defined to be the set { <x,y> : x in A, y in B }.
If A has n elements, i.e. its cardinality is n, and the cardinality of B is n, then the cardinality of A x B is mn. Formally, |A x B| = |A| |B|.
Records in Pascal or structures in C are Cartesian product types. In records and structures components are given names. The benefit of these names is mostly documentation, since the components could be referred to by their numerical position: first, second, third, etc.
In ML product types are defined without giving names to components, in a more mathematical way. For example:
type person = string * string * int * real;
In Haskell the same product type is defined as follows:
type person = (String, String, Integer, Real)
NOTE. Be careful not to confuse this Haskell notation that corresponds to a Cartesian product with a similar notation in Pascal for an enumerated type!! For example, in Pascal, the type declaration
type color = (red,green,blue);means that a variable of type color has one of the three values. The enumerated type is declared in Haskell as follows:
data Color = Red | Green | BlueNote that a tuple is the same as a row in a relational database, where components are often called "fields" or "attributes".
What is the set of values of the type "field"? Given a value of type direction, the array can return a value of type real. The array is a mapping from type "direction" to type "real".type direction = (n, e, s, w);
field = array [direction] of real;
Mathematically, a mapping from set A to set B is a set of ordered pairs where
(1) the first item in each pair is from A and the second item
is from B
(2) for each item x in A, there is a unique y
in B such that the pair <x,y> is in the mapping.
The mathematical notation for an ordered pair is <x,y> using "angle brackets" or (x,y) using parentheses. The mathematical notation for the set of all mappings from one set to another is ->. For example, cosine is a member of R -> R where R is the set of real numbers.
The cardinality of A -> B is |B| to the power |A|. Intuitively, to specify a single mapping we must make |B| choices, |A| times.
Example. How many mappings are there from three colors to an intensity level say 0...255?
type colormapping = {red, green, blue} -> {0,1,...,255}In other words, how many values can a variable of type colormapping assume? Red can be mapped to one of 256 values, and so can (independently) green and blue. Hence there are 256 * 256 * 256 mappings, each representing a value that a variable of type colormapping can have.
Mappings are also called functions. The set on the left of
the arrow (the "from" set) is called the domain and the set on
the right is called the range. Note that the domain of a
mapping need not be a primitive type. If we think of an array as a
mapping, then the domain type is called the index type and is
often a tuple.
Be sure to understand the difference
between a single mapping, such as cosine, and the set of
all mappings of the same type, such as R -> R.