Before we look closer at ADTs, let us revisit type declarations in Haskell:
-- Enumeration type:
data Color = Red | Green | Blue
-- Union ("sum") type:
data MyAddress = ShortAddr City | LongAddr Street City
type City = String
type Street = String
-- Record/Tuple ("product") type (polymorphic):
data MyTriple a = Triple a a [a]
-- Algebraic ("sum of products") type:
data BinTree = EmptyBT | LeafBT Int | NodeBT BinTree BinTree
-- Polymorphic Algebraic type:
data XTree a = LeafXT a | NodeXT [ XTree a]
Some observations:
myBinTree0 = NodeBT (NodeBT (LeafBT 2) (LeafBT 3)) (LeafBT 4) myXTree0 = NodeXT [ NodeXT [LeafXT 1], LeafXT 2]How can we have the system "print" values for these data types? For example, we can explicitly tell the system to show us the values using the (predefined) function show :: Show a => a -> String:
Main> show myBinTree0 "NodeBT (NodeBT (LeafBT 2) (LeafBT 3)) (LeafBT 4)" :: [Char] Main> show myXTree0 "NodeXT [NodeXT [LeafXT 1],LeafXT 2]" :: [Char]Yet another way is to define your own show function, or to declare:
data XTree a = LeafXT a | NodeXT [ XTree a] deriving Showwhich then will allow us to just enter an XTree expression directly (the "deriving Show" declaration makes XTree a an instance of the type class Show):
Main> myXTree0 NodeXT [NodeXT [LeafXT 1],LeafXT 2] :: XTree Integer
The above data types, e.g., myTriple a or myBinTree a are not ADTs, because we can see (and access!) their internal representation! For example, we may just enter:
Main> Triple 1 2 [3,4] Triple 1 2 [3,4] :: MyTriple Integer (71 reductions, 134 cells)(here we assumed a deriving Show declaration for MyTriple a was given)
Note how we used the data constructor Triple directly to define a concrete value. The system evaluates the expression by reducing it to its canonical (i.e., normal) form, which in this case is the expression itself, and then shows its value.
Data Abstraction, Information Hiding, Encapsulation.
The notion of data abstraction (see also Chapter 10,
[Sebesta,4th edition]) means that we want to abstract from the
concrete representation of a type. Instead, we hide the details
of the concrete data representation, together with the implementation
of the operations and encapsulate them in a module
(e.g., Haskell, Pascal) or class (e.g. Java).
In this way, an abstract data type can only be accessed and manipulated via the operations that are declared for this ADT; these operations act as an interface to the ADT.
More formally: An ADT has two parts: a specification part and an implementation part.
The implementation part of an ADT is private: only the implementer needs to know the details of exactly how values are represented, and how the basic operations on values are accomplished.
Note: the notions of user-defined/system-defined vs. abstract/concrete data types are orthogonal (independent).
Q: What kind of data type is Float?
-- SPECIFICATION of STACK (signatures) create :: Stack a -- create an empty stack push :: a -> Stack a -> Stack a -- push an element on the stack pop :: Stack a -> Stack a -- pop an element off the stack top :: Stack a -> a -- what's on top? empty :: Stack a -> Bool -- is the stack empty?What are the constraints that apply to our stack?
top ( push x s) = x pop ( push x s) = s top (create) ==> error pop (create) ==> errorNote that we have specified a stack without giving its concrete implementation. In this way, if we access an ADT only through its "interface" operations push, pop, ..., the code becomes independent of the chosen implementation. Advantages of this approach include:
-- IMPLEMENTATION: CONCRETE DATA TYPE
data Stack a = EmptyStack | Push a (Stack a)
deriving (Eq, Show) -- for displaying stacks
-- IMPLEMENTATION: "code"
create = EmptyStack
-- note the difference between the ADT operation
-- ... "push" (a Haskell function) and the
-- ... data constructor "Push" :
push x s = Push x s
pop (Push _ s) = s
pop EmptyStack = error "*** ERROR: pop from empty stack!"
top (Push x _) = x
top EmptyStack = error "*** ERROR: top of empty stack!"
empty EmptyStack = True
empty (Push _ _) = False
Q: Can you give another implementation of Stack a? How
about using lists?
Most concrete types are pure. For example, no operation on integers actually modifies an integer. Instead, all the operations like "+" produce fresh outputs.
An updatable ADT is one where some operations actually change values of the ADT. For example, suppose we had an operation called "pop" that took a stack as argument and modified it ("in place", "destructively") by removing the highest-priority item. This operation would be impure and the whole ADT would then be impure also.
A package has a public signature and a private implementation, like an ADT. The difference is that a package can simultaneously provide many types, many specific values, and many operations. This increases flexibility and allows mutually dependent types but loses the analogy with concrete data types (recall that an ADT is really a (albeit abstract) data type while a package can be a collection of ADTs together with other non-ADT "features".)
Packages preserve the crucial ADT idea of information-hiding. Details that a user does not need to know are not visible to him.
Packages are best-known in Ada (but also provided e.g. in ML). Haskell provides a mechanism call module for defining ADTs. One important use for packages is as units of separate compilation. All the types in a package typically provide related functionality and are implemented by the same person or team, so it is reasonable for the classes in a package to have access to each other's implementation, i.e. to all the state variables and methods of each other.