CSE130 LECTURE NOTES

February 7th, 2001

MISCELLANEOUS



SYNTAX IN GENERAL

Syntax as a field of study in (programming) linguistics is about the appearance and structure of expressions. (We'll use the word expression to include any self-contained fragment of a program, for example a subroutine definition.)

A grammar specifies (1) conceptual structure and (2) lexical detail. We are interested in understanding and representing (1), which is called abstract syntax because it leaves out the irrelevant details (2).

There can be many paths to meaning (semantics), and there can be many opinions on how expressions are structured, even if there is agreement on which expressions are legal.

There is less disagreement for programming languages than for natural languages, because PLs are designed, and they have unique meanings.

The structure of an expression is (almost always) a tree (e.g., an operator tree). The key structural relationship is what is a subexpression (i.e. child) of what.

Usually the number of children is fixed and their order is significant, but there are exceptions (e.g., some binary operators like "+" can be considered polyadic operators that work on lists of arguments).
Q: Can you think of other examples?

More Examples/Questions:

Parsing follows a grammar to check and discard the lexical detail of an expression, while discovering its abstract syntactic structure.

A Haskell DATATYPE IS AN ABSTRACT SYNTAX NOTATION

There are many ways to notate abstract syntax. The most important are tree diagrams, and fully parenthesized expressions.

Consider, for example, typical syntax for simple arithmetic expressions like (1*2)+(3+4). We can represent the set of all such expressions as a recursive type.

In Haskell we just use data constructors like Plus, Times, etc. for constructing the arithmetic expression, which then become instances of a type Expr a of arithmetic expressions over some type a (recall that Expr is a type constructor):

data Expr a = 
  	Constant a  
	| Plus (Expr a) (Expr a) 
        | Minus (Expr a) (Expr a) 
	| Times (Expr a) (Expr a) 
	deriving Show                   -- so we can show them easily
Remember that you can declare a pointer-based type, say in C, with exactly the same conceptual organization! With the data structure defined above, it is easy to write a recursive function that evaluates expressions:
eval                :: Num a => Expr a -> a   -- a should be a "Num" type
                                              -- ... for this to work  
eval (Constant x)   = x
eval (Plus x y)	    = (eval x) + (eval y)
eval (Minus x y)    = (eval x) - (eval y)
eval (Times x y)    = (eval x) * (eval y)
Notice how this function follows the structure of the Expr datatype exactly!

Example: let's give it a try ...

myexpr0 :: Expr Integer                -- over Integer values
myexpr0 = 
	Plus 
		(Times 
			(Constant 1)
			(Constant 2))
		(Plus
			(Constant 3)
			(Constant 4))
Now we can have the system show the value of myexpr0 and evaluate it using our function eval:
Main> myexpr0
Plus (Times (Constant 1) (Constant 2)) (Plus (Constant 3) (Constant 4)) :: Expr Integer
(223 reductions, 514 cells)
Let's have the system infer the type for this expression:
myexpr1 =
	Plus
		(Times 
			(Constant 1.5)
			(Constant 2))
		(Plus
			(Constant 3)
			(Constant 4))
Note that after the Haskell system has parsed the expression and done the type inference (i.e., at compile-time), the type of myexpr1 is known (you have probably already seen evidence that the system does type checking at compile-time: when you try to load a Haskell file and the system reports type errors!!)

Since the above loads/compiles without errors, we can ask the system for the type of myexpr1 or just have it "print" the value of myexpr1 (here, "hugs +t +s" was used):

Main> :info myexpr1
myexpr1 :: Expr Double

Main> myexpr1
Plus (Times (Constant 1.5) (Constant 2.0)) (Plus (Constant 3.0) (Constant 4.0)) :: Expr Double
(223 reductions, 522 cells)
Using our eval function above, we can evaluate myexpr0 and myexpr1:
Main> eval myexpr0
9 :: Integer
(21 reductions, 27 cells)

Main> eval myexpr1
10.0 :: Double
(21 reductions, 27 cells)

USES OF ABSTRACT SYNTAX

Structural Induction is a mathematical proof concept which is a generalization of the Principle of Mathematical Induction over the integers: