In Depth Tutorials and Information

Bottom-Up Parsing (Compiler Writing) Part 5

Precedence Functions for Simple Precedence Grammars

The notion of precedence functions was introduced in Sec. 7-2.4. As indicated there, memory restrictions may require a reduction in the amount of space utilized by a precedence matrix, making it necessary to use precedence functions in order to reduce the storage requirements of a compiler. In this subsection, a procedure for generating precedence functions from a precedence matrix is described. A modified version of this procedure is also presented. These methods can easily be extended to operator precedence grammars as well.

If the number of vocabulary symbolsin a language is n, then the precedence matrix requires n2 elements. However, the information stored in the matrix can sometimes be linearized", that is, two precedence functions can represent the information which is stored in a precedence matrix. For any two symbols A and B of a simple precedence grammar, the precedence functions / and g are defined as

Since each function can be stored as a vector of n elements, storage requirements can be reduced from n2 to 2n elements.

Precedence matrices do not always have precedence functions associated with them. For example, the precedence matrix given in Table 7-17 does not have any

Table 7-17 Precedence matrix having no associated precedence functions

associated precedence functions. Notice, for example, the following relations may be derived from the matrix:

This results in the contradictionFurthermore, the precedence functions, if they exist, are not necessarily unique. Two pairs of precedence functions for the precedence matrix given in Table 7-18 are

	A	B	C
/	1	6	1
g	6	3	6

	A	B	C
/	1	3	1
g	3	2	3

A theorem given by Bell (1969) describes a simple method for generating a pair of precedence functions from a precedence matrix. The procedure involves representing the precedence matrix as a graph with the vertices representing the elementsThe number of nodes reachable from a node representing one of the elements is the value of that element.

In the following theorem, the notationis equivalent toor

is defined to be

Theorem 7-4 For a simple precedence grammar with the vocabulary symbols define a directed graph with 2 n nodes representing the values of the precedence functions for

Define an edge fromSimilarly,

define an edge fromAssign a number to each node equal to the number of nodes that are reachable from that node (including the node itself).

Table 7-18 Precedence matrix

The numbers assigned toare the numbers assigned to their corresponding nodes. Check that the functions are valid precedence functions by comparing them with the precedence matrix. If they are not valid, no precedence functions exist for this matrix.

Proof It must be shown thatimpliesimpliesimpliesFor the condition of equality, there is an arc fromand fromHence any node reachable from one is also reachable from the other.

As the other two conditions are symmetric, only the condition implies_ will be proven. There is an arc from becauseso any node accessible from Ai is accessible from Ay, hence It must be shown that strict inequality holds.

Assume thatThe nodes are accessible from one another and there must be the path

This results in the relations

which implies that

Unless every relation in this chain is an equality, we have, a contradiction. Thus we havebut sincethe definition of precedence functions has been contradicted. Either no valid precedence functions exist or the original assumptionis false, implying that if precedence functions do exist, then

In order to illustrate how this theorem is applied, we derive the precedence functions for the grammaris defined as

The precedence matrix for this grammar is given in Table 7-19. Applying the procedure outlined in the previous theorem results in the directed graph given in Fig. 7-6. Summing the number of nodes accessible from each node gives the following precedence-function values:

	A	B	a	b
/	6	3	6	6
g	3	4	3	4

Table 7-19 Precedence matrix for example grammar

	A	B	a	b
A	>	>	>	>
B	=	<	=	<
a	>	>	>	>
b	>	>	>	>

Comparing the precedence matrix with the precedence functions establishes the validity of these functions.

The procedure for constructing precedence functions can easily be implemented on a computer by using an adjacency matrix. An elementof an adjacency matrix A is true if a path of length 1 exists from node i to node j; otherwise it is false.

The graph defined in the theorem can be represented by a 2n X 2n adjacency matrix B where its rows and columns, numbered 1,2,…, 2n, represent the nodes of the graphThe form of the matrix is given in Table 7-20, where / is the identity matrix,is an n X n matrix whereis true ifis the transpose of the matrix which has elementtrue ifThe transitive closure of B, B+ is derived by using Warshall’s algorithm.

Figure 7-6 Directed graph for precedence matrix of Table 7-21.

Table 7-20 Format of B matrix

The result is a path matrix where element is true if node j is reachable from node i by some path. The values of the elementsand so on, are determined by summing the 1 bits on the rows, with each row representing one of the elements. The number of 1 bits on a row dictates the number of nodes of the graph which are reachable from that node.

Procedure PRECEDENCE_FUNCTION(P, N, F, G). Given the square precedence matrix P of size N, where the value of 0 for element P[i,j] indicates no relation exists between symbolsindicates that2 indicates that, and 3 indicates thatthis procedure computes the precedence functions F and G associated with this matrix, and determines the validity of these functions. The adjacency matrix is stored in B while I, J, and K are loop variables.

Table 7-21 Matrix B

"l	0	0	0	1	1	1	1
0	1	0	0	1	0	1	0
0	0	1	0	1	1	1	1
0	0	0	1	1	1	1	1
0	1	0	0	1	0	0	0
0	1	0	0	0	1	0	0
0	1	0	0	0	0	1	0
Lo	1	0	0	0	0	0	1J

By checking the precedence matrix relations against those between the precedence functions, the algorithm determines the validity of the precedence functions.

As an example of how the algorithm generates precedence functions from a precedence matrix, we shall use the simple precedence grammar given in the previous example. The precedence matrix for this grammar is given in Table 7-19. Using this matrix as input to the algorithm, the B and B+ matrices which result are given in Tables 7-21 and 7-22, respectively. The precedence functions produced by this algorithm are:

	A	B	a	b
/	6	3	6	6
g	3	4	3	4

These functions are valid.

Table 7-22 Matrix B ⁺


1	1	0	0	1	1	1	1
0	1	0	0	1	0	1	0
0	1	1	0	1	1	1	1
0	1	0	1	1	1	1	1
0	1	0	0	1	0	1	0
0	1	0	0	1	1	1	0
0	1	0	0	1	0	1	0
0	1	0	0	1	0	1	1

In order to determine whether the precedence functions exist, the precedence functions produced by the algorithm are compared against the precedence matrix. Recall from the example of a precedence matrix which does not have any associated precedence functions that the contradiction f(B) < f(B) was derived. This can occur for a precedence matrix when a cycle exists in the directed graph representing the matrix with at least one of the arcs in the cycle representing either the relation > or the relation < . In the example given earlier, the cycle

exists with the arcrepresenting the relationMartin (1972) presents a modified version of Bell’s original algorithm where the existence of the precedence functions is possible only if the directed graph produced has no cycles.

Martin’s algorithm proceeds as follows: A directed graph is defined having verticesand having arcs from

and fromThis can be represented in a computer by the Boolean matrix

where the matrix

The relation = is introduced by defining the matrix

here theotherwise. The transitive closure of C,C links all vertices of the directed graph together which are joined by the = relation. Then using Boolean multiplication, the matrix B is defined as B = C*B0. This matrix indicates all those vertices which are immediately reachable from a given vertex, that is, those nodes which can be reached by traversing one arc only. The identity matrices are included as part of the C matrix to ensure that all those vertices of the original directed graph, which are represented by B0 and are not immediate successors to other vertices, are not deleted. The transitive closure of B,B+ dictates all vertices which are reachable from a given vertex. The precedence function’s values may then be computed by summing the rows in the same fashion as was done with Bell’s method.

Bell’s (1969) precedence-function algorithm included the = chains in his B matrix and therefore directly included cycles in the matrix, with the simplest cycles being those on the diagonal. Martin’s (1972) algorithm, on the other hand, implicitly includes the = chains with the C* matrix, having the result that the only cycles being introduced are the ones which prevent the existence of precedence functions. The detection of these cycles, if they exist, can then be done by checking the transitive closure matrix (B+) for l’s on the principal diagonal; if at least one such bit is found, precedence functions do not exist for the given precedence matrix.

Transcribing the procedure just outlined as an algorithm is left as an exercise. Applying this procedure to the precedence matrix given in Table 7-19 results in the B0, C, and B+ matrices of Table 7-23. As no l’s appear on the principal diagonal of B+, precedence functions exist and are given by

	A	B	a
/	5	0	5	5
g	0	1	0	1

Information is lost when precedence matrices are linearized into precedence functions because the information on whether a relation holds between two symbols is lost. If a relation does hold, we can discern what the relation is, but because precedence functions have a value for all elements, it is impossible to tell whether there actually is a relation between two symbols. A syntax error will eventually be detected, however, but this will occur when a reduction of the sentential form stored on the stack is attempted but cannot be done. For example,

Table 7-23

(a) Matrix B₀

0	0	1	1	1	1
0	0	0	0	0	0
0	0	1	1	1	1
0	0	1	1	1	1
0	0	0	0	0	0
0	1	0	0	0	0
0	0	0	0	0	0
.0	1	0	0	0	0

(b) Matrix C

1	0	0	0	0	0	0	0
0	1	0	0	1	0	1	0
0	0	1	0	0	0	0	0
0	0	0	1	0	0	0	0
0	1	0	0	1	0	0	0
0	0	0	0	0	1	0	0
0	1	0	0	0	0	1	0
0	0	0	0	0	0	0	1.

(c) Matrix B⁺

1	1	1	1	1
0	0	0	0	0
1	1	1	1	1
1	1	1	1	1
0	0	0	0	0
1	0	0	0	0
0	0	0	0	0
1	0	0	0	0

precedence-function values for the precedence matrix of Table 7-13 are

	E	V	T	F	V	+	*	i	(	)
/	2	4	5	7	6	4	6	7	2	7
g	2	3	4	6	5	4	6	7	7	2

The corresponding grammar is given in Sec. 7-3.1. Table 7-24 illustrates the reduction of the invalid string i + + i using the precedence matrix and the precedence functions. As can be seen, the error is detected first using the precedence matrix as two symbols occur which have no relation. A further discussion of error detection and recovery is given in the next subsection.

The method presented in this section also may be applied to operator precedence grammars. The algorithms used are basically the same, with the operationreplacingreplacingreplacingThe difference lies in the fact that the precedence relations of a simple precedence grammar can be defined between two nonterminal symbols or between a nonterminal symbol and a terminal symbol, while such relations do not exist for operator precedence grammars. Therefore, only the functional values of the terminal symbols need to be computed, and this can be done in the same fashion as with simple precedence grammars. Therefore no further discussion needs to be given on this topic.

Table 7-24

(a) Parse of i +	+ i using precedence matrix		(b) Parse of i +		+ i using	precedence functions
Input		Stack		Input		Stack
Step string	Relation	contents Handle	Step	string	Relation	contents	Handle
			0
			1
			2
			3
			4
			5
			6
			7
			8
			9
			10
			11