Bottom-Up Parsing (Compiler Writing) Part 5

Precedence Functions for Simple Precedence Grammars

The notion of precedence functions was introduced in Sec. 7-2.4. As indicated there, memory restrictions may require a reduction in the amount of space utilized by a precedence matrix, making it necessary to use precedence functions in order to reduce the storage requirements of a compiler. In this subsection, a procedure for generating precedence functions from a precedence matrix is described. A modified version of this procedure is also presented. These methods can easily be extended to operator precedence grammars as well.

If the number of vocabulary symbolstmp245-605_thumbin a language is n, then the precedence matrix requires n2 elements. However, the information stored in the matrix can sometimes be linearized", that is, two precedence functions can represent the information which is stored in a precedence matrix. For any two symbols A and B of a simple precedence grammar, the precedence functions / and g are defined as

tmp245-607_thumb


Since each function can be stored as a vector of n elements, storage requirements can be reduced from n2 to 2n elements.

Precedence matrices do not always have precedence functions associated with them. For example, the precedence matrix given in Table 7-17 does not have any

Table 7-17 Precedence matrix having no associated precedence functions

Precedence matrix having no associated precedence functions

associated precedence functions. Notice, for example, the following relations may be derived from the matrix:

tmp245-609_thumb

This results in the contradictiontmp245-610_thumbFurthermore, the precedence functions, if they exist, are not necessarily unique. Two pairs of precedence functions for the precedence matrix given in Table 7-18 are

A

B

C

/

1

6

1

g

6

3

6

A

B

C

/

1

3

1

g

3

2

3

A theorem given by Bell (1969) describes a simple method for generating a pair of precedence functions from a precedence matrix. The procedure involves representing the precedence matrix as a graph with the vertices representing the elementstmp245-612_thumbThe number of nodes  reachable from a node representing one of the elements is the value of that element.

In the following theorem, the notationtmp245-613_thumbis equivalent totmp245-614_thumbor

tmp245-615_thumbis defined to betmp245-616_thumb

Theorem 7-4 For a simple precedence grammar with the vocabulary symbols tmp245-617_thumbdefine a directed graph with 2 n nodes representing the values of the precedence functions fortmp245-618_thumb

tmp245-619_thumbDefine an edge fromtmp245-620_thumbSimilarly,

define an edge fromtmp245-621_thumbAssign a number to each node equal to the number of nodes that are reachable from that node (including the node itself).

Table 7-18 Precedence matrix

Precedence matrix

The numbers assigned totmp245-633_thumbare the numbers assigned to their corresponding nodes. Check that the functions are valid precedence functions by comparing them with the precedence matrix. If they are not valid, no precedence functions exist for this matrix.

Proof It must be shown thattmp245-634_thumbimpliestmp245-635_thumbimpliestmp245-636_thumbimpliestmp245-637_thumbFor the condition of equality, there is an arc fromtmp245-638_thumband fromtmp245-639_thumbHence any node reachable from one is also reachable from the other.

As the other two conditions are symmetric, only the conditiontmp245-640_thumb impliestmp245-641_thumb_ will be proven. There is an arc fromtmp245-642_thumb becausetmp245-643_thumbso any node accessible from Ai is accessible from Ay, hence tmp245-644_thumbIt must be shown that strict inequality holds.

Assume thattmp245-645_thumbThe nodestmp245-646_thumb are accessible from one another and there must be the path

tmp245-661_thumb

This results in the relations

tmp245-662_thumb

which implies that

tmp245-663_thumb

Unless every relation in this chain is an equality, we havetmp245-664_thumb, a contradiction. Thus we havetmp245-665_thumbbut sincetmp245-666_thumbthe definition of precedence functions has been contradicted. Either no valid precedence functions exist or the original assumptiontmp245-667_thumbis false, implying that if precedence functions do exist, thentmp245-668_thumb

In order to illustrate how this theorem is applied, we derive the precedence functions for the grammartmp245-669_thumbis defined as

tmp245-676_thumb

The precedence matrix for this grammar is given in Table 7-19. Applying the procedure outlined in the previous theorem results in the directed graph given in Fig. 7-6. Summing the number of nodes accessible from each node gives the following precedence-function values:

A

B

a

b

/

6

3

6

6

g

3

4

3

4

Table 7-19 Precedence matrix for example grammar

A

B

a

b

A

>

>

>

>

B

=

<

=

<

a

>

>

>

>

b

>

>

>

>

Comparing the precedence matrix with the precedence functions establishes the validity of these functions.

The procedure for constructing precedence functions can easily be implemented on a computer by using an adjacency matrix. An elementtmp245-677_thumbof an adjacency matrix A is true if a path of length 1 exists from node i to node j; otherwise it is false.

The graph defined in the theorem can be represented by a 2n X 2n adjacency matrix B where its rows and columns, numbered 1,2,…, 2n, represent the nodes of the graphtmp245-678_thumbThe form of the matrix is given in Table 7-20, where / is the identity matrix,tmp245-679_thumbis an n X n matrix wheretmp245-680_thumbis true iftmp245-681_thumbis the transpose of the matrix which has elementtmp245-682_thumbtrue iftmp245-683_thumbThe transitive closure of B, B+ is derived by using Warshall’s algorithm.

Directed graph for precedence matrix of Table 7-21.

Figure 7-6 Directed graph for precedence matrix of Table 7-21.

Table 7-20 Format of B matrix

Format of B matrix

The result is a path matrix where element tmp245-693_thumbis true if node j is reachable from node i by some path. The values of the elementstmp245-694_thumband so on, are determined by summing the 1 bits on the rows, with each row representing one of the elements. The number of 1 bits on a row dictates the number of nodes of the graph which are reachable from that node.

Procedure PRECEDENCE_FUNCTION(P, N, F, G). Given the square precedence matrix P of size N, where the value of 0 for element P[i,j] indicates no relation exists between symbolstmp245-695_thumbindicates thattmp245-696_thumb2 indicates thattmp245-697_thumb, and 3 indicates thattmp245-698_thumbthis procedure computes the precedence functions F and G associated with this matrix, and determines the validity of these functions. The adjacency matrix is stored in B while I, J, and K are loop variables.

tmp245-705_thumb

Table 7-21 Matrix B

"l

0

0

0

1

1

1

1

0

1

0

0

1

0

1

0

0

0

1

0

1

1

1

1

0

0

0

1

1

1

1

1

0

1

0

0

1

0

0

0

0

1

0

0

0

1

0

0

0

1

0

0

0

0

1

0

Lo

1

0

0

0

0

0

1J

 

 

 

tmp245-706_thumb

By checking the precedence matrix relations against those between the precedence functions, the algorithm determines the validity of the precedence functions.

As an example of how the algorithm generates precedence functions from a precedence matrix, we shall use the simple precedence grammar given in the previous example. The precedence matrix for this grammar is given in Table 7-19. Using this matrix as input to the algorithm, the B and B+ matrices which result are given in Tables 7-21 and 7-22, respectively. The precedence functions produced by this algorithm are:

A

B

a

b

/

6

3

6

6

g

3

4

3

4

These functions are valid.

Table 7-22 Matrix B +

1

1

0

0

1

1

1

1

0

1

0

0

1

0

1

0

0

1

1

0

1

1

1

1

0

1

0

1

1

1

1

1

0

1

0

0

1

0

1

0

0

1

0

0

1

1

1

0

0

1

0

0

1

0

1

0

0

1

0

0

1

0

1

1

In order to determine whether the precedence functions exist, the precedence functions produced by the algorithm are compared against the precedence matrix. Recall from the example of a precedence matrix which does not have any associated precedence functions that the contradiction f(B) < f(B) was derived. This can occur for a precedence matrix when a cycle exists in the directed graph representing the matrix with at least one of the arcs in the cycle representing either the relation > or the relation < . In the example given earlier, the cycle

tmp245-707_thumb

exists with the arctmp245-708_thumbrepresenting the relationtmp245-709_thumbMartin (1972) presents a modified version of Bell’s original algorithm where the existence of the precedence functions is possible only if the directed graph produced has no cycles.

Martin’s algorithm proceeds as follows: A directed graph is defined having verticestmp245-710_thumband having arcs from

tmp245-711_thumband fromtmp245-712_thumbThis can be represented in a computer by the Boolean matrix

tmp245-718_thumb

where the matrixtmp245-719_thumb

The relation = is introduced by defining the matrix

tmp245-721_thumb

here thetmp245-722_thumbotherwise. The transitive closure of C,C links all vertices of the directed graph together which are joined by the = relation. Then using Boolean multiplication, the matrix B is defined as B = C*B0. This matrix indicates all those vertices which are immediately reachable from a given vertex, that is, those nodes which can be reached by traversing one arc only. The identity matrices are included as part of the C matrix to ensure that all those vertices of the original directed graph, which are represented by B0 and are not immediate successors to other vertices, are not deleted. The transitive closure of B,B+ dictates all vertices which are reachable from a given vertex. The precedence function’s values may then be computed by summing the rows in the same fashion as was done with Bell’s method.

Bell’s (1969) precedence-function algorithm included the = chains in his B matrix and therefore directly included cycles in the matrix, with the simplest cycles being those on the diagonal. Martin’s (1972) algorithm, on the other hand, implicitly includes the = chains with the C* matrix, having the result that the only cycles being introduced are the ones which prevent the existence of precedence functions. The detection of these cycles, if they exist, can then be done by checking the transitive closure matrix (B+) for l’s on the principal diagonal; if at least one such bit is found, precedence functions do not exist for the given precedence matrix.

Transcribing the procedure just outlined as an algorithm is left as an exercise. Applying this procedure to the precedence matrix given in Table 7-19 results in the B0, C, and B+ matrices of Table 7-23. As no l’s appear on the principal diagonal of B+, precedence functions exist and are given by

A

B

a

/

5

0

5

5

g

0

1

0

1

Information is lost when precedence matrices are linearized into precedence functions because the information on whether a relation holds between two symbols is lost. If a relation does hold, we can discern what the relation is, but because precedence functions have a value for all elements, it is impossible to tell whether there actually is a relation between two symbols. A syntax error will eventually be detected, however, but this will occur when a reduction of the sentential form stored on the stack is attempted but cannot be done. For example,

Table 7-23

(a) Matrix B0

0

0

0

0

1

1

1

1

0

0

0

0

0

0

0

0

0

0

0

0

1

1

1

1

0

0

0

0

1

1

1

1

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0

0

0

0

0

0

0

0

0

0

.0

1

0

0

0

0

0

0

(b) Matrix C

1

0

0

0

0

0

0

0

0

1

0

0

1

0

1

0

0

0

1

0

0

0

0

0

0

0

0

1

0

0

0

0

0

1

0

0

1

0

0

0

0

0

0

0

0

1

0

0

0

1

0

0

0

0

1

0

0

0

0

0

0

0

0

1.

(c) Matrix B+

0

1

0

0

1

1

1

1

0

0

0

0

0

0

0

0

0

1

0

0

1

1

1

1

0

1

0

0

1

1

1

1

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0

0

precedence-function values for the precedence matrix of Table 7-13 are

E

V

T

F

V

+

*

i

(

)

/

2

4

5

7

6

4

6

7

2

7

g

2

3

4

6

5

4

6

7

7

2

The corresponding grammar is given in Sec. 7-3.1. Table 7-24 illustrates the reduction of the invalid string i + + i using the precedence matrix and the precedence functions. As can be seen, the error is detected first using the precedence matrix as two symbols occur which have no relation. A further discussion of error detection and recovery is given in the next subsection.

The method presented in this section also may be applied to operator precedence grammars. The algorithms used are basically the same, with the operationtmp245-724_thumbreplacingtmp245-725_thumbreplacingtmp245-726_thumbreplacingtmp245-727_thumbThe  difference lies in the fact that the precedence relations of a simple precedence grammar can be defined between two nonterminal symbols or between a nonterminal symbol and a terminal symbol, while such relations do not exist for operator precedence grammars. Therefore, only the functional values of the terminal symbols need to be computed, and this can be done in the same fashion as with simple precedence grammars. Therefore no further discussion needs to be given on this topic.

Table 7-24

(a) Parse of i +

+ i using precedence matrix

(b) Parse of i +

+ i using

precedence functions

Input

Stack

Input

Stack

Step string

Relation

contents Handle

Step

string

Relation

contents

Handle

tmp245-732 tmp245-733 tmp245-734

0

tmp245-735 tmp245-736 tmp245-737 tmp245-738 tmp245-739
tmp245-740 tmp245-741 tmp245-742

1

tmp245-743 tmp245-744 tmp245-745 tmp245-746 tmp245-747
tmp245-748 tmp245-749 tmp245-750

2

tmp245-751 tmp245-752 tmp245-753 tmp245-754 tmp245-755
tmp245-756 tmp245-757 tmp245-758

3

tmp245-759 tmp245-760 tmp245-761 tmp245-762 tmp245-763
tmp245-764 tmp245-765 tmp245-766

4

tmp245-767 tmp245-768 tmp245-769 tmp245-770 tmp245-771
tmp245-772 tmp245-773 tmp245-774

5

tmp245-775 tmp245-776 tmp245-777 tmp245-778 tmp245-779
tmp245-780 tmp245-781 tmp245-782

6

tmp245-783 tmp245-784 tmp245-785 tmp245-786 tmp245-787
tmp245-788 tmp245-789 tmp245-790

7

tmp245-791 tmp245-792 tmp245-793 tmp245-794 tmp245-795
tmp245-796 tmp245-797 tmp245-798

8

tmp245-799 tmp245-800 tmp245-801 tmp245-802 tmp245-803
tmp245-804 tmp245-805 tmp245-806

9

tmp245-807 tmp245-808 tmp245-809 tmp245-810 tmp245-811
tmp245-812 tmp245-813 tmp245-814

10

tmp245-815 tmp245-816 tmp245-817 tmp245-818 tmp245-819

11

tmp245-820 tmp245-821 tmp245-822 tmp245-823

Next post:

Previous post: