Information Technology Reference
In-Depth Information
A simple example illustrates the construction of an NFSM from a regular grammar. Con-
sider the grammar G 4 of Example 4.9.4 . A new grammar G 4 is constructed with the following
rules: a) S
.
Figure 4.27 (page 185 ) shows an NFSM that accepts the language generated by this gram-
mar. A DFSM recognizing the same language can be obtained by invoking the construction of
Theorem 4.2.1 .
,d) A
0 A ,b) S
0 C ,c) C
1 B ,e) B
0 A ,f) B
0 D ,andg) D
4.11 Parsing Context-Free Languages
Parsing is the process of deducing those rules of a grammar G (a derivation ) that generates a
terminal string w . The first rule must have the start symbol S on the left-hand side. In this
section we give a brief introduction to the parsing of context-free languages, a topic central
to the parsing of programming languages. The reader is referred to a textbook on compilers
for more detail on this subject. (See, for example, [ 11 ] and [99].) The concepts of Boolean
matrix multiplication and transitive closure are used in this section, topics that are covered in
Chapter 6 .
Generally a string w has many derivations. This is illustrated by the context-free grammar
G 3 defined in Example 4.9.3 and described below.
EXAMPLE 4.11.1 G 3 =( N 3 , T 3 , R 3 , S ) ,where N 3 = { S , M , N } , T 3 = { A , B , C } and R 3
consists of the rules below:
a )
c MN c
d ) N
b N b
S
b ) M
a M a
e ) N
c
c ) M
c
The string caacaabcbc can be derived by applying rules ( a ) , ( b ) twice, ( c ) , ( d ) and ( e ) to
produce the following derivation:
ca 2 M a 2 N c
c MN c
ca M a N c
S
(4.2)
ca 2 ca 2 N c
ca 2 ca 2 b N bc
ca 2 ca 2 bcbc
The same string can be obtained by applying the rules in the following order: ( a ) , ( d ) , ( e ) ,
( b ) twice, and ( c ) . Both derivations are described by the parse tree of Fig. 4.28 .Inthistree
each instance of a non-terminal is rewritten using one of the rules of the grammar. The order
of the descendants of a non-terminal vertex in the parse tree is the order of the corresponding
symbols in the string obtained by replacing this non-terminal. The string ca 2 ca 2 bcbc ,the
yield of this parse tree, is the terminal string obtained by visiting the leaves of this tree in a
left-to-right order. The height of the parse tree is the number of edges on the longest path
(having the most edges) from the root (associated with the start symbol) to a terminal symbol.
A parser for a language L ( G ) is a program or machine that examines a string and produces a
derivation of the string if it is in the language and an error message if not.
Because every string generated by a context-free grammar has a derivation, it has a cor-
responding parse tree. Given a derivation, it is straightforward to convert it to a leftmost
derivation , a derivation in which the leftmost remaining non-terminal is expanded first. (A
rightmost derivation is a derivation in which the rightmost remaining non-terminal is ex-
panded first.) Such a derivation can be obtained from the parse tree by deleting all vertices
Search WWH ::




Custom Search