Information Technology Reference
In-Depth Information
A simple example illustrates the construction of an NFSM from a regular grammar. Con-
sider the grammar
G
4
of Example
4.9.4
. A new grammar
G
4
is constructed with the following
rules: a)
S
.
Figure
4.27
(page
185
) shows an NFSM that accepts the language generated by this gram-
mar. A DFSM recognizing the same language can be obtained by invoking the construction of
Theorem
4.2.1
.
→
→
→
,d)
A
→
→
→
→
0
A
,b)
S
0
C
,c)
C
1
B
,e)
B
0
A
,f)
B
0
D
,andg)
D
4.11 Parsing Context-Free Languages
Parsing
is the process of deducing those rules of a grammar
G
(a
derivation
) that generates a
terminal string
w
. The first rule must have the start symbol
S
on the left-hand side. In this
section we give a brief introduction to the parsing of context-free languages, a topic central
to the parsing of programming languages. The reader is referred to a textbook on compilers
for more detail on this subject. (See, for example, [
11
] and [99].) The concepts of Boolean
matrix multiplication and transitive closure are used in this section, topics that are covered in
Chapter
6
.
Generally a string
w
has many derivations. This is illustrated by the context-free grammar
G
3
defined in Example
4.9.3
and described below.
EXAMPLE
4.11.1
G
3
=(
N
3
,
T
3
,
R
3
,
S
)
,where
N
3
=
{
S
,
M
,
N
}
,
T
3
=
{
A
,
B
,
C
}
and
R
3
consists of the rules below:
a
)
→
c
MN
c
d
)
N
→
b
N
b
S
b
)
M
→
a
M
a
e
)
N
→
c
c
)
M
→
c
The string
caacaabcbc
can be derived by applying rules
(
a
)
,
(
b
)
twice,
(
c
)
,
(
d
)
and
(
e
)
to
produce the following derivation:
ca
2
M
a
2
N
c
⇒
c
MN
c
⇒
ca
M
a
N
c
⇒
S
(4.2)
ca
2
ca
2
N
c
ca
2
ca
2
b
N
bc
ca
2
ca
2
bcbc
⇒
⇒
⇒
The same string can be obtained by applying the rules in the following order:
(
a
)
,
(
d
)
,
(
e
)
,
(
b
)
twice, and
(
c
)
. Both derivations are described by the
parse tree
of Fig.
4.28
.Inthistree
each instance of a non-terminal is rewritten using one of the rules of the grammar. The order
of the descendants of a non-terminal vertex in the parse tree is the order of the corresponding
symbols in the string obtained by replacing this non-terminal. The string
ca
2
ca
2
bcbc
,the
yield
of this parse tree, is the terminal string obtained by visiting the leaves of this tree in a
left-to-right order. The
height
of the parse tree is the number of edges on the longest path
(having the most edges) from the root (associated with the start symbol) to a terminal symbol.
A
parser
for a language
L
(
G
)
is a program or machine that examines a string and produces a
derivation of the string if it is in the language and an error message if not.
Because every string generated by a context-free grammar has a derivation, it has a cor-
responding parse tree. Given a derivation, it is straightforward to convert it to a
leftmost
derivation
, a derivation in which the leftmost remaining non-terminal is expanded first. (A
rightmost derivation
is a derivation in which the rightmost remaining non-terminal is ex-
panded first.) Such a derivation can be obtained from the parse tree by deleting all vertices
Search WWH ::
Custom Search