Information Technology Reference
In-Depth Information
EXAMPLE
4.9.3
Consider the grammar
G
3
with the following rules and the implied terminal and
non-terminal alphabets:
a
)
→
c
M
c
N
c
d
)
N
→
b
N
b
S
b
)
M
→
a
M
a
e
)
N
→
c
c
)
M
→
c
ca
n
ca
n
cb
m
cb
m
c
G
3
is context-free and generates the language
L
(
G
3
)=
{
|
n
,
m
≥
0
}
,asis
easily shown.
Context-free languages capture important aspects of many programming languages. As
a consequence, the parsing of context-free languages is an important step in the parsing of
programming languages. This topic is discussed in Section
4.11
.
4.9.4 Regular Languages
DEFINITION
4.9.4
A
regular grammar
G
is a context-free grammar
G
=(
N
,
T
,
R
,
S
)
,where
the right-hand side is either a terminal or a terminal followed by a non-terminal. That is, its rules
are of the form
A
→
a
or
A
→
b
C
. The languages defined by regular grammars are called
regular
languages
.
→
a
Some authors define a regular grammar to be one whose rules are of the form
A
b
k
C
. It is straightforward to show that any language generated by such a
grammar can be generated by a grammar of the type defined above.
The following grammar is regular.
EXAMPLE
4.9.4
Consider the grammar
G
4
=(
N
4
,
T
4
,
R
4
,
S
)
where
N
4
=
{
S
,
A
,
B
}
,
T
4
=
{
→
b
1
b
2
···
or
A
}
R
4
consists of the rules given below.
0,1
and
a
)
→
d
)
→
0
A
0
A
S
B
b
)
→
e
)
→
0
0
S
B
c
)
→
1
B
A
→
→
→
→
01
B
generate the same strings as the rules given above. Thus, the language
G
4
contains the strings
0, 010, 01010, 0101010,
...
, that is, strings of the form
(
01
)
k
0for
k
It is straightforward to see that the rules a)
S
0, b)
S
01
B
,c)
B
0, and d)
B
≥
0. Consequently
L
(
G
4
)=(
01
)
∗
0. A formal proof of this result is left to the reader. (See Problem
4.44
.)
4.10 Regular Language Recognition
As explained in Section
4.1
, a deterministic finite-state machine (DFSM)
M
is a five-tuple
M
=(Σ
,
Q
,
δ
,
s
,
F
)
,where
Σ
is the input alphabet,
Q
is the set of states,
δ
:
Q
Q
is
the next-state function,
s
is the initial state, and
F
is the set of final states. A nondeterministic
FSM (NFSM) is similarly defined except that
δ
is a next-set function
δ
:
Q
×
Σ
→
2
Q
.In
other words, in an NFSM there may be more than one next state for a given state and input.
In Section
4.2
we showed that the languages recognized by these two machine types are the
same.
We now show that the languages
L
(
G
)
and
L
(
G
)
×
Σ
→
∪{
}
defined by regular grammars
G
are exactly those recognized by FSMs.
Search WWH ::
Custom Search