Information Technology Reference
In-Depth Information
Instead of building the sixth table, T ( 5 ) , we observe that the regular expression that is
needed is r = r ( 5 )
i ,5 r ( 4 )
5,5 r ( 4 )
and r ( 4 )
5,5
1,1 + r ( 5 )
1,4 + r ( 5 )
1,5 .Since r ( 5 )
i , j = r ( 4 )
i , j + r ( 4 )
=
5, j
( 10 + 010 ) , we have the following expressions for r ( 5 )
1,1 , r ( 5 )
1,4 ,and r ( 5 )
1,5 :
r ( 5 )
1,1 =
r ( 5 )
1,4 = 01 +( 010 )( 10 + 010 ) ( 1 + 01 )
r ( 5 )
1,5 = 010 +( 010 )( 10 + 010 ) ( + 10 + 010 )=( 010 )( 10 + 010 )
Thus, the DFSM recognizes the language denoted by the regular expression r = + 01 +
( 010 )( 10 + 010 ) ( + 1 + 01 ) . It can be shown that this expression denotes the same language
as does + 01 +( 01 )( 01 + 001 ) ( + 0 )=( 01 + 010 ) . (See Problem 4.12 .)
4.4.3 grep —Searching for Strings in Files
Many operating systems provide a command to find strings in files. For example, the Unix
grep command prints all lines of a file containing a string specified by a regular expression.
grep is invoked as follows:
grep regular-expression file name
Thus, the command grep 'o+' file name returns each line of the file file name that
contains o + somewhere in the line. grep is typically implemented with a nondeterministic
algorithm whose behavior can be understood by considering the construction of the preceding
section.
In Section 4.4.1 we describe a procedure to construct NFSMs accepting strings denoted
by regular expressions. Each such machine starts in its initial state before processing an input
string. Since grep finds lines containing a string that starts anywhere in the lines, these NFSMs
have to be modified to implement grep . The modifications required for this purpose are
straightforward and left as an exercise for the reader. (See Problem 4.19 .)
4.5 The Pumping Lemma for FSMs
It is not surprising that some languages are not regular. In this section we provide machinery
to show this. It is given in the form of the pumping lemma, which demonstrates that if a
regular language contains long strings, it must contain an infinite set of strings of a particular
form. We show the existence of languages that do not contain strings of this form, thereby
demonstrating that they are not regular.
The pigeonhole principle is used to prove the pumping lemma. It states that if there are
n pigeonholes and n + 1 pigeons, each of which occupies a hole, then at least one hole has two
pigeons. This principle, whose proof is obvious (see Section 1.3 ), enjoys a hallowed place in
combinatorial mathematics.
The pigeonhole principle is applied as follows. We first note that if a regular language L
is infinite, it contains a string w with at least as many letters as there are states in a DFSM M
recognizing L . Including the initial state, it follows that M visits at least one more state while
processing w than it has different states. Thus, at least one state is visited at least twice. The
substring of w that causes M to move from this state back to itself can be repeated zero or
Search WWH ::




Custom Search