Lexical Analysis - Introduction to Compiler Construction in a Java World

Java Reference

In-Depth Information

Definition 2.4. A non-deterministic finite-state automaton (NFA) is a finite state automa-

ton that allows either of the following conditions.

More than one move from the same state, on the same input symbol, that is,

m(r;a) = s;

m(r;a) = t; for states r;s and t where s 6= t:

An -move dened on the empty string , that is,

m(r;) = s;

which says we can move from state r to state s without scanning any input symbols.

An example of a deterministic nite-state automaton is N = (;S;s 0 ;M;F), where

= fa;bg;S = f0; 1; 2g;s 0 = 0;M = fm(0;a) = 1;m(0;b) = 1;m(1;a) = 1;m(1;b) =

1;m(1;) = 0;m(1;b) = 2g;F = f2g and is illustrated by the diagram in Figure 2.8. This

NFA recognizes all strings of a's and b's that begin with an a and end with a b. Like any

FSA, an NFA is said to recognize an input string if, starting in the start state, there exists

a set of moves based on the input that takes us into one of the final states.

But this automaton is definitely not deterministic. Being in state 1 and seeing b, we can

go either back into state 1 or into state 2. Moreover, the automaton has an -move.

FIGURE 2.8 An NFA.

Needless to say, a lexical analyzer based on a non-deterministic finite state automaton

requires backtracking, where one based on a deterministic finite-state automaton does not.

One might ask why we are at all interested in NFA. Our only interest in non-deterministic

finite-state automata is that they are an intermediate step from regular expressions to

deterministic finite-state automata.

2.6 Regular Expressions to NFA

Given any regular expression R, we can construct a non-deterministic finite state automaton

N that recognizes the same language; that is, L(N) = L(R). We show that this is true by

using what is called Thompson's construction:

1. If the regular expression r takes the form of an input symbol, a, then the NFA that

recognizes it has two states: a start state and a final state, and a move on symbol a

from the start state to the final state.

Search WWH ::

Custom Search

Home