Scanning—Theory and Practice - Crafting a Compiler - page 97

Java Reference

In-Depth Information

b

a

b

1,2

3,4,5

4,5

a

a | b

5

Figure 3.25: DFA created for NFA of Figure 3.24.

tokens do not exhibit this problem when they are made deterministic. As a

rule, DFAs used for scanning are simple and compact.

When creating a DFA is impractical (either because of speed-of-generation

or size concerns), an alternative is to scan using an NFA (see Exercise 17). Each

possible path through an NFA can be tracked, and reachable accepting states

can be identified. Scanning is slower using this approach, so it is usually used

only when the construction of a DFA is not cost-e

ff

ective.

3.8.3 Optimizing Finite Automata

We can improve the DFA created byM

. Sometimes this DFA

will have more states than necessary. For every DFA, there is a unique smallest

(in terms of number of states) equivalent DFA. Suppose a DFA D has 75 states

and there is a DFA D with 50 states that accepts exactly the same set of strings.

Suppose further that no DFA with fewer than 50 states is equivalent to D .

Then D is the only DFA with 50 states equivalent to D . Using the techniques

discussed in this section, we can optimize D by replacing it with D .

Some DFAs contain unreachable states , states that cannot be reached from

the start state. Other DFAs may contain dead states , states that cannot reach

any accepting state. It is clear that neither unreachable states nor dead states

can participate in scanning any valid token. So we eliminate all such states as

part of our optimization process.

We optimize the resulting DFA by merging states we know to be equiva-

lent. For example, two accepting states that have no transitions out of them

are equivalent. Why? Because they behave exactly the same way—they accept

the string read so far but will accept no additional characters. If two states, s 1

and s 2 , are equivalent, then all transitions to s 2 can be replacedwith transitions

to s 1 .Ine

ake

D

eterministic

ff

ect, the two states are merged into one common state.

Next Page

Crafting a Compiler

Search WWH ::

Custom Search

Home